51. Zhang Z, Li Y, Shin BS. Learning generalizable visual representation via adaptive spectral random convolution for medical image segmentation. Comput Biol Med 2023; 167:107580. [PMID: 39491380 DOI: 10.1016/j.compbiomed.2023.107580]
Abstract
Medical image segmentation models often fail to generalize well when applied to new datasets, hindering their use in clinical practice. Existing random-convolution-based domain generalization approaches, which randomize the convolutional kernel weights in the initial layers of CNN models, have shown promise in improving model generalizability. Nevertheless, the indiscriminate introduction of high-frequency noise during early feature extraction may pollute critical fine details and degrade the model's performance on new datasets. To mitigate this problem, we propose an adaptive spectral random convolution (ASRConv) module designed to selectively randomize low-frequency features while avoiding the introduction of high-frequency artifacts. Unlike prior approaches, ASRConv dynamically generates convolution kernel weights, enabling more effective control over feature frequencies than randomized kernels. Specifically, ASRConv achieves this selective randomization through a novel weight generation module conditioned on random noise inputs. An adversarial domain augmentation strategy guides the weight generation module to adaptively suppress high-frequency noise during training, allowing ASRConv to improve feature diversity and reduce overfitting to specific domains. Extensive experimental results show that the proposed ASRConv method consistently outperforms state-of-the-art methods, with average DSC improvements of 3.07% and 1.18% on fundus and polyp datasets, respectively. We also qualitatively demonstrate the robustness of our model against domain distribution shifts. These results validate the effectiveness of ASRConv in learning domain-invariant representations for robust medical image segmentation.
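To make the frequency-selective idea concrete, here is a minimal, hedged sketch of low-frequency-only randomization using an FFT, in the spirit of (but not identical to) ASRConv; the function name and parameter values are illustrative.

```python
# Illustrative sketch of frequency-selective randomization (not the paper's
# ASRConv code): perturb only the low-frequency amplitude of an image so that
# high-frequency detail, which carries fine structure, is left untouched.
import torch

def randomize_low_frequencies(x: torch.Tensor, radius: int = 8,
                              strength: float = 0.5) -> torch.Tensor:
    """x: (B, C, H, W) image batch; radius: half-size of the low-freq box."""
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    amp, phase = spec.abs(), spec.angle()
    B, C, H, W = x.shape
    cy, cx = H // 2, W // 2
    # Random multiplicative jitter applied only inside the central
    # (low-frequency) box of the shifted spectrum.
    jitter = 1.0 + strength * (2 * torch.rand(B, C, 2 * radius, 2 * radius,
                                              device=x.device) - 1)
    amp[..., cy - radius:cy + radius, cx - radius:cx + radius] *= jitter
    spec = torch.polar(amp, phase)
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

x = torch.rand(2, 3, 64, 64)          # dummy fundus-like batch
aug = randomize_low_frequencies(x)    # style-perturbed, detail-preserving copy
```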
Affiliation(s)
- Zuyu Zhang
- Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, South Korea
- Yan Li
- Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, South Korea
- Byeong-Seok Shin
- Department of Electrical and Computer Engineering, Inha University, Incheon, 22212, South Korea

52. Qiu L, Cheng J, Gao H, Xiong W, Ren H. Federated Semi-Supervised Learning for Medical Image Segmentation via Pseudo-Label Denoising. IEEE J Biomed Health Inform 2023; 27:4672-4683. [PMID: 37155394 DOI: 10.1109/jbhi.2023.3274498]
Abstract
Distributed big data and digital healthcare technologies have great potential to improve medical services, but challenges arise when learning predictive models from diverse and complex e-health datasets. Federated Learning (FL), a collaborative machine learning technique, addresses these challenges by learning a joint predictive model across multi-site clients, especially distributed medical institutions or hospitals. However, most existing FL methods assume that clients possess fully labeled data for training, which is often not the case for e-health datasets due to high labeling costs or expertise requirements. This work therefore proposes a novel and feasible approach to learning a Federated Semi-Supervised Learning (FSSL) model from distributed medical image domains, in which a federated pseudo-labeling strategy for unlabeled clients is developed based on the embedded knowledge learned from labeled clients. This greatly mitigates the annotation deficiency at unlabeled clients and leads to a cost-effective and efficient medical image analysis tool. We demonstrated the effectiveness of our method with significant improvements over the state of the art on both fundus image and prostate MRI segmentation tasks, reaching Dice scores of 89.23% and 91.95%, respectively, even with only a few labeled clients participating in model training. This reveals the suitability of our method for practical deployment, ultimately facilitating the wider use of FL in healthcare and leading to better patient outcomes.
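The core pseudo-labeling step can be sketched as follows; this is an illustrative confidence-masked formulation of pseudo-label supervision, not the paper's exact denoising strategy, and all names are ours.

```python
# Hedged sketch of pseudo-labeling in federated semi-supervised segmentation:
# a model aggregated from labeled clients annotates an unlabeled client's
# images, and only confident pixels contribute to the loss.
import torch
import torch.nn.functional as F

def pseudo_label_loss(logits_unlabeled: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      conf_thresh: float = 0.9) -> torch.Tensor:
    """Both tensors: (B, num_classes, H, W)."""
    with torch.no_grad():
        probs = teacher_logits.softmax(dim=1)
        conf, pseudo = probs.max(dim=1)          # (B, H, W)
        mask = (conf >= conf_thresh).float()     # keep confident pixels only
    loss = F.cross_entropy(logits_unlabeled, pseudo, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```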
53. Gu R, Wang G, Lu J, Zhang J, Lei W, Chen Y, Liao W, Zhang S, Li K, Metaxas DN, Zhang S. CDDSA: Contrastive domain disentanglement and style augmentation for generalizable medical image segmentation. Med Image Anal 2023; 89:102904. [PMID: 37506556 DOI: 10.1016/j.media.2023.102904]
Abstract
Generalization to previously unseen images with potential domain shifts is essential for clinically applicable medical image segmentation. Disentangling domain-specific and domain-invariant features is key to Domain Generalization (DG), yet existing DG methods struggle to achieve effective disentanglement. To address this problem, we propose an efficient framework called Contrastive Domain Disentanglement and Style Augmentation (CDDSA) for generalizable medical image segmentation. First, a disentanglement network decomposes each image into a domain-invariant anatomical representation and a domain-specific style code; the former is passed on for segmentation unaffected by domain shift, and the disentanglement is regularized by a decoder that recombines the anatomical representation and style code to reconstruct the original image. Second, to achieve better disentanglement, a contrastive loss encourages style codes from the same domain to be compact and those from different domains to be divergent. Finally, to further improve generalizability, we propose a style augmentation strategy that synthesizes images with various unseen styles in real time while preserving anatomical information. Comprehensive experiments on a public multi-site fundus image dataset and an in-house multi-site Nasopharyngeal Carcinoma Magnetic Resonance Image (NPC-MRI) dataset show that the proposed CDDSA achieves remarkable generalizability across domains and outperforms several state-of-the-art methods in generalizable segmentation. Code is available at https://github.com/HiLab-git/DAG4MIA.
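The domain-level contrastive idea can be illustrated with a small InfoNCE-style loss on style codes; this is a generic formulation under our own naming, not CDDSA's exact objective.

```python
# Minimal sketch of a domain-level contrastive objective on style codes:
# codes from the same domain are pulled together, codes from different
# domains pushed apart.
import torch
import torch.nn.functional as F

def style_contrastive_loss(style: torch.Tensor, domain: torch.Tensor,
                           tau: float = 0.1) -> torch.Tensor:
    """style: (N, D) style codes; domain: (N,) integer domain labels."""
    z = F.normalize(style, dim=1)
    sim = z @ z.t() / tau                               # (N, N) similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=style.device)
    pos = (domain.unsqueeze(0) == domain.unsqueeze(1)) & ~self_mask
    logits = sim.masked_fill(self_mask, float("-inf"))  # drop self-pairs
    log_prob = logits - logits.logsumexp(dim=1, keepdim=True)
    # Average log-probability of same-domain pairs (InfoNCE-style).
    pos_count = pos.sum(dim=1).clamp(min=1)
    return -(log_prob * pos).sum(dim=1).div(pos_count).mean()
```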
Affiliation(s)
- Ran Gu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Guotai Wang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; Shanghai AI Lab, Shanghai, China
- Jiangshan Lu
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Jingyang Zhang
- School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China; School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Wenhui Lei
- School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China; Shanghai AI Lab, Shanghai, China
- Yinan Chen
- SenseTime Research, Shanghai, China; West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Wenjun Liao
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Shichuan Zhang
- Department of Radiation Oncology, Sichuan Cancer Hospital and Institute, University of Electronic Science and Technology of China, Chengdu, China
- Kang Li
- West China Hospital-SenseTime Joint Lab, West China Biomedical Big Data Center, Sichuan University, Chengdu, China
- Dimitris N Metaxas
- Department of Computer Science, Rutgers University, Piscataway NJ 08854, USA
- Shaoting Zhang
- School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, China; SenseTime Research, Shanghai, China; Shanghai AI Lab, Shanghai, China

54. Cellini F, Caamaño D, Carrasco B, Juberías JR, Ossa C, Bringas R, de la Fuente F, Franco P, Coronado D, Pastor JC. Deep Learning Application to Detect Glaucoma with a Mixed Training Approach: Public Database and Expert-Labeled Glaucoma Population. Ophthalmic Res 2023; 66:1278-1285. [PMID: 37778337 DOI: 10.1159/000534251]
Abstract
INTRODUCTION Artificial intelligence has real potential for early identification of ocular diseases such as glaucoma. An important challenge is the requirement for large, properly curated databases, which are not easily obtained. We used a relatively original strategy: a glaucoma recognition algorithm trained on fundus images from public databases and then tested and retrained on a carefully selected patient database. METHODS The supervised deep learning method was an adapted version of the ResNet-50 architecture, pretrained on 10,658 optic nerve head images (glaucomatous or non-glaucomatous) from seven public databases. A total of 1,158 new expert-labeled images from 616 patients were added. After clinical examination including visual fields, the images were categorized as 304 (26%) control or ocular-hypertension images, 347 (30%) early, 290 (25%) moderate, and 217 (19%) advanced glaucoma. The initial algorithm was tested using 30% of the selected glaucoma database, then retrained with the remaining 70% and tested again. RESULTS In the initial sample, the area under the curve (AUC) was 76% for all images, and 66%, 82%, and 84% for early, moderate, and advanced glaucoma, respectively. After retraining, the respective AUC results were 82%, 72%, 89%, and 91%. CONCLUSION Combining data from public databases with data selected and labeled by experts improved the system's precision and identified promising, more affordable routes toward tools for automatic screening of glaucomatous eyes.
Affiliation(s)
- Florencia Cellini
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Deborah Caamaño
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Belen Carrasco
- Ophthalmology Department, Hospital Clinico Universitario (HCUV), Valladolid, Spain
- José R Juberías
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Ophthalmology Department, Hospital Clinico Universitario (HCUV), Valladolid, Spain
- Carolina Ossa
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain
- Ramón Bringas
- Ophthalmology Department, Hospital Universitario Río Hortega (HURH), Valladolid, Spain
- Jose Carlos Pastor
- Instituto de Oftalmobiología Aplicada (IOBA), University of Valladolid, Valladolid, Spain

55. Yi Y, Jiang Y, Zhou B, Zhang N, Dai J, Huang X, Zeng Q, Zhou W. C2FTFNet: Coarse-to-fine transformer network for joint optic disc and cup segmentation. Comput Biol Med 2023; 164:107215. [PMID: 37481947 DOI: 10.1016/j.compbiomed.2023.107215]
Abstract
Glaucoma is a leading cause of blindness and visual impairment worldwide, making early screening and diagnosis crucial to prevent vision loss. Cup-to-Disk Ratio (CDR) evaluation serves as a widely applied approach for effective glaucoma screening. Deep learning methods currently exhibit outstanding performance in optic disk (OD) and optic cup (OC) segmentation and are mature enough for deployment in CAD systems, yet the complexity of clinical data can still constrain these techniques. Therefore, a novel Coarse-to-Fine Transformer Network (C2FTFNet) is designed to segment the OD and OC jointly in two stages. In the coarse stage, to eliminate the effects of irrelevant structures on the segmented OC and OD regions, we employ U-Net and the Circular Hough Transform (CHT) to segment the Region of Interest (ROI) around the OD. A TransUnet3+ model is then designed for the fine segmentation stage to extract the OC and OD regions more accurately from the ROI. In this model, to alleviate the limited receptive field of traditional convolutional methods, a Transformer module is introduced into the backbone to capture long-distance dependencies and retain more global information. A Multi-Scale Dense Skip Connection (MSDC) module is further proposed to fuse low-level and high-level features from different layers, reducing the semantic gap among feature levels. Comprehensive experiments on the DRIONS-DB, Drishti-GS, and REFUGE datasets validate the superior effectiveness of the proposed C2FTFNet over existing state-of-the-art approaches.
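The coarse-stage ROI step can be sketched with OpenCV's Circular Hough Transform; parameter values below are illustrative guesses rather than the ones used by C2FTFNet.

```python
# Sketch of circular-Hough-based ROI extraction: locate the roughly circular
# optic disc and crop a margin around it for fine segmentation.
import cv2
import numpy as np

def crop_disc_roi(fundus_bgr: np.ndarray, margin: float = 1.5) -> np.ndarray:
    gray = cv2.cvtColor(fundus_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.medianBlur(gray, 5)
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2,
                               minDist=gray.shape[0],  # expect one disc
                               param1=100, param2=30,
                               minRadius=20, maxRadius=120)
    if circles is None:
        return fundus_bgr                      # fall back to the full image
    x, y, r = np.round(circles[0, 0]).astype(int)
    half = int(r * margin)
    h, w = gray.shape
    y0, y1 = max(0, y - half), min(h, y + half)
    x0, x1 = max(0, x - half), min(w, x + half)
    return fundus_bgr[y0:y1, x0:x1]
```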
Affiliation(s)
- Yugen Yi
- School of Software, Jiangxi Normal University, Nanchang, 330022, China; Jiangxi Provincial Engineering Research Center of Blockchain Data Security and Governance, Nanchang, 330022, China
- Yan Jiang
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Bin Zhou
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Ningyi Zhang
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Jiangyan Dai
- School of Computer Engineering, Weifang University, 261061, China
- Xin Huang
- School of Software, Jiangxi Normal University, Nanchang, 330022, China
- Qinqin Zeng
- Department of Ophthalmology, The Second Affiliated Hospital of Nanchang University, Nanchang, 330006, China
- Wei Zhou
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China

56. Zhang J, Gu R, Xue P, Liu M, Zheng H, Zheng Y, Ma L, Wang G, Gu L. S³R: Shape and Semantics-Based Selective Regularization for Explainable Continual Segmentation Across Multiple Sites. IEEE Trans Med Imaging 2023; 42:2539-2551. [PMID: 37030841 DOI: 10.1109/tmi.2023.3260974]
Abstract
In clinical practice, it is desirable for medical image segmentation models to learn continually on a sequential data stream from multiple sites, rather than on a consolidated dataset, due to storage costs and privacy restrictions. However, when learning on a new site, existing methods struggle with weak memorizability for previous sites with complex shape and semantic information, and poor explainability of the memory consolidation process. In this work, we propose a novel Shape and Semantics-based Selective Regularization (S³R) method for explainable cross-site continual segmentation that maintains both shape and semantic knowledge of previously learned sites. Specifically, S³R adopts a selective regularization scheme that penalizes changes to parameters with high Joint Shape and Semantics-based Importance (JSSI) weights, which are estimated from the parameter sensitivity to shape properties and reliable semantics of the segmentation object. This helps prevent the related shape and semantic knowledge from being forgotten. Moreover, we propose an Importance Activation Mapping (IAM) method for memory interpretation, which indicates the spatial support of important parameters to visualize the memorized content. We have extensively evaluated our method on prostate segmentation and optic cup and disc segmentation tasks. Our method outperforms comparison methods in reducing model forgetting and increasing explainability. Our code is available at https://github.com/jingyzhang/S3R.
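Selective regularization of important parameters can be sketched in a few lines; the quadratic penalty below is the generic form used by EWC-style continual learning, with the importance estimate left as a plain squared-gradient proxy rather than the paper's JSSI score.

```python
# Hedged sketch of selective regularization for continual learning: changes to
# parameters with high importance weights are penalized when training on a
# new site.
import torch

def selective_reg_loss(model: torch.nn.Module, old_params: dict,
                       importance: dict, lam: float = 1.0) -> torch.Tensor:
    loss = next(model.parameters()).new_zeros(())
    for name, p in model.named_parameters():
        if name in importance:
            loss = loss + (importance[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss

# After finishing a site, snapshot parameters and an importance estimate, e.g.:
# old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# importance = {n: p.grad.detach() ** 2 for n, p in model.named_parameters()}
```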
57. Li Z, Zhao C, Han Z, Hong C. TUNet and domain adaptation based learning for joint optic disc and cup segmentation. Comput Biol Med 2023; 163:107209. [PMID: 37442009 DOI: 10.1016/j.compbiomed.2023.107209]
Abstract
Glaucoma is a chronic disorder that damages the optic nerve and causes irreversible blindness. The ratio of the optic cup (OC) to the optic disc (OD) plays an important role in the primary screening and diagnosis of glaucoma, so automatic and precise segmentation of the OD and OC is highly desirable. Recently, deep neural networks have demonstrated remarkable progress in OD and OC segmentation; however, they are severely hindered when generalizing across different scanners and image resolutions. In this work, we propose a novel domain adaptation-based framework to mitigate this performance degradation. We first devise an effective transformer-based segmentation network as a backbone to accurately segment the OD and OC regions. Then, to address domain shift, we introduce domain adaptation into the learning paradigm to encourage domain-invariant features. Since the segmentation-based domain adaptation loss is insufficient for capturing segmentation details, we further propose an auxiliary classifier to enable discrimination on segmentation details. Exhaustive experiments on three public retinal fundus image datasets, i.e., REFUGE, Drishti-GS, and RIM-ONE-r3, demonstrate our superior OD and OC segmentation performance. These results suggest that our proposal has great potential as a component of an automated glaucoma screening system.
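As context for why joint OD/OC segmentation matters clinically, the vertical cup-to-disc ratio can be computed from binary masks as follows; this is the standard definition, not code from the paper.

```python
# Vertical cup-to-disc ratio (vCDR) from binary OD and OC masks: the ratio of
# the vertical diameters of the cup and disc regions.
import numpy as np

def vertical_cdr(disc_mask: np.ndarray, cup_mask: np.ndarray) -> float:
    """Masks: (H, W) binary arrays; returns vertical cup/disc diameter ratio."""
    def vertical_extent(mask: np.ndarray) -> int:
        rows = np.where(mask.any(axis=1))[0]   # rows containing the structure
        return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)
    disc_h = vertical_extent(disc_mask)
    return vertical_extent(cup_mask) / disc_h if disc_h > 0 else float("nan")
```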
Affiliation(s)
- Zhuorong Li
- Hangzhou City University, Hangzhou, 310015, Zhejiang, China
- Chen Zhao
- Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Zhike Han
- Hangzhou City University, Hangzhou, 310015, Zhejiang, China
- Chaoyang Hong
- Zhejiang Provincial People's Hospital, Hangzhou, 310014, Zhejiang, China

58. Liu Z, Lv Q, Yang Z, Li Y, Lee CH, Shen L. Recent progress in transformer-based medical image analysis. Comput Biol Med 2023; 164:107268. [PMID: 37494821 DOI: 10.1016/j.compbiomed.2023.107268]
Abstract
The transformer was developed primarily for natural language processing, but it has recently been adopted, with promising results, in the computer vision (CV) field. Medical image analysis (MIA), as a critical branch of CV, also benefits greatly from this state-of-the-art technique. In this review, we first recap the core component of the transformer, the attention mechanism, and the detailed structure of the transformer. We then survey recent transformer-based progress in MIA, organizing the applications by task: classification, segmentation, captioning, registration, detection, enhancement, localization, and synthesis. The mainstream classification and segmentation tasks are further divided into eleven medical image modalities. The large number of experiments examined in this review illustrate that transformer-based methods outperform existing methods across multiple evaluation metrics. Finally, we discuss open challenges and future opportunities in this field. This task-modality review, with up-to-date content, detailed information, and comprehensive comparisons, may greatly benefit the broad MIA community.
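The attention mechanism that the review recaps reduces to scaled dot-product attention, sketched here in its standard form from "Attention Is All You Need".

```python
# Scaled dot-product attention: each query attends to all keys, and the
# softmax-normalized similarities weight a sum over the values.
import torch

def scaled_dot_product_attention(q, k, v):
    """q, k, v: (batch, seq_len, d_model)-shaped tensors."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # (batch, seq, seq)
    weights = scores.softmax(dim=-1)                # attention weights
    return weights @ v                              # weighted sum of values

q = k = v = torch.rand(1, 16, 64)
out = scaled_dot_product_attention(q, k, v)         # (1, 16, 64)
```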
Affiliation(s)
- Zhaoshan Liu
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Qiujie Lv
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Ziduo Yang
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore; School of Intelligent Systems Engineering, Sun Yat-sen University, No. 66, Gongchang Road, Guangming District, 518107, China
- Yifan Li
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore
- Chau Hung Lee
- Department of Radiology, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433, Singapore
- Lei Shen
- Department of Mechanical Engineering, National University of Singapore, 9 Engineering Drive 1, Singapore, 117575, Singapore

59. Rezaei M, Näppi JJ, Bischl B, Yoshida H. Bayesian uncertainty estimation for detection of long-tailed and unseen conditions in medical images. J Med Imaging (Bellingham) 2023; 10:054501. [PMID: 37818179 PMCID: PMC10560997 DOI: 10.1117/1.jmi.10.5.054501]
Abstract
Purpose Deep supervised learning provides an effective approach for developing robust models for various computer-aided diagnosis tasks. However, there is often an underlying assumption that the frequencies of samples across the different classes of the training dataset are similar or balanced. In real-world medical data, the samples of positive classes often occur too infrequently to satisfy this assumption. Thus, there is an unmet need for deep-learning systems that can automatically identify and adapt to the real-world conditions of imbalanced data. Approach We propose a deep Bayesian ensemble learning framework to address the representation learning problem of long-tailed and out-of-distribution (OOD) samples when training from medical images. By estimating the relative uncertainties of the input data, our framework can adapt to imbalanced data for learning generalizable classifiers. We trained and tested our framework on four public medical imaging datasets with various imbalance ratios and imaging modalities across three learning tasks: semantic medical image segmentation, OOD detection, and in-domain generalization. We compared the performance of our framework with those of state-of-the-art comparator methods. Results Our proposed framework outperformed the comparator models significantly across all performance metrics (pairwise t-test: p < 0.01) in the semantic segmentation of high-resolution CT and MR images as well as in the detection of OOD samples (p < 0.01), thereby showing significant improvement in handling the associated long-tailed data distribution. The results of the in-domain generalization also indicated that our framework can enhance the prediction of retinal glaucoma, contributing to clinical decision-making. Conclusions Training the proposed deep Bayesian ensemble learning framework with dynamic Monte-Carlo dropout and a combination of losses yielded the best generalization to unseen samples from imbalanced medical imaging datasets across the different learning tasks.
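One ingredient named in the conclusions, Monte-Carlo dropout, can be sketched as follows; this is the generic technique, not the paper's full ensemble framework.

```python
# Monte-Carlo dropout uncertainty estimation: dropout stays active at
# inference, and the spread of repeated stochastic forward passes serves
# as an uncertainty estimate.
import torch

@torch.no_grad()
def mc_dropout_predict(model: torch.nn.Module, x: torch.Tensor, T: int = 20):
    model.eval()
    for m in model.modules():                 # re-enable dropout layers only
        if isinstance(m, torch.nn.Dropout):
            m.train()
    samples = torch.stack([model(x).softmax(dim=1) for _ in range(T)])
    return samples.mean(dim=0), samples.var(dim=0)   # prediction, uncertainty
```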
Affiliation(s)
- Mina Rezaei
- LMU Munich, Department of Statistics, Munich, Germany
- Munich Center for Machine Learning, Munich, Germany
- Janne J. Näppi
- Massachusetts General Hospital, Harvard Medical School, 3D Imaging Research, Department of Radiology, Boston, Massachusetts, United States
- Bernd Bischl
- LMU Munich, Department of Statistics, Munich, Germany
- Munich Center for Machine Learning, Munich, Germany
- Hiroyuki Yoshida
- Massachusetts General Hospital, Harvard Medical School, 3D Imaging Research, Department of Radiology, Boston, Massachusetts, United States

60. Xie Y, Wan Q, Xie H, Xu Y, Wang T, Wang S, Lei B. Fundus Image-Label Pairs Synthesis and Retinopathy Screening via GANs With Class-Imbalanced Semi-Supervised Learning. IEEE Trans Med Imaging 2023; 42:2714-2725. [PMID: 37030825 DOI: 10.1109/tmi.2023.3263216]
Abstract
Retinopathy is the primary cause of irreversible yet preventable blindness. Numerous deep-learning algorithms have been developed for automatic retinal fundus image analysis. However, existing methods are usually data-driven and rarely consider the costs associated with fundus image collection and annotation, or the class-imbalanced distribution that arises from the relative scarcity of disease-positive individuals in the population. Semi-supervised learning on class-imbalanced data, despite being a realistic problem, has been relatively little studied. To fill this research gap, we explore generative adversarial networks (GANs) as a potential answer. Specifically, we present a novel framework, named CISSL-GANs, for class-imbalanced semi-supervised learning (CISSL) that leverages a dynamic class-rebalancing (DCR) sampler, which exploits the property that a classifier trained on class-imbalanced data produces high-precision pseudo-labels on minority classes, turning the bias inherent in pseudo-labels to advantage. Also, given the well-known difficulty of training GANs on complex data, we investigate three practical techniques to improve the training dynamics without altering the global equilibrium. Experimental results demonstrate that our CISSL-GANs simultaneously improve fundus image class-conditional generation and classification performance under a typical label-insufficient and imbalanced scenario. Our code is available at: https://github.com/Xyporz/CISSL-GANs.
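A simplified version of class-rebalanced sampling can be sketched with PyTorch's WeightedRandomSampler; the paper's DCR sampler recomputes weights dynamically from pseudo-labels, which the same helper supports simply by being called again with updated labels.

```python
# Illustrative sketch of class-rebalancing sampling: sampling weights are
# derived from the current (pseudo-)label distribution so minority classes
# are drawn more often. Simplified relative to the paper's DCR sampler.
import torch
from torch.utils.data import WeightedRandomSampler

def make_rebalancing_sampler(labels: torch.Tensor) -> WeightedRandomSampler:
    """labels: (N,) integer class labels (true or pseudo)."""
    counts = torch.bincount(labels).clamp(min=1).float()
    weights = (1.0 / counts)[labels]          # inverse-frequency per sample
    return WeightedRandomSampler(weights, num_samples=len(labels),
                                 replacement=True)
```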
61. Hua K, Fang X, Tang Z, Cheng Y, Yu Z. DCAM-NET: A novel domain generalization optic cup and optic disc segmentation pipeline with multi-region and multi-scale convolution attention mechanism. Comput Biol Med 2023; 163:107076. [PMID: 37379616 DOI: 10.1016/j.compbiomed.2023.107076]
Abstract
Fundus images are an essential basis for diagnosing ocular diseases, and convolutional neural networks have shown promising results for accurate fundus image segmentation. However, differences between the training data (source domain) and the testing data (target domain) significantly affect the final segmentation performance. This paper proposes a novel framework named DCAM-NET for fundus domain generalization segmentation, which substantially improves the generalization of the segmentation model to target-domain data and enhances the extraction of detailed information from source-domain data, effectively overcoming the poor performance that arises in cross-domain segmentation. To enhance adaptability to target-domain data, this paper proposes a multi-scale attention mechanism module (MSA) that operates at the feature extraction level: different attribute features enter the corresponding scale attention module, which further captures the critical features in the channel, position, and spatial regions. The MSA module also integrates characteristics of the self-attention mechanism, capturing dense context information, and its aggregation of multi-feature information effectively enhances generalization when the model handles unknown-domain data. In addition, this paper proposes the multi-region weight fusion convolution module (MWFC), which is essential for the segmentation model to extract feature information from source-domain data accurately. Fusing multiple region weights with convolutional kernel weights across the image improves the model's adaptability to information at different image locations, and the fused weights deepen the model's capacity and enhance its learning of multiple regions in the source domain. Our experiments on fundus cup/disc segmentation show that the introduced MSA and MWFC modules effectively improve segmentation on unknown domains, and the proposed method performs significantly better than other current domain generalization methods for optic cup/disc segmentation.
Affiliation(s)
- Kaiwen Hua
- School of Computer Science and Engineering, Anhui University of Science and Technology, 232001, Huainan, Anhui, China
- Xianjin Fang
- School of Computer Science and Engineering, Anhui University of Science and Technology, 232001, Huainan, Anhui, China
- Zhiri Tang
- Academy for Engineering and Technology, Fudan University, 200433, Shanghai, China
- Ying Cheng
- School of Artificial Intelligence Academy, Anhui University of Science and Technology, 232001, Huainan, Anhui, China
- Zekuan Yu
- Academy for Engineering and Technology, Fudan University, 200433, Shanghai, China

62. Huang X, Kong X, Shen Z, Ouyang J, Li Y, Jin K, Ye J. GRAPE: A multi-modal dataset of longitudinal follow-up visual field and fundus images for glaucoma management. Sci Data 2023; 10:520. [PMID: 37543686 PMCID: PMC10404253 DOI: 10.1038/s41597-023-02424-4]
Abstract
As one of the leading causes of irreversible blindness worldwide, glaucoma is characterized by structural damage and functional loss. Glaucoma patients often require long follow-up, and prognosis prediction is an important part of treatment. However, existing public glaucoma datasets are almost all cross-sectional, concentrating on optic disc (OD) segmentation and glaucoma diagnosis. With the development of artificial intelligence (AI), deep learning models can already provide accurate predictions of future visual fields (VF) and their progression when supported by longitudinal datasets. Here, we propose the public longitudinal Glaucoma Real-world Appraisal Progression Ensemble (GRAPE) dataset. GRAPE contains 1,115 follow-up records from 263 eyes, with VFs, fundus images, OCT measurements, and clinical information; OD segmentation and VF progression are annotated. Two baseline models demonstrate the feasibility of predicting VF and its progression. This dataset will advance AI research in glaucoma management.
Affiliation(s)
- Xiaoling Huang
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310003, China
- Xiangyin Kong
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, 310013, China
- Ziyan Shen
- Zhejiang Baima Lake Laboratory Co., Ltd, Hangzhou, 310051, China
- Jing Ouyang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, 310013, China
- Yunxiang Li
- Department of Radiation Oncology, UT Southwestern Medical Center, Dallas, TX, 75235, USA
- Kai Jin
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310003, China
- Juan Ye
- Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310003, China

63. Bhattacharya R, Hussain R, Chatterjee A, Paul D, Chatterjee S, Dey D. PY-Net: Rethinking segmentation frameworks with dense pyramidal operations for optic disc and cup segmentation from retinal fundus images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104895]
64. Zhou M, Xu Z, Tong RKY. Superpixel-guided class-level denoising for unsupervised domain adaptive fundus image segmentation without source data. Comput Biol Med 2023; 162:107061. [PMID: 37263152 DOI: 10.1016/j.compbiomed.2023.107061]
Abstract
Unsupervised domain adaptation (UDA), which alleviates the domain shift between source and target domains, has attracted substantial research interest. Previous studies have proposed effective UDA methods that require both labeled source data and unlabeled target data to achieve the desired distribution alignment. However, due to privacy concerns, the vendor side can often provide only the pretrained source model, without the source data, to the targeted client, causing classical UDA techniques to fail. To address this issue, this paper proposes a novel Superpixel-guided Class-level Denoised self-training framework (SCD), which aims to adapt the pretrained source model to the target domain in the absence of source data. Since the source data are unavailable, the model can only be trained on the target domain with pseudo labels obtained from the pretrained source model; due to domain shift, however, the source model's predictions on the target domain are noisy. Considering this, we propose three mutually reinforcing components tailored to our self-training framework: (i) an adaptive class-aware thresholding strategy for more balanced pseudo-label generation, (ii) a masked superpixel-guided clustering method for generating multiple content-adaptive and spatially adaptive feature centroids that enhance the discriminability of the final prototypes for effective prototypical label denoising, and (iii) adaptive learning schemes for suspected noisy-labeled and correctly labeled pixels to effectively exploit the valuable information available. Comprehensive experiments on multi-site fundus image segmentation demonstrate the superior performance of our approach and the effectiveness of each component.
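Component (i), adaptive class-aware thresholding, can be sketched as follows; the quantile and floor values are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of class-aware pseudo-label thresholding: instead of one
# global confidence cutoff, each class gets its own threshold derived from
# the confidence distribution of pixels predicted as that class.
import torch

def class_aware_pseudo_labels(probs: torch.Tensor, quantile: float = 0.5,
                              floor: float = 0.5):
    """probs: (B, C, H, W) softmax outputs from the source model."""
    conf, pseudo = probs.max(dim=1)                 # (B, H, W)
    keep = torch.zeros_like(conf, dtype=torch.bool)
    for c in range(probs.size(1)):
        cls_conf = conf[pseudo == c]
        if cls_conf.numel() == 0:
            continue
        thresh = max(float(cls_conf.quantile(quantile)), floor)
        keep |= (pseudo == c) & (conf >= thresh)
    return pseudo, keep                             # labels + validity mask
```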
Affiliation(s)
- Meng Zhou
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Zhe Xu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China
- Raymond Kai-Yu Tong
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China

65. Shamshad F, Khan S, Zamir SW, Khan MH, Hayat M, Khan FS, Fu H. Transformers in medical imaging: A survey. Med Image Anal 2023; 88:102802. [PMID: 37315483 DOI: 10.1016/j.media.2023.102802]
Abstract
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, restoration, synthesis, registration, clinical report generation, and other tasks. For each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights into solving them, and highlight recent trends. Further, we critically discuss the field's current state as a whole, identifying key challenges and open problems and outlining promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference on applications of Transformer models in medical imaging. Finally, to cope with this field's rapid development, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.
Affiliation(s)
- Fahad Shamshad
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Salman Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; CECS, Australian National University, Canberra ACT 0200, Australia
- Syed Waqas Zamir
- Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Munawar Hayat
- Faculty of IT, Monash University, Clayton VIC 3800, Australia
- Fahad Shahbaz Khan
- MBZ University of Artificial Intelligence, Abu Dhabi, United Arab Emirates; Computer Vision Laboratory, Linköping University, Sweden
- Huazhu Fu
- Institute of High Performance Computing, Agency for Science, Technology and Research (A*STAR), Singapore

66. Matta S, Lamard M, Conze PH, Le Guilcher A, Lecat C, Carette R, Basset F, Massin P, Rottier JB, Cochener B, Quellec G. Towards population-independent, multi-disease detection in fundus photographs. Sci Rep 2023; 13:11493. [PMID: 37460629 DOI: 10.1038/s41598-023-38610-y]
Abstract
Independent validation studies of automatic diabetic retinopathy screening systems have recently shown a drop in screening performance on external data. Beyond diabetic retinopathy, this study investigates the generalizability of deep learning (DL) algorithms for screening various ocular anomalies in fundus photographs across heterogeneous populations and imaging protocols. The following datasets are considered: OPHDIAT (France, diabetic population), OphtaMaine (France, general population), RIADD (India, general population), and ODIR (China, general population). Two multi-disease DL algorithms were developed: a Single-Dataset (SD) network, trained on the largest dataset (OPHDIAT), and a Multiple-Dataset (MD) network, trained on multiple datasets simultaneously. To assess their generalizability, both algorithms were evaluated in settings where training and test data originate from overlapping datasets or from disjoint datasets. The SD network achieved a mean per-disease area under the receiver operating characteristic curve (mAUC) of 0.9571 on OPHDIAT but generalized poorly to the other three datasets (mAUC < 0.9). When all four datasets were involved in training, the MD network significantly outperformed the SD network (p = 0.0058), indicating improved generality. However, in leave-one-dataset-out experiments, the MD network's performance was significantly lower on populations unseen during training than on populations involved in training (p < 0.0001), indicating imperfect generalizability.
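The headline metric, mean per-disease AUC (mAUC), can be computed with scikit-learn as sketched below; array names are ours.

```python
# Mean per-disease AUC (mAUC) for multi-label screening: average the one-vs-
# rest ROC AUC over disease columns.
import numpy as np
from sklearn.metrics import roc_auc_score

def mean_per_disease_auc(y_true: np.ndarray, y_score: np.ndarray) -> float:
    """y_true: (N, D) binary labels; y_score: (N, D) predicted scores."""
    aucs = [roc_auc_score(y_true[:, d], y_score[:, d])
            for d in range(y_true.shape[1])
            if len(np.unique(y_true[:, d])) == 2]   # skip degenerate columns
    return float(np.mean(aucs))
```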
Affiliation(s)
- Sarah Matta
- Université de Bretagne Occidentale, Brest, Bretagne, France
- INSERM, UMR 1101, Brest, F-29200, France
- Mathieu Lamard
- Université de Bretagne Occidentale, Brest, Bretagne, France
- INSERM, UMR 1101, Brest, F-29200, France
- Pierre-Henri Conze
- INSERM, UMR 1101, Brest, F-29200, France
- IMT Atlantique, Brest, F-29200, France
- Clément Lecat
- Evolucare Technologies, Villers-Bretonneux, F-80800, France
- Fabien Basset
- Evolucare Technologies, Villers-Bretonneux, F-80800, France
- Pascale Massin
- Service d'Ophtalmologie, Hôpital Lariboisière, APHP, Paris, F-75475, France
- Jean-Bernard Rottier
- Bâtiment de consultation porte 14 Pôle Santé Sud CMCM, 28 Rue de Guetteloup, Le Mans, F-72100, France
- Béatrice Cochener
- Université de Bretagne Occidentale, Brest, Bretagne, France
- INSERM, UMR 1101, Brest, F-29200, France
- Service d'Ophtalmologie, CHRU Brest, Brest, F-29200, France

67. Calabrèse A, Fournet V, Dours S, Matonti F, Castet E, Kornprobst P. A New Vessel-Based Method to Estimate Automatically the Position of the Nonfunctional Fovea on Altered Retinography From Maculopathies. Transl Vis Sci Technol 2023; 12:9. [PMID: 37418249 PMCID: PMC10337789 DOI: 10.1167/tvst.12.7.9]
Abstract
Purpose The purpose of this study was to validate a new automated method to locate the fovea on normal and pathological fundus images. In contrast to normative anatomic measures (NAMs), our vessel-based fovea localization (VBFL) approach relies on the retina's vessel structure to make predictions. Methods The spatial relationship between the fovea location and vessel characteristics is learned from healthy fundus images and then used to predict the fovea location in new images. We evaluate the VBFL method on three categories of fundus images: healthy images acquired with different head orientations and fixation locations, healthy images with simulated macular lesions, and pathological images from age-related macular degeneration (AMD). Results For healthy images taken with the head tilted to the side, the NAM estimation error increased significantly by a factor of 4, whereas VBFL showed no significant increase, representing a 73% reduction in prediction error. With simulated lesions, VBFL performance decreased significantly as lesion size increased yet remained better than NAM until lesion size reached 200 degrees². For pathological images, the average prediction error was 2.8 degrees, with 64% of images yielding an error of 2.5 degrees or less. VBFL was not robust for images showing darker regions and/or an incomplete representation of the optic disc. Conclusions The vascular structure provides enough information to precisely locate the fovea in fundus images in a way that is robust to head tilt, eccentric fixation location, missing vessels, and actual macular lesions. Translational Relevance The VBFL method should allow researchers and clinicians to automatically assess the eccentricity of a newly developed area of fixation in fundus images with macular lesions.
Affiliation(s)
- Aurélie Calabrèse
- Aix-Marseille Univ, CNRS, LPC, Marseille, France
- Université Côte d'Azur, Inria, France
- Frédéric Matonti
- Centre Monticelli Paradis d'Ophtalmologie, Marseille, France
- Aix-Marseille Univ, CNRS, INT, Marseille, France
- Groupe Almaviva Santé, Clinique Juge, Marseille, France
- Eric Castet
- Aix-Marseille Univ, CNRS, LPC, Marseille, France

68. Zedan MJM, Zulkifley MA, Ibrahim AA, Moubark AM, Kamari NAM, Abdani SR. Automated Glaucoma Screening and Diagnosis Based on Retinal Fundus Images Using Deep Learning Approaches: A Comprehensive Review. Diagnostics (Basel) 2023; 13:2180. [PMID: 37443574 DOI: 10.3390/diagnostics13132180]
Abstract
Glaucoma is a chronic eye disease that may lead to permanent vision loss if it is not diagnosed and treated at an early stage. The disease originates from irregular drainage flow within the eye that eventually increases intraocular pressure, which in the severe stage of the disease deteriorates the optic nerve head and leads to vision loss. Periodic medical follow-ups to observe the retinal area are required, and ophthalmologists need an extensive degree of skill and experience to interpret the results appropriately. To address this, algorithms based on deep learning techniques have been designed to screen and diagnose glaucoma from retinal fundus image input and to analyze images of the optic nerve and retinal structures. The objective of this paper is therefore to provide a systematic analysis of 52 state-of-the-art studies on glaucoma screening and diagnosis, covering the datasets used in algorithm development, the performance metrics, and the modalities employed in each article. Furthermore, this review analyzes and evaluates the methods used, comparing their strengths and weaknesses in an organized manner, and explores a wide range of diagnostic procedures, such as image pre-processing, localization, classification, and segmentation. In conclusion, automated glaucoma diagnosis has shown considerable promise when deep learning algorithms are applied; such algorithms could make glaucoma diagnosis more accurate, efficient, and faster.
Affiliation(s)
- Mohammad J M Zedan
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Computer and Information Engineering Department, College of Electronics Engineering, Ninevah University, Mosul 41002, Iraq
- Mohd Asyraf Zulkifley
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Ahmad Asrul Ibrahim
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Asraf Mohamed Moubark
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Nor Azwan Mohamed Kamari
- Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia
- Siti Raihanah Abdani
- School of Computing Sciences, College of Computing, Informatics and Media, Universiti Teknologi MARA, Shah Alam 40450, Selangor, Malaysia

69. Hemelings R, Elen B, Schuster AK, Blaschko MB, Barbosa-Breda J, Hujanen P, Junglas A, Nickels S, White A, Pfeiffer N, Mitchell P, De Boever P, Tuulonen A, Stalmans I. A generalizable deep learning regression model for automated glaucoma screening from fundus images. NPJ Digit Med 2023; 6:112. [PMID: 37311940 PMCID: PMC10264390 DOI: 10.1038/s41746-023-00857-0]
Abstract
A plethora of classification models for the detection of glaucoma from fundus images have been proposed in recent years. Often trained with data from a single glaucoma clinic, they report impressive performance on internal test sets but tend to struggle to generalize to external sets. This performance drop can be attributed to data shifts in glaucoma prevalence, fundus camera, and the definition of the glaucoma ground truth. In this study, we confirm that a previously described regression network for glaucoma referral (G-RISK) obtains excellent results in a variety of challenging settings. Thirteen different data sources of labeled fundus images were utilized. The data sources include two large population cohorts (the Australian Blue Mountains Eye Study, BMES, and the German Gutenberg Health Study, GHS) and 11 publicly available datasets (AIROGS, ORIGA, REFUGE1, LAG, ODIR, REFUGE2, GAMMA, RIM-ONEr3, RIM-ONE DL, ACRIMA, PAPILA). To minimize data shifts in input data, a standardized image processing strategy was developed to obtain 30° disc-centered images from the original data. A total of 149,455 images were included for model testing. The area under the receiver operating characteristic curve (AUC) for the BMES and GHS population cohorts was 0.976 [95% CI: 0.967-0.986] and 0.984 [95% CI: 0.980-0.991] at the participant level, respectively. At a fixed specificity of 95%, sensitivities were 87.3% and 90.3%, respectively, surpassing the minimum 85% sensitivity criterion recommended by Prevent Blindness America. AUC values on the eleven publicly available datasets ranged from 0.854 to 0.988. These results confirm the excellent generalizability of a glaucoma risk regression model trained with homogeneous data from a single tertiary referral center. Further validation using prospective cohort studies is warranted.
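The fixed-specificity operating point reported above can be read off the ROC curve as sketched here with scikit-learn.

```python
# Sensitivity at a fixed specificity (here 95%): among ROC points whose false
# positive rate satisfies the specificity constraint, take the best true
# positive rate.
import numpy as np
from sklearn.metrics import roc_curve

def sensitivity_at_specificity(y_true, y_score, specificity: float = 0.95):
    fpr, tpr, _ = roc_curve(y_true, y_score)
    ok = fpr <= (1.0 - specificity)       # points meeting the specificity bar
    return float(tpr[ok].max()) if ok.any() else 0.0
```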
Affiliation(s)
- Ruben Hemelings
- Research Group Ophthalmology, Department of Neurosciences, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
- Flemish Institute for Technological Research (VITO), Boeretang 200, 2400, Mol, Belgium
- Bart Elen
- Flemish Institute for Technological Research (VITO), Boeretang 200, 2400, Mol, Belgium
- Alexander K Schuster
- Department of Ophthalmology, University Medical Center Mainz, Langenbeckstr. 1, 55131, Mainz, Germany
- João Barbosa-Breda
- Research Group Ophthalmology, Department of Neurosciences, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
- Cardiovascular R&D Center, Faculty of Medicine of the University of Porto, Alameda Prof. Hernâni Monteiro, 4200-319, Porto, Portugal
- Department of Ophthalmology, Centro Hospitalar e Universitário São João, Alameda Prof. Hernâni Monteiro, 4200-319, Porto, Portugal
- Pekko Hujanen
- Tays Eye Centre, Tampere University Hospital, Tampere, Finland
- Annika Junglas
- Department of Ophthalmology, University Medical Center Mainz, Langenbeckstr. 1, 55131, Mainz, Germany
- Stefan Nickels
- Department of Ophthalmology, University Medical Center Mainz, Langenbeckstr. 1, 55131, Mainz, Germany
- Andrew White
- Department of Ophthalmology, The University of Sydney, Sydney, NSW, Australia
- Norbert Pfeiffer
- Department of Ophthalmology, University Medical Center Mainz, Langenbeckstr. 1, 55131, Mainz, Germany
- Paul Mitchell
- Department of Ophthalmology, The University of Sydney, Sydney, NSW, Australia
- Patrick De Boever
- Centre for Environmental Sciences, Hasselt University, Agoralaan building D, 3590, Diepenbeek, Belgium
- University of Antwerp, Department of Biology, 2610, Wilrijk, Belgium
- Anja Tuulonen
- Tays Eye Centre, Tampere University Hospital, Tampere, Finland
- Ingeborg Stalmans
- Research Group Ophthalmology, Department of Neurosciences, KU Leuven, Herestraat 49, 3000, Leuven, Belgium
- Ophthalmology Department, UZ Leuven, Herestraat 49, 3000, Leuven, Belgium

70. Shi P, Qiu J, Abaxi SMD, Wei H, Lo FPW, Yuan W. Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation. Diagnostics (Basel) 2023; 13:1947. [PMID: 37296799 PMCID: PMC10252742 DOI: 10.3390/diagnostics13111947]
Abstract
Medical image analysis plays an important role in clinical diagnosis. In this paper, we examine the recent Segment Anything Model (SAM) on medical images and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks, covering various imaging modalities, such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications including dermatology, ophthalmology, and radiology. These benchmarks are representative and commonly used in model development. Our experimental results indicate that while SAM presents remarkable segmentation performance on images from the general domain, its zero-shot segmentation ability remains restricted on out-of-distribution images, e.g., medical images. In addition, SAM exhibits inconsistent zero-shot segmentation performance across different unseen medical domains, and for certain structured targets, e.g., blood vessels, its zero-shot segmentation fails completely. In contrast, simple fine-tuning with a small amount of data leads to remarkable improvement in segmentation quality, showing the great potential and feasibility of using fine-tuned SAM to achieve accurate medical image segmentation for precision diagnostics. Our study indicates the versatility of generalist vision foundation models in medical imaging and their great potential to achieve the desired performance through fine-tuning, eventually addressing the challenges of accessing large and diverse medical datasets in support of clinical diagnostics.
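For reference, zero-shot point-prompted prediction with Meta's open-source segment-anything package looks like the sketch below; the checkpoint filename, stand-in image, and click location are placeholders.

```python
# Zero-shot SAM prompting sketch (pip install segment-anything). A single
# foreground click prompts the model, which returns candidate masks with
# quality scores.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)     # stand-in RGB image
predictor.set_image(image)
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),            # one foreground click
    point_labels=np.array([1]),
    multimask_output=True)                          # several candidate masks
best_mask = masks[int(scores.argmax())]
```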
Affiliation(s)
- Peilun Shi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Jianing Qiu
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Department of Computing, Imperial College London, London SW7 2AZ, UK
- Sai Mu Dalike Abaxi
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Hao Wei
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Frank P.-W. Lo
- Hamlyn Centre, Department of Surgery and Cancer, Imperial College London, London SW7 2AZ, UK
- Wu Yuan
- Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China

71. Chłopowiec AR, Karanowski K, Skrzypczak T, Grzesiuk M, Chłopowiec AB, Tabakov M. Counteracting Data Bias and Class Imbalance-Towards a Useful and Reliable Retinal Disease Recognition System. Diagnostics (Basel) 2023; 13:1904. [PMID: 37296756 DOI: 10.3390/diagnostics13111904]
Abstract
Multiple studies have presented satisfactory performance for the treatment of various ocular diseases. To date, no study has described a multiclass model that is medically accurate and trained on a large, diverse dataset, and none has addressed the class imbalance problem in one giant dataset originating from multiple large, diverse eye fundus image collections. To ensure a real-life clinical environment and mitigate the problem of biased medical image data, 22 publicly available datasets were merged. To secure medical validity, only Diabetic Retinopathy (DR), Age-Related Macular Degeneration (AMD) and Glaucoma (GL) were included. The state-of-the-art models ConvNeXt, RegNet and ResNet were utilized. The resulting dataset contained 86,415 normal, 3787 GL, 632 AMD and 34,379 DR fundus images. ConvNeXt-Tiny achieved the best results in recognizing most of the examined eye diseases on most metrics. The overall accuracy was 80.46 ± 1.48. Specific accuracy values were: 80.01 ± 1.10 for normal eye fundus, 97.20 ± 0.66 for GL, 98.14 ± 0.31 for AMD, and 80.66 ± 1.27 for DR. A suitable screening model for the most prevalent retinal diseases in ageing societies was designed. The model was developed on a diverse, combined large dataset, which made the obtained results less biased and more generalizable.
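Given the heavy skew reported above (632 AMD images against 86,415 normal ones), one standard remedy is inverse-frequency class weighting in the loss. The PyTorch sketch below uses the class counts from the abstract, but the weighting scheme itself is a common assumption, not necessarily the authors' exact recipe.

```python
# Inverse-frequency class weighting for an imbalanced 4-class fundus task.
import torch
import torch.nn as nn

counts = torch.tensor([86415.0, 3787.0, 632.0, 34379.0])  # normal, GL, AMD, DR
weights = counts.sum() / (len(counts) * counts)           # rare classes weigh more
criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 4)             # e.g., a ConvNeXt-Tiny output batch
labels = torch.randint(0, 4, (8,))
loss = criterion(logits, labels)       # misclassified AMD now costs ~137x normal
```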
Affiliation(s)
- Adam R Chłopowiec: Department of Artificial Intelligence, Wroclaw University of Science and Technology, Wybrzeże Wyspianskiego 27, 50-370 Wroclaw, Poland
- Konrad Karanowski: Department of Artificial Intelligence, Wroclaw University of Science and Technology, Wybrzeże Wyspianskiego 27, 50-370 Wroclaw, Poland
- Tomasz Skrzypczak: Faculty of Medicine, Wroclaw Medical University, Wybrzeże Ludwika Pasteura 1, 50-367 Wroclaw, Poland
- Mateusz Grzesiuk: Department of Artificial Intelligence, Wroclaw University of Science and Technology, Wybrzeże Wyspianskiego 27, 50-370 Wroclaw, Poland
- Adrian B Chłopowiec: Department of Artificial Intelligence, Wroclaw University of Science and Technology, Wybrzeże Wyspianskiego 27, 50-370 Wroclaw, Poland
- Martin Tabakov: Department of Artificial Intelligence, Wroclaw University of Science and Technology, Wybrzeże Wyspianskiego 27, 50-370 Wroclaw, Poland
|
72
|
Abramovich O, Pizem H, Van Eijgen J, Oren I, Melamed J, Stalmans I, Blumenthal EZ, Behar JA. FundusQ-Net: A regression quality assessment deep learning algorithm for fundus images quality grading. Comput Methods Programs Biomed 2023; 239:107522. [PMID: 37285697 DOI: 10.1016/j.cmpb.2023.107522]
Abstract
OBJECTIVE Ophthalmological pathologies such as glaucoma, diabetic retinopathy and age-related macular degeneration are major causes of blindness and vision impairment. There is a need for novel decision support tools that can simplify and speed up the diagnosis of these pathologies. A key step in this process is to automatically estimate the quality of the fundus images to make sure these are interpretable by a human operator or a machine learning model. We present a novel fundus image quality scale and a deep learning (DL) model that can estimate fundus image quality relative to this new scale. METHODS A total of 1245 images were graded for quality by two ophthalmologists within the range 1-10, with a resolution of 0.5. A DL regression model was trained for fundus image quality assessment. The architecture used was Inception-V3. The model was developed using a total of 89,947 images from 6 databases, of which 1245 were labeled by the specialists and the remaining 88,702 images were used for pre-training and semi-supervised learning. The final DL model was evaluated on an internal test set (n=209) as well as an external test set (n=194). RESULTS The final DL model, denoted FundusQ-Net, achieved a mean absolute error of 0.61 (0.54-0.68) on the internal test set. When evaluated as a binary classification model on the public DRIMDB database as an external test set, the model obtained an accuracy of 99%. SIGNIFICANCE The proposed algorithm provides a new robust tool for automated quality grading of fundus images.
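A minimal sketch of such a quality-regression setup, assuming torchvision's Inception-V3 backbone with the classification head swapped for a single-output regression layer and an L1 (mean absolute error) objective; these details are assumptions consistent with the abstract, not the authors' released code.

```python
# Inception-V3 repurposed as a 1-10 quality regressor (assumed setup).
import torch
import torch.nn as nn
from torchvision import models

model = models.inception_v3(weights="IMAGENET1K_V1", aux_logits=True)
model.fc = nn.Linear(model.fc.in_features, 1)                 # single quality score
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, 1)

x = torch.randn(4, 3, 299, 299)        # Inception-V3 expects 299x299 inputs
model.train()
out, aux = model(x)                    # training mode returns (main, aux) outputs
target = torch.tensor([7.0, 3.5, 9.0, 5.5])   # illustrative quality grades
loss = nn.L1Loss()(out.squeeze(1), target)    # MAE matches the reported metric
```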
Affiliation(s)
- Or Abramovich: The Faculty of Biomedical Engineering, Technion-IIT, Haifa, Israel
- Hadas Pizem: Rambam Health Care Campus, Israel
- Jan Van Eijgen: Research Group of Ophthalmology, Department of Neurosciences, KU Leuven, Oude Markt 13, 3000 Leuven, Belgium; Department of Ophthalmology, University Hospitals UZ Leuven, Herestraat 49, 3000 Leuven, Belgium
- Ilan Oren: The Faculty of Biomedical Engineering, Technion-IIT, Haifa, Israel
- Joshua Melamed: The Faculty of Biomedical Engineering, Technion-IIT, Haifa, Israel
- Ingeborg Stalmans: Research Group of Ophthalmology, Department of Neurosciences, KU Leuven, Oude Markt 13, 3000 Leuven, Belgium; Department of Ophthalmology, University Hospitals UZ Leuven, Herestraat 49, 3000 Leuven, Belgium
- Joachim A Behar: The Faculty of Biomedical Engineering, Technion-IIT, Haifa, Israel
|
73
|
Krzywicki T, Brona P, Zbrzezny AM, Grzybowski AE. A Global Review of Publicly Available Datasets Containing Fundus Images: Characteristics, Barriers to Access, Usability, and Generalizability. J Clin Med 2023; 12:3587. [PMID: 37240693 DOI: 10.3390/jcm12103587]
Abstract
This article provides a comprehensive and up-to-date overview of the repositories that contain color fundus images. We analyzed them regarding availability and legality, presented the datasets' characteristics, and identified labeled and unlabeled image sets. This study aimed to compile all publicly available color fundus image datasets into a central catalog.
Affiliation(s)
- Tomasz Krzywicki: Faculty of Mathematics and Computer Science, University of Warmia and Mazury, 10-710 Olsztyn, Poland
- Piotr Brona: Department of Ophthalmology, Poznan City Hospital, 61-285 Poznań, Poland
- Agnieszka M Zbrzezny: Faculty of Mathematics and Computer Science, University of Warmia and Mazury, 10-710 Olsztyn, Poland; Faculty of Design, SWPS University of Social Sciences and Humanities, Chodakowska 19/31, 03-815 Warsaw, Poland
- Andrzej E Grzybowski: Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, 60-836 Poznań, Poland
|
74
|
Tang Y, Wang S, Qu Y, Cui Z, Zhang W. Consistency and adversarial semi-supervised learning for medical image segmentation. Comput Biol Med 2023; 161:107018. [PMID: 37216776 DOI: 10.1016/j.compbiomed.2023.107018]
Abstract
Medical image segmentation based on deep learning has made enormous progress in recent years. However, the performance of existing methods generally relies heavily on a large amount of labeled data, which is commonly expensive and time-consuming to obtain. To address this issue, in this paper a novel semi-supervised medical image segmentation method is proposed, in which an adversarial training mechanism and a collaborative consistency learning strategy are introduced into the mean teacher model. With the adversarial training mechanism, the discriminator can generate confidence maps for unlabeled data, so that more reliable supervised information is exploited for the student network. In the process of adversarial training, we further propose a collaborative consistency learning strategy by which the auxiliary discriminator can assist the primary discriminator in obtaining higher-quality supervised information. We extensively evaluate our method on three representative yet challenging medical image segmentation tasks: (1) skin lesion segmentation from dermoscopy images in the International Skin Imaging Collaboration (ISIC) 2017 dataset; (2) optic cup and optic disc (OC/OD) segmentation from fundus images in the Retinal Fundus Glaucoma Challenge (REFUGE) dataset; and (3) tumor segmentation from lower-grade glioma (LGG) tumor images. The experimental results validate the superiority and effectiveness of our proposal compared with state-of-the-art semi-supervised medical image segmentation methods.
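The mean-teacher backbone referenced above is compact enough to sketch: the teacher is an exponential moving average (EMA) of the student and supplies consistency targets on unlabeled images. The sketch below covers only this standard backbone; the paper's adversarial discriminators are omitted.

```python
# Mean-teacher building blocks: EMA weight tracking plus a consistency loss.
import torch

@torch.no_grad()
def ema_update(teacher, student, alpha=0.99):
    # Teacher parameters drift slowly toward the student's.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(alpha).add_(s, alpha=1.0 - alpha)

def consistency_loss(student_logits, teacher_logits):
    # MSE between softened predictions on the same (differently
    # perturbed) unlabeled input.
    return torch.mean(
        (torch.softmax(student_logits, dim=1) -
         torch.softmax(teacher_logits, dim=1)) ** 2
    )
```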
Affiliation(s)
- Yongqiang Tang: State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Shilei Wang: School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi, 030024, China
- Yuxun Qu: College of Computer Science, Nankai University, Tianjin, 300350, China
- Zhihua Cui: School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, Shanxi, 030024, China
- Wensheng Zhang: State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
|
75
|
Tadisetty S, Chodavarapu R, Jin R, Clements RJ, Yu M. Identifying the Edges of the Optic Cup and the Optic Disc in Glaucoma Patients by Segmentation. Sensors (Basel) 2023; 23:4668. [PMID: 37430580 PMCID: PMC10221430 DOI: 10.3390/s23104668]
Abstract
With recent advancements in artificial intelligence, fundus diseases can be classified automatically for early diagnosis, and this is of interest to many researchers. The study aims to detect the edges of the optic cup and the optic disc in fundus images taken from glaucoma patients, which has further applications in the analysis of the cup-to-disc ratio (CDR). We apply a modified U-Net model architecture on various fundus datasets and use segmentation metrics to evaluate the model. We apply edge detection and dilation to post-process the segmentation and better visualize the optic cup and optic disc. Our model results are based on the ORIGA, RIM-ONE v3, REFUGE, and Drishti-GS datasets. Our results show that our methodology obtains promising segmentation efficiency for CDR analysis.
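The described post-processing is straightforward to sketch with OpenCV: Canny edges plus dilation for visualization, and a vertical-extent CDR from the binary masks. The toy circular masks and Canny thresholds below are illustrative assumptions standing in for U-Net output.

```python
import cv2
import numpy as np

# Toy binary masks standing in for U-Net output (disc larger than cup).
disc = np.zeros((256, 256), np.uint8)
cv2.circle(disc, (128, 128), 80, 255, -1)
cup = np.zeros((256, 256), np.uint8)
cv2.circle(cup, (128, 128), 40, 255, -1)

# Edge detection followed by dilation, as in the paper's post-processing,
# to produce thicker, easier-to-see OD/OC outlines.
kernel = np.ones((3, 3), np.uint8)
disc_edges = cv2.dilate(cv2.Canny(disc, 100, 200), kernel, iterations=1)
cup_edges = cv2.dilate(cv2.Canny(cup, 100, 200), kernel, iterations=1)

def vertical_diameter(mask):
    rows = np.where(mask > 0)[0]               # rows covered by the mask
    return int(rows.max() - rows.min() + 1) if rows.size else 0

cdr = vertical_diameter(cup) / max(vertical_diameter(disc), 1)
print(f"vertical CDR = {cdr:.3f}")             # ~0.5 for these toy masks
```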
Affiliation(s)
- Srikanth Tadisetty: Department of Computer Science, Kent State University, Kent, OH 44242, USA
- Ranjith Chodavarapu: Department of Computer Science, Kent State University, Kent, OH 44242, USA
- Ruoming Jin: Department of Computer Science, Kent State University, Kent, OH 44242, USA
- Robert J. Clements: Department of Biological Sciences, Kent State University, Kent, OH 44242, USA
- Minzhong Yu: Department of Ophthalmology, University Hospitals, Case Western Reserve University, Cleveland, OH 44106, USA
|
76
|
Luo G, Liu T, Lu J, Chen X, Yu L, Wu J, Chen DZ, Cai W. Influence of Data Distribution on Federated Learning Performance in Tumor Segmentation. Radiol Artif Intell 2023; 5:e220082. [PMID: 37293342 PMCID: PMC10245185 DOI: 10.1148/ryai.220082]
Abstract
Purpose To investigate the correlation between differences in data distributions and federated deep learning (Fed-DL) algorithm performance in tumor segmentation on CT and MR images. Materials and Methods Two Fed-DL datasets were retrospectively collected (from November 2020 to December 2021): one dataset of liver tumor CT images (Federated Imaging in Liver Tumor Segmentation [or, FILTS]; three sites, 692 scans) and one publicly available dataset of brain tumor MR images (Federated Tumor Segmentation [or, FeTS]; 23 sites, 1251 scans). Scans from both datasets were grouped according to site, tumor type, tumor size, dataset size, and tumor intensity. To quantify differences in data distributions, the following four distance metrics were calculated: earth mover's distance (EMD), Bhattacharyya distance (BD), χ2 distance (CSD), and Kolmogorov-Smirnov distance (KSD). Both federated and centralized nnU-Net models were trained by using the same grouped datasets. Fed-DL model performance was evaluated by using the ratio of Dice coefficients, θ, between federated and centralized models trained and tested on the same 80:20 split datasets. Results The Dice coefficient ratio (θ) between federated and centralized models was strongly negatively correlated with the distances between data distributions, with correlation coefficients of -0.920 for EMD, -0.893 for BD, and -0.899 for CSD. However, KSD was weakly correlated with θ, with a correlation coefficient of -0.479. Conclusion Performance of Fed-DL models in tumor segmentation on CT and MRI datasets was strongly negatively correlated with the distances between data distributions. Keywords: CT, Abdomen/GI, Liver, Comparative Studies, MR Imaging, Brain/Brain Stem, Convolutional Neural Network (CNN), Federated Deep Learning, Tumor Segmentation, Data Distribution. Supplemental material is available for this article. © RSNA, 2023. See also the commentary by Kwak and Bai in this issue.
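The four distribution distances named in the abstract are all easy to compute on per-site intensity samples or histograms; the sketch below uses SciPy and NumPy, with binning and normalization choices that are assumptions rather than the paper's exact protocol.

```python
import numpy as np
from scipy.stats import wasserstein_distance, ks_2samp

def histogram(x, bins=64, rng=(0.0, 1.0)):
    h, _ = np.histogram(x, bins=bins, range=rng)
    return h / (h.sum() + 1e-12)       # normalized intensity histogram

a = np.random.rand(10000)              # intensities from "site A"
b = np.random.beta(2, 5, 10000)        # intensities from "site B"
p, q = histogram(a), histogram(b)

emd = wasserstein_distance(a, b)                      # earth mover's distance
bd = -np.log(np.sum(np.sqrt(p * q)) + 1e-12)          # Bhattacharyya distance
csd = 0.5 * np.sum((p - q) ** 2 / (p + q + 1e-12))    # chi-square distance
ksd = ks_2samp(a, b).statistic                        # Kolmogorov-Smirnov
print(emd, bd, csd, ksd)
```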
|
77
|
Wu F, Zhuang X. Minimizing Estimated Risks on Unlabeled Data: A New Formulation for Semi-Supervised Medical Image Segmentation. IEEE Trans Pattern Anal Mach Intell 2023; 45:6021-6036. [PMID: 36251907 DOI: 10.1109/tpami.2022.3215186]
Abstract
Supervised segmentation can be costly, particularly in applications of biomedical image analysis where large-scale manual annotations from experts are generally too expensive to be available. Semi-supervised segmentation, able to learn from both labeled and unlabeled images, can be an efficient and effective alternative for such scenarios. In this work, we propose a new formulation based on risk minimization which makes full use of the unlabeled images. Different from most existing approaches, which only explicitly minimize the prediction risk on the labeled training images, the new formulation also considers the risk on unlabeled images. Particularly, this is achieved via an unbiased estimator, based on which we develop a general framework for semi-supervised image segmentation. We validate this framework on three medical image segmentation tasks, namely cardiac segmentation on ACDC2017, optic cup and disc segmentation on the REFUGE dataset, and 3D whole heart segmentation on the MM-WHS dataset. Results show that the proposed estimator is effective, and the segmentation method achieves superior performance and demonstrates great potential compared with other state-of-the-art approaches. Our code and data will be released via https://zmiclab.github.io/projects.html once the manuscript is accepted for publication.
|
78
|
Son J, Shin JY, Kong ST, Park J, Kwon G, Kim HD, Park KH, Jung KH, Park SJ. An interpretable and interactive deep learning algorithm for a clinically applicable retinal fundus diagnosis system by modelling finding-disease relationship. Sci Rep 2023; 13:5934. [PMID: 37045856 PMCID: PMC10097752 DOI: 10.1038/s41598-023-32518-3]
Abstract
The identification of abnormal findings manifested in retinal fundus images and the diagnosis of ophthalmic diseases are essential to the management of potentially vision-threatening eye conditions. Recently, deep learning-based computer-aided diagnosis (CAD) systems have demonstrated their potential to reduce reading time and discrepancy amongst readers. However, the obscure reasoning of deep neural networks (DNNs) has been a leading cause of reluctance toward their clinical use as CAD systems. Here, we present a novel architectural and algorithmic design of DNNs to comprehensively identify 15 abnormal retinal findings and diagnose 8 major ophthalmic diseases from macula-centered fundus images with accuracy comparable to that of experts. We then define the notion of a counterfactual attribution ratio (CAR), which illuminates the system's diagnostic reasoning by representing how each abnormal finding contributed to the diagnostic prediction. Using CAR, we show that both quantitative and qualitative interpretation and interactive adjustment of the CAD result can be achieved. A comparison of the model's CAR with experts' finding-disease diagnosis correlation confirms that the proposed model identifies the relationship between findings and diseases in much the same way ophthalmologists do.
Affiliation(s)
- Joo Young Shin: Department of Ophthalmology, Seoul Metropolitan Government Seoul National University Boramae Medical Center, Seoul, Republic of Korea
- Hoon Dong Kim: Department of Ophthalmology, College of Medicine, Soonchunhyang University, Cheonan, Republic of Korea
- Kyu Hyung Park: Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, 82, Gumi-ro 173 Beon-gil, Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, Republic of Korea
- Kyu-Hwan Jung: Department of Medical Device Research and Management, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, 81 Irwon-ro, Gangnam-gu, Seoul, Republic of Korea
- Sang Jun Park: Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, 82, Gumi-ro 173 Beon-gil, Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, Republic of Korea
|
79
|
Septiarini A, Hamdani H, Setyaningsih E, Junirianto E, Utaminingrum F. Automatic Method for Optic Disc Segmentation Using Deep Learning on Retinal Fundus Images. Healthc Inform Res 2023; 29:145-151. [PMID: 37190738 DOI: 10.4258/hir.2023.29.2.145]
Abstract
OBJECTIVES The optic disc is part of the retinal fundus image structure, which influences the extraction of glaucoma features. This study proposes a method that automatically segments the optic disc area in retinal fundus images using deep learning based on a convolutional neural network (CNN). METHODS This study used private and public datasets containing retinal fundus images. The private dataset consisted of 350 images, while the public dataset was the Retinal Fundus Glaucoma Challenge (REFUGE). The proposed method was based on a CNN with a single-shot multibox detector (MobileNetV2) to form region-of-interest (ROI) images, with the original image resized to 640 × 640 as input. A pre-processing sequence was then implemented, including augmentation, resizing, and normalization. Furthermore, a U-Net model was applied for optic disc segmentation with 128 × 128 input data. RESULTS The proposed method performed well on the datasets used, as shown by F1-score, Dice score, and intersection-over-union values of 0.9880, 0.9852, and 0.9763 for the private dataset, respectively, and 0.9854, 0.9838, and 0.9712 for the REFUGE dataset. CONCLUSIONS The optic disc area produced by the proposed method was similar to that identified by an ophthalmologist. Therefore, this method can be considered for implementing automatic segmentation of the optic disc area.
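A detect-then-segment pipeline of this shape can be sketched in a few lines; `detector` and `unet` below are hypothetical callables standing in for the trained SSD-MobileNetV2 and U-Net models, so the function only illustrates the data flow between the two stages.

```python
import cv2
import numpy as np

def segment_optic_disc(image_bgr, detector, unet):
    """Two-stage flow: detector -> ROI crop -> U-Net segmentation.

    `detector` and `unet` are hypothetical stand-ins for the trained
    models described in the paper.
    """
    resized = cv2.resize(image_bgr, (640, 640))          # stage-1 input size
    x0, y0, x1, y1 = detector(resized)                   # OD bounding box
    roi = resized[y0:y1, x0:x1]
    inp = cv2.resize(roi, (128, 128)).astype(np.float32) / 255.0  # normalize
    prob = unet(inp[None, ...])[0]                       # stage-2 probability map
    return (prob > 0.5).astype(np.uint8)                 # binary OD mask
```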
Affiliation(s)
- Anindita Septiarini: Department of Informatics, Faculty of Engineering, Mulawarman University, Samarinda, Indonesia
- Hamdani Hamdani: Department of Informatics, Faculty of Engineering, Mulawarman University, Samarinda, Indonesia
- Emy Setyaningsih: Department of Computer, System Engineering, Institut Sains & Teknologi AKPRIND, Yogyakarta, Indonesia
- Eko Junirianto: Department of Information Technology, Samarinda Polytechnic of Agriculture, Samarinda, Indonesia
- Fitri Utaminingrum: Computer Vision Research Group, Faculty of Computer Science, Brawijaya University, Malang, Indonesia
|
80
|
Li J, Chen J, Tang Y, Wang C, Landman BA, Zhou SK. Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives. Med Image Anal 2023; 85:102762. [PMID: 36738650 PMCID: PMC10010286 DOI: 10.1016/j.media.2023.102762]
Abstract
Transformer, one of the latest technological advances in deep learning, has gained prevalence in natural language processing and computer vision. Since medical imaging bears some resemblance to computer vision, it is natural to inquire about the status quo of Transformers in medical imaging and to ask: can Transformer models transform medical imaging? In this paper, we attempt to respond to this inquiry. After a brief introduction to the fundamentals of Transformers, especially in comparison with convolutional neural networks (CNNs), and a highlight of the key defining properties that characterize Transformers, we offer a comprehensive review of the state-of-the-art Transformer-based approaches for medical imaging and exhibit current research progress made in the areas of medical image segmentation, recognition, detection, registration, reconstruction, enhancement, etc. In particular, our review is distinguished by its organization based on the Transformer's key defining properties, which are mostly derived from comparing the Transformer and the CNN, and on its architecture type, which specifies the manner in which the Transformer and CNN are combined, all helping readers best understand the rationale behind the reviewed approaches. We conclude with discussions of future perspectives.
Affiliation(s)
- Jun Li: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Junyu Chen: Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins Medical Institutes, Baltimore, MD, USA
- Yucheng Tang: Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA
- Ce Wang: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China
- Bennett A Landman: Department of Electrical and Computer Engineering, Vanderbilt University, Nashville, TN, USA
- S Kevin Zhou: Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China; School of Biomedical Engineering & Suzhou Institute for Advanced Research, Center for Medical Imaging, Robotics, and Analytic Computing & Learning (MIRACLE), University of Science and Technology of China, Suzhou 215123, China
|
81
|
Zhou W, Ji J, Jiang Y, Wang J, Qi Q, Yi Y. EARDS: EfficientNet and attention-based residual depth-wise separable convolution for joint OD and OC segmentation. Front Neurosci 2023; 17:1139181. [PMID: 36968487 PMCID: PMC10033527 DOI: 10.3389/fnins.2023.1139181]
Abstract
Background Glaucoma is the leading cause of irreversible vision loss. Accurate optic disc (OD) and optic cup (OC) segmentation is beneficial for glaucoma diagnosis. In recent years, deep learning has achieved remarkable performance in OD and OC segmentation. However, OC segmentation is more challenging than OD segmentation due to its large shape variability and cryptic boundaries, which lead to performance degradation when deep learning models are applied to segment the OC. Moreover, in prior work the OD and OC are segmented independently, or pre-processing procedures are required to extract an OD-centered region beforehand. Methods In this paper, we propose a one-stage network named EfficientNet and Attention-based Residual Depth-wise Separable Convolution (EARDS) for joint OD and OC segmentation. In EARDS, EfficientNet-b0 is used as the encoder to capture more effective boundary representations. To suppress irrelevant regions and highlight features of fine OD and OC regions, an Attention Gate (AG) is incorporated into the skip connection. Also, a Residual Depth-wise Separable Convolution (RDSC) block is developed to improve segmentation performance and computational efficiency. Further, a novel decoder network is proposed by combining the AG, the RDSC block, and a Batch Normalization (BN) layer, which mitigates the vanishing gradient problem and accelerates convergence. Finally, a weighted combination of focal loss and Dice loss is designed to guide the network toward accurate OD and OC segmentation. Results and discussion Extensive experimental results on the Drishti-GS and REFUGE datasets indicate that the proposed EARDS outperforms state-of-the-art approaches. The code is available at https://github.com/M4cheal/EARDS.
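The RDSC block lends itself to a compact PyTorch sketch: a depth-wise 3×3 convolution followed by a point-wise 1×1 convolution, batch normalization, and an identity shortcut. This layout is assumed from the textual description above, not copied from the authors' repository.

```python
import torch
import torch.nn as nn

class RDSC(nn.Module):
    """Residual depth-wise separable convolution block (assumed layout)."""
    def __init__(self, channels):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                   groups=channels, bias=False)  # per-channel 3x3
        self.pointwise = nn.Conv2d(channels, channels, 1, bias=False)  # channel mix
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.bn(self.pointwise(self.depthwise(x)))
        return self.act(out + x)     # residual shortcut

x = torch.randn(1, 64, 32, 32)
print(RDSC(64)(x).shape)             # torch.Size([1, 64, 32, 32])
```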
Affiliation(s)
- Wei Zhou: College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Jianhang Ji: College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Yan Jiang: School of Software, Jiangxi Normal University, Nanchang, China
- Jing Wang: Shenyang Aier Excellence Eye Hospital Co., Ltd., Shenyang, China
- Qi Qi: Party School of Liaoning Provincial Party Committee, Shenyang, China
- Yugen Yi: School of Software, Jiangxi Normal University, Nanchang, China
|
82
|
Chen Y, Tang Y, Huang J, Xiong S. Multi-scale Triplet Hashing for Medical Image Retrieval. Comput Biol Med 2023; 155:106633. [PMID: 36827786 DOI: 10.1016/j.compbiomed.2023.106633]
Abstract
For the medical image retrieval task, deep hashing algorithms are widely applied to large-scale datasets for auxiliary diagnosis due to the retrieval efficiency of hash codes. Most of these algorithms focus on feature learning while neglecting the discriminative areas of medical images and the hierarchical similarity between deep features and hash codes. In this paper, we tackle these dilemmas with a new Multi-scale Triplet Hashing (MTH) algorithm, which can leverage multi-scale information, convolutional self-attention and hierarchical similarity to learn effective hash codes simultaneously. The MTH algorithm first designs a multi-scale DenseBlock module to learn multi-scale information of medical images. Meanwhile, a convolutional self-attention mechanism is developed to perform information interaction in the channel domain, which can capture the discriminative areas of medical images effectively. On top of the two paths, a novel loss function is proposed to not only preserve the category-level information of deep features and the semantic information of hash codes in the learning process, but also capture the hierarchical similarity between deep features and hash codes. Extensive experiments on the Curated X-ray Dataset, Skin Cancer MNIST Dataset and COVID-19 Radiography Dataset illustrate that the MTH algorithm can further enhance medical retrieval performance compared with other state-of-the-art medical image retrieval algorithms.
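As a point of reference, the generic deep-hashing recipe underlying triplet-based methods is short: continuous codes squashed by tanh, a triplet margin loss over anchor/positive/negative codes, and a sign operation at retrieval time. The sketch below shows that generic recipe, not the exact MTH loss.

```python
import torch
import torch.nn as nn

hash_bits = 48
# Continuous codes in (-1, 1); in practice these come from a CNN head.
anchor = torch.tanh(torch.randn(8, hash_bits))     # codes for anchor images
positive = torch.tanh(torch.randn(8, hash_bits))   # same-class codes
negative = torch.tanh(torch.randn(8, hash_bits))   # different-class codes

triplet = nn.TripletMarginLoss(margin=2.0)         # pull positives, push negatives
loss = triplet(anchor, positive, negative)

binary_codes = torch.sign(anchor)                  # {-1, +1} codes for retrieval
```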
Affiliation(s)
- Yaxiong Chen: School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China; Sanya Science and Education Innovation Park, Wuhan University of Technology, Sanya 572000, China; Wuhan University of Technology Chongqing Research Institute, Chongqing 401120, China
- Yibo Tang: School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China
- Jinghao Huang: School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China; Sanya Science and Education Innovation Park, Wuhan University of Technology, Sanya 572000, China
- Shengwu Xiong: School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan 430070, China; Sanya Science and Education Innovation Park, Wuhan University of Technology, Sanya 572000, China
|
83
|
Zhao A, Su H, She C, Huang X, Li H, Qiu H, Jiang Z, Huang G. Joint optic disc and cup segmentation based on elliptical-like morphological feature and spatial geometry constraint. Comput Biol Med 2023; 158:106796. [PMID: 36989744 DOI: 10.1016/j.compbiomed.2023.106796]
Abstract
Glaucoma is a chronic degenerative disease and the second leading cause of irreversible blindness worldwide. For precise and automatic screening of glaucoma, detecting the optic disc and cup precisely is significant. In this paper, exploiting the elliptical-like morphological features of the disc and cup, we reformulate the segmentation task from the perspective of ellipse detection, so as to segment explicitly and directly obtain the glaucoma screening indicator. We first detect the minimum bounding boxes of the ellipses, and then learn the ellipse parameters of these regions to achieve optic disc and cup segmentation. Considering the spatial geometry prior that the cup should lie within the disc region, a Paired-Box RPN is introduced to detect the disc and cup jointly as a coupled pair. In addition, a boundary attention module is introduced that uses the edges of the disc and cup as an important guide for context aggregation to improve accuracy. Comprehensive experiments clearly show that our method outperforms state-of-the-art methods for optic disc and cup segmentation. The proposed method also obtains good glaucoma screening performance via the computed vCDR value. Joint optic disc and cup segmentation, utilizing elliptical-like morphological features and the spatial geometry constraint, can thus improve the performance of optic disc and cup segmentation.
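To make the ellipse view concrete, the sketch below fits ellipses to binary OD/OC masks with OpenCV and reads a vertical cup-to-disc ratio (vCDR) off the fitted parameters; the synthetic masks are illustrative stand-ins, and this classical fitting is a simplification of the paper's detection network.

```python
import cv2
import numpy as np

# Synthetic elliptical masks standing in for predicted OD/OC regions.
disc_mask = np.zeros((256, 256), np.uint8)
cv2.ellipse(disc_mask, ((128, 128), (140, 170), 0), 255, -1)
cup_mask = np.zeros((256, 256), np.uint8)
cv2.ellipse(cup_mask, ((128, 128), (70, 95), 0), 255, -1)

def fit_ellipse(mask):
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    return cv2.fitEllipse(max(contours, key=cv2.contourArea))

def vertical_diameter(ellipse):
    # Vertical extent of a rotated ellipse with semi-axes (w/2, h/2),
    # the width axis rotated by `angle` degrees from the horizontal.
    (_, _), (w, h), angle = ellipse
    theta = np.deg2rad(angle)
    return 2.0 * np.hypot((w / 2) * np.sin(theta), (h / 2) * np.cos(theta))

vcdr = vertical_diameter(fit_ellipse(cup_mask)) / vertical_diameter(fit_ellipse(disc_mask))
print(f"vCDR = {vcdr:.3f}")   # ~95/170 = 0.56 for these masks
```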
|
84
|
Sangeethaa S. Presumptive discerning of the severity level of glaucoma through clinical fundus images using hybrid PolyNet. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104347]
|
85
|
Lu S, Zhao H, Liu H, Li H, Wang N. PKRT-Net: Prior Knowledge-based Relation Transformer Network for Optic Cup and Disc Segmentation. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.03.044]
|
86
|
Chákṣu: A glaucoma specific fundus image database. Sci Data 2023; 10:70. [PMID: 36737439 PMCID: PMC9898274 DOI: 10.1038/s41597-023-01943-4]
Abstract
We introduce Chákṣu, a retinal fundus image database for the evaluation of computer-assisted glaucoma prescreening techniques. The database contains 1345 color fundus images acquired using three brands of commercially available fundus cameras. Each image is provided with outlines of the optic disc (OD) and optic cup (OC) as smooth closed contours and a normal-versus-glaucomatous decision by five expert ophthalmologists. In addition, segmentation ground truths of the OD and OC are provided by fusing the expert annotations using the mean, median, majority, and Simultaneous Truth and Performance Level Estimation (STAPLE) algorithms. The performance indices show that ground-truth agreement with the experts is best with the STAPLE algorithm, followed by majority, median, and mean. The vertical, horizontal, and area cup-to-disc ratios are provided based on the expert annotations. Image-wise glaucoma decisions are also provided based on majority voting among the experts. Chákṣu is the largest Indian-ethnicity-specific fundus image database with expert annotations and will aid in the development of artificial-intelligence-based glaucoma diagnostics.
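Of the fusion schemes listed, majority voting is the simplest to reproduce; a NumPy sketch follows (STAPLE, the stronger alternative, is available in toolkits such as SimpleITK). The random masks below stand in for the five expert annotations.

```python
import numpy as np

def majority_vote(masks):
    # masks: list of binary arrays of identical shape, one per expert.
    stack = np.stack(masks).astype(np.uint8)
    threshold = (len(masks) + 1) // 2          # 3 of 5 experts must agree
    return (stack.sum(axis=0) >= threshold).astype(np.uint8)

experts = [np.random.randint(0, 2, (64, 64)) for _ in range(5)]
fused = majority_vote(experts)                 # fused ground-truth mask
```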
|
87
|
Atasever S, Azginoglu N, Terzi DS, Terzi R. A comprehensive survey of deep learning research on medical image analysis with focus on transfer learning. Clin Imaging 2023; 94:18-41. [PMID: 36462229 DOI: 10.1016/j.clinimag.2022.11.003]
Abstract
This survey aims to identify commonly used methods, datasets, future trends, knowledge gaps, constraints, and limitations in the field, providing an overview of current solutions used in medical image analysis in parallel with the rapid developments in transfer learning (TL). Unlike previous studies, this survey groups the studies published between January 2017 and February 2021 according to anatomical region and details the modality, medical task, TL method, source data, target data, and public or private datasets used in medical imaging. It also provides readers with detailed information on technical challenges, opportunities, and future research trends. In this way, an overview of recent developments is provided to help researchers select the most effective and efficient methods, and to access widely used and publicly available medical datasets, research gaps, and limitations of the available literature.
Affiliation(s)
- Sema Atasever: Computer Engineering Department, Nevsehir Hacı Bektas Veli University, Nevsehir, Turkey
- Nuh Azginoglu: Computer Engineering Department, Kayseri University, Kayseri, Turkey
- Ramazan Terzi: Computer Engineering Department, Amasya University, Amasya, Turkey
|
88
|
Fea AM, Ricardi F, Novarese C, Cimorosi F, Vallino V, Boscia G. Precision Medicine in Glaucoma: Artificial Intelligence, Biomarkers, Genetics and Redox State. Int J Mol Sci 2023; 24:2814. [PMID: 36769127 PMCID: PMC9917798 DOI: 10.3390/ijms24032814]
Abstract
Glaucoma is a multifactorial neurodegenerative illness requiring early diagnosis and strict monitoring of disease progression. Current exams for diagnosis and prognosis are based on clinical examination, intraocular pressure (IOP) measurements, visual field tests, and optical coherence tomography (OCT). In this scenario, there is a critical unmet demand for glaucoma-related biomarkers to enhance clinical testing for early diagnosis and tracking of the disease's development. The introduction of validated biomarkers would allow for prompt intervention in the clinic and help with prognosis prediction and treatment response monitoring. This review aims to report the latest advances in glaucoma biomarkers, from imaging analysis to genetic and metabolic markers.
|
89
|
Meng Y, Zhang H, Zhao Y, Gao D, Hamill B, Patri G, Peto T, Madhusudhan S, Zheng Y. Dual Consistency Enabled Weakly and Semi-Supervised Optic Disc and Cup Segmentation With Dual Adaptive Graph Convolutional Networks. IEEE Trans Med Imaging 2023; 42:416-429. [PMID: 36044486 DOI: 10.1109/tmi.2022.3203318]
Abstract
Glaucoma is a progressive eye disease that results in permanent vision loss, and the vertical cup-to-disc ratio (vCDR) in colour fundus images is essential in glaucoma screening and assessment. Previous fully supervised convolutional neural networks segment the optic disc (OD) and optic cup (OC) from colour fundus images and then calculate the vCDR offline. However, they rely on a large set of labeled masks for training, which is expensive and time-consuming to acquire. To address this, we propose a weakly and semi-supervised graph-based network that investigates geometric associations and domain knowledge between segmentation probability maps (PM), modified signed distance function representations (mSDF), and boundary regions of interest (B-ROI) in three aspects. Firstly, we propose a novel Dual Adaptive Graph Convolutional Network (DAGCN) to reason about the long-range features of the PM and the mSDF w.r.t. regional uniformity. Secondly, we propose a dual consistency regularization-based semi-supervised learning paradigm: the regional consistency between the PM and the mSDF, and the marginal consistency between the B-ROIs derived from each of them, boost the proposed model's performance owing to the inherent geometric associations. Thirdly, we exploit task-specific domain knowledge via the oval shapes of the OD and OC, for which a differentiable vCDR-estimating layer is proposed. Furthermore, without additional annotations, the supervision on vCDR serves as weak supervision for the segmentation tasks. Experiments on six large-scale datasets demonstrate our model's superior performance on OD and OC segmentation and vCDR estimation. The implementation code is available at https://github.com/smallmax00/Dual_Adaptive_Graph_Reasoning.
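One way to build a differentiable vCDR layer from probability maps, assumed here purely for illustration: take the per-row maximum probability as a soft indicator that the structure covers that row, sum the rows into a soft vertical diameter, and divide cup by disc. This is a plausible realization consistent with the description, not the authors' exact layer.

```python
import torch

def soft_vertical_diameter(pm):
    # pm: (B, H, W) probability map in [0, 1].
    row_cover = pm.max(dim=2).values   # (B, H): how "present" each row is
    return row_cover.sum(dim=1)        # differentiable vertical extent

def soft_vcdr(cup_pm, disc_pm, eps=1e-6):
    return soft_vertical_diameter(cup_pm) / (soft_vertical_diameter(disc_pm) + eps)

cup = torch.sigmoid(torch.randn(2, 256, 256, requires_grad=True))
disc = torch.sigmoid(torch.randn(2, 256, 256, requires_grad=True))
vcdr = soft_vcdr(cup, disc)            # gradients flow back to both maps
print(vcdr)
```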
|
90
|
Chen D, Ran Ran A, Fang Tan T, Ramachandran R, Li F, Cheung CY, Yousefi S, Tham CCY, Ting DSW, Zhang X, Al-Aswad LA. Applications of Artificial Intelligence and Deep Learning in Glaucoma. Asia Pac J Ophthalmol (Phila) 2023; 12:80-93. [PMID: 36706335 DOI: 10.1097/apo.0000000000000596]
Abstract
Diagnosis and detection of progression of glaucoma remain challenging. Artificial intelligence-based tools have the potential to improve and standardize the assessment of glaucoma, but development of these algorithms is difficult given the multimodal and variable nature of the diagnosis. Currently, most algorithms are focused on a single imaging modality, specifically screening and diagnosis based on fundus photos or optical coherence tomography images. Use of anterior segment optical coherence tomography and goniophotographs is limited. The majority of algorithms designed for disease progression prediction are based on visual fields. No studies in our literature search assessed the use of artificial intelligence for treatment response prediction, and no studies conducted prospective testing of their algorithms. Additional challenges to the development of artificial intelligence-based tools include scarcity of data and a lack of consensus in diagnostic criteria. Although research in the use of artificial intelligence for glaucoma is promising, additional work is needed to develop clinically usable tools.
Affiliation(s)
- Dinah Chen: Department of Ophthalmology, NYU Langone Health, New York City, NY; Genentech Inc, South San Francisco, CA
- An Ran Ran: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China; Lam Kin Chung, Jet King-Shing Ho Glaucoma Treatment And Research Centre, The Chinese University of Hong Kong, Hong Kong, China
- Ting Fang Tan: Singapore Eye Research Institute, Singapore; Singapore National Eye Center, Singapore
- Fei Li: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Carol Y Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China; Lam Kin Chung, Jet King-Shing Ho Glaucoma Treatment And Research Centre, The Chinese University of Hong Kong, Hong Kong, China
- Siamak Yousefi: Department of Ophthalmology, The University of Tennessee Health Science Center, Memphis, TN
- Clement C Y Tham: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China; Lam Kin Chung, Jet King-Shing Ho Glaucoma Treatment And Research Centre, The Chinese University of Hong Kong, Hong Kong, China
- Daniel S W Ting: Singapore Eye Research Institute, Singapore; Singapore National Eye Center, Singapore; Duke-NUS Medical School, National University of Singapore, Singapore
- Xiulan Zhang: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China; Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
|
91
|
Gu R, Zhang J, Wang G, Lei W, Song T, Zhang X, Li K, Zhang S. Contrastive Semi-Supervised Learning for Domain Adaptive Segmentation Across Similar Anatomical Structures. IEEE Trans Med Imaging 2023; 42:245-256. [PMID: 36155435 DOI: 10.1109/tmi.2022.3209798]
Abstract
Convolutional neural networks (CNNs) have achieved state-of-the-art performance in medical image segmentation, yet need plenty of manual annotations for training. Semi-supervised learning (SSL) methods are promising for reducing the annotation requirement, but their performance is still limited when the dataset size and the number of annotated images are small. Leveraging existing annotated datasets with similar anatomical structures to assist training has the potential to improve the model's performance. However, this is further challenged by the cross-anatomy domain shift caused by differing image modalities and even different organs in the target domain. To solve this problem, we propose Contrastive Semi-supervised learning for Cross Anatomy Domain Adaptation (CS-CADA), which adapts a model to segment similar structures in a target domain requiring only limited annotations, by leveraging a set of existing annotated images of similar structures in a source domain. We use Domain-Specific Batch Normalization (DSBN) to individually normalize feature maps for the two anatomical domains, and propose a cross-domain contrastive learning strategy to encourage the extraction of domain-invariant features. These are integrated into a Self-Ensembling Mean-Teacher (SE-MT) framework to exploit unlabeled target-domain images with a prediction consistency constraint. Extensive experiments show that CS-CADA is able to solve the challenging cross-anatomy domain shift problem, achieving accurate segmentation of coronary arteries in X-ray images with the help of retinal vessel images, and of cardiac MR images with the help of fundus images, given only a small number of annotations in the target domain. Our code is available at https://github.com/HiLab-git/DAG4MIA.
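DSBN is simple enough to sketch directly: one BatchNorm layer per anatomical domain, selected by a domain index, while the convolutional weights stay shared. The module below is assumed from the description above rather than taken from the authors' repository.

```python
import torch
import torch.nn as nn

class DSBN(nn.Module):
    """Domain-Specific Batch Normalization: per-domain statistics."""
    def __init__(self, channels, num_domains=2):
        super().__init__()
        self.bns = nn.ModuleList(nn.BatchNorm2d(channels)
                                 for _ in range(num_domains))

    def forward(self, x, domain):
        return self.bns[domain](x)     # normalize with the domain's own BN

feat = torch.randn(4, 32, 64, 64)
dsbn = DSBN(32, num_domains=2)
source_out = dsbn(feat, domain=0)      # e.g., retinal vessel images
target_out = dsbn(feat, domain=1)      # e.g., coronary artery X-rays
```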
|
92
|
Hu S, Liao Z, Zhang J, Xia Y. Domain and Content Adaptive Convolution Based Multi-Source Domain Generalization for Medical Image Segmentation. IEEE Trans Med Imaging 2023; 42:233-244. [PMID: 36155434 DOI: 10.1109/tmi.2022.3210133]
Abstract
The domain gap caused mainly by variable medical image quality poses a major obstacle on the path between training a segmentation model in the lab and applying the trained model to unseen clinical data. To address this issue, domain generalization methods have been proposed; however, they usually use static convolutions and are less flexible. In this paper, we propose a multi-source domain generalization model based on domain and content adaptive convolution (DCAC) for the segmentation of medical images across different modalities. Specifically, we design a domain adaptive convolution (DAC) module and a content adaptive convolution (CAC) module and incorporate both into an encoder-decoder backbone. In the DAC module, a dynamic convolutional head is conditioned on the predicted domain code of the input to make our model adapt to the unseen target domain. In the CAC module, a dynamic convolutional head is conditioned on global image features to make our model adapt to the test image. We evaluated the DCAC model against the baseline and four state-of-the-art domain generalization methods on prostate segmentation, COVID-19 lesion segmentation, and optic cup/optic disc segmentation tasks. Our results not only indicate that the proposed DCAC model outperforms all competing methods on each segmentation task but also demonstrate the effectiveness of the DAC and CAC modules. Code is available at https://git.io/DCAC.
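The conditioned dynamic head can be sketched with the standard per-sample grouped-convolution trick: a small controller maps a conditioning vector (a domain code or pooled image features) to convolution weights applied sample by sample. The sketch below is one generic realization, assumed rather than extracted from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicHead(nn.Module):
    """Per-sample 1x1 conv whose weights are generated from a condition."""
    def __init__(self, in_ch, out_ch, cond_dim, k=1):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        self.controller = nn.Linear(cond_dim, out_ch * in_ch * k * k)

    def forward(self, x, cond):
        b, _, h, w = x.shape
        weight = self.controller(cond).view(b * self.out_ch, self.in_ch,
                                            self.k, self.k)
        # Grouped conv applies each sample's generated kernel to itself.
        out = F.conv2d(x.reshape(1, b * self.in_ch, h, w), weight,
                       groups=b, padding=self.k // 2)
        return out.view(b, self.out_ch, h, w)

x = torch.randn(2, 16, 32, 32)
cond = x.mean(dim=(2, 3))                  # global average pooled features
head = DynamicHead(16, 8, cond_dim=16)
print(head(x, cond).shape)                 # torch.Size([2, 8, 32, 32])
```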
|
93
|
Zhang X, Song J, Wang C, Zhou Z. Convolutional autoencoder joint boundary and mask adversarial learning for fundus image segmentation. Front Hum Neurosci 2022; 16:1043569. [PMID: 36561837 PMCID: PMC9765310 DOI: 10.3389/fnhum.2022.1043569]
Abstract
The precise segmentation of the optic cup (OC) and the optic disc (OD) is important for glaucoma screening. In recent years, medical image segmentation based on convolutional neural networks (CNNs) has achieved remarkable results. However, many traditional CNN methods do not consider the cross-domain problem, i.e., generalization on datasets of different domains. In this paper, we propose a novel unsupervised domain-adaptive segmentation architecture called CAE-BMAL. Firstly, we enhance the source domain with a convolutional autoencoder to improve the generalization ability of the model. Then, we introduce an adversarial learning-based boundary discrimination branch to reduce the impact of the complex environment during segmentation. Finally, we evaluate the proposed method on three datasets: Drishti-GS, RIM-ONE-r3, and REFUGE. The experimental evaluations show that it outperforms most state-of-the-art methods in accuracy and generalization. We further evaluate the cup-to-disc ratio performance in OD and OC segmentation, which indicates its effectiveness for glaucoma discrimination.
Affiliation(s)
- Xu Zhang: Department of Computer Science and Technology, Chongqing University of Posts and Technology, Chongqing, China; Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing, China
- Jiaqi Song: Department of Computer Science and Technology, Chongqing University of Posts and Technology, Chongqing, China; Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing, China
- Chengrui Wang: Chongqing Telecom System Integration Co., Ltd., Chongqing, China
- Zhen Zhou: Tianjin Eye Hospital, Tianjin, China; Tianjin Key Laboratory of Ophthalmology and Vision Science, Tianjin, China; Nankai University Affiliated Eye Hospital, Tianjin, China; Clinical College of Ophthalmology, Tianjin Medical University, Tianjin, China
|
94
|
Alfonso-Francia G, Pedraza-Ortega JC, Badillo-Fernández M, Toledano-Ayala M, Aceves-Fernandez MA, Rodriguez-Resendiz J, Ko SB, Tovar-Arriaga S. Performance Evaluation of Different Object Detection Models for the Segmentation of Optical Cups and Discs. Diagnostics (Basel) 2022; 12:3031. [PMID: 36553037 PMCID: PMC9777130 DOI: 10.3390/diagnostics12123031]
Abstract
Glaucoma is an eye disease that gradually deteriorates vision. Much research focuses on extracting information from the optic disc and optic cup, the structures used for measuring the cup-to-disc ratio. These structures are commonly segmented with deep learning techniques, primarily using encoder-decoder models, which are hard to train and time-consuming. Object detection models using convolutional neural networks can extract features from fundus retinal images with good precision. However, the superiority of one model over another for a specific task is still being determined. The main goal of our approach is to compare object detection model performance for automatically segmenting cups and discs on fundus images. The novelty of this study is its examination of the behavior of different object detection models in the detection and segmentation of the disc and the optic cup (Mask R-CNN, MS R-CNN, CARAFE, Cascade Mask R-CNN, GCNet, SOLO, Point_Rend), evaluated on the Retinal Fundus Images for Glaucoma Analysis (REFUGE) and G1020 datasets. The reported metrics were Average Precision (AP), F1-score, IoU, and AUCPR. Several models achieved the highest AP, a perfect 1.000, when the IoU threshold was set at 0.50 on REFUGE, and the lowest was Cascade Mask R-CNN with an AP of 0.997. On the G1020 dataset, the best model was Point_Rend with an AP of 0.956, and the worst was SOLO with 0.906. It was concluded that the methods reviewed achieved excellent performance with high precision and recall values, showing efficiency and effectiveness. The question of how many images are needed was addressed with an initial value of 100, with excellent results. Data augmentation, multi-scale handling, and anchor box size brought improvements. The capability to translate knowledge from one database to another also showed promising results.
Affiliation(s)
- Gendry Alfonso-Francia: Faculty of Engineering, Autonomous University of Querétaro, Santiago de Querétaro 76010, Mexico; Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK S7N 5A9, Canada
- Mariana Badillo-Fernández: Instituto Mexicano de Oftalmología (IMO) I.A.P., Circuito Exterior Estadio Corregidora sn, Centro Sur, Santiago de Querétaro 76010, Mexico
- Manuel Toledano-Ayala: Faculty of Engineering, Autonomous University of Querétaro, Santiago de Querétaro 76010, Mexico
- Seok-Bum Ko: Department of Electrical and Computer Engineering, University of Saskatchewan, 57 Campus Drive, Saskatoon, SK S7N 5A9, Canada
- Saul Tovar-Arriaga: Faculty of Engineering, Autonomous University of Querétaro, Santiago de Querétaro 76010, Mexico
|
95
|
Haider A, Arsalan M, Park C, Sultan H, Park KR. Exploring deep feature-blending capabilities to assist glaucoma screening. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109918]
|
96
|
Lyu J, Zhang Y, Huang Y, Lin L, Cheng P, Tang X. AADG: Automatic Augmentation for Domain Generalization on Retinal Image Segmentation. IEEE Trans Med Imaging 2022; 41:3699-3711. [PMID: 35862336 DOI: 10.1109/tmi.2022.3193146]
Abstract
Convolutional neural networks have been widely applied to medical image segmentation and have achieved considerable performance. However, the performance may be significantly affected by the domain gap between training data (source domain) and testing data (target domain). To address this issue, we propose a data manipulation based domain generalization method, called Automated Augmentation for Domain Generalization (AADG). Our AADG framework can effectively sample data augmentation policies that generate novel domains and diversify the training set from an appropriate search space. Specifically, we introduce a novel proxy task maximizing the diversity among multiple augmented novel domains as measured by the Sinkhorn distance in a unit sphere space, making automated augmentation tractable. Adversarial training and deep reinforcement learning are employed to efficiently search the objectives. Quantitative and qualitative experiments on 11 publicly accessible fundus image datasets (four for retinal vessel segmentation, four for optic disc and cup (OD/OC) segmentation and three for retinal lesion segmentation) are comprehensively performed. Two OCTA datasets for retinal vasculature segmentation are further included to validate cross-modality generalization. Our proposed AADG exhibits state-of-the-art generalization performance and outperforms existing approaches by considerable margins on retinal vessel, OD/OC and lesion segmentation tasks. The learned policies are empirically validated to be model-agnostic and can transfer well to other models. The source code is available at https://github.com/CRazorback/AADG.
|
97
|
Bragança CP, Torres JM, Soares CPDA, Macedo LO. Detection of Glaucoma on Fundus Images Using Deep Learning on a New Image Set Obtained with a Smartphone and Handheld Ophthalmoscope. Healthcare (Basel) 2022; 10:2345. [PMID: 36553869 PMCID: PMC9778370 DOI: 10.3390/healthcare10122345]
Abstract
Statistics show that an estimated 64 million people worldwide suffer from glaucoma. To aid in the detection of this disease, this paper presents a new public dataset containing eye fundus images that was developed for glaucoma pattern-recognition studies using deep learning (DL). The dataset, denoted Brazil Glaucoma, comprises 2000 images obtained from 1000 volunteers categorized into two groups: those with glaucoma (50%) and those without glaucoma (50%). All images were captured with a smartphone attached to a Welch Allyn panoptic direct ophthalmoscope. Further, a DL approach for the automatic detection of glaucoma was developed using the new dataset as input to a convolutional neural network ensemble model. The accuracy in separating positive from negative glaucoma cases, together with sensitivity and specificity, was calculated using five-fold cross-validation to train and refine the classification model. The results showed that the proposed method can identify glaucoma from eye fundus images with an accuracy of 90.0%. Thus, the combination of fundus images obtained using a smartphone attached to a portable panoptic ophthalmoscope with artificial intelligence algorithms yielded satisfactory results in the overall accuracy of glaucoma detection tests. Consequently, the proposed approach can contribute to the development of technologies aimed at mass population screening for the disease.
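The evaluation protocol (stratified five-fold cross-validation over a balanced 1000/1000 label split) is easy to reproduce. In the runnable sketch below, random feature vectors and a logistic-regression classifier stand in for the CNN ensemble's embeddings and classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# 2000 images, balanced 50/50 as in the Brazil Glaucoma dataset; the
# feature vectors are random placeholders standing in for CNN embeddings.
features = np.random.rand(2000, 128)
labels = np.array([0] * 1000 + [1] * 1000)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000),
                         features, labels, cv=cv)
print(f"fold accuracies: {scores}; mean: {scores.mean():.3f}")
```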
Affiliation(s)
- Clerimar Paulo Bragança
- ISUS Unit, Faculdade de Ciência e Tecnologia, Universidade Fernando Pessoa, 4249-004 Porto, Portugal
- Correspondence: ; Tel.: +351-22-507-1300
| | - José Manuel Torres
- ISUS Unit, Faculdade de Ciência e Tecnologia, Universidade Fernando Pessoa, 4249-004 Porto, Portugal
- Artificial Intelligence and Computer Science Laboratory, LIACC, University of Porto, 4100-000 Porto, Portugal
| | - Christophe Pinto de Almeida Soares
- ISUS Unit, Faculdade de Ciência e Tecnologia, Universidade Fernando Pessoa, 4249-004 Porto, Portugal
- Artificial Intelligence and Computer Science Laboratory, LIACC, University of Porto, 4100-000 Porto, Portugal
| | - Luciano Oliveira Macedo
- Department of Ophthalmology, Eye Hospital of Southern Minas Gerais State, R. Joaquim Rosa, 14, Itanhandu 37464-000, MG, Brazil
| |
|
98
|
Zhang F, Li S, Deng J. Unsupervised Domain Adaptation with Shape Constraint and Triple Attention for Joint Optic Disc and Cup Segmentation. Sensors (Basel) 2022; 22:8748. [PMID: 36433345 PMCID: PMC9695107 DOI: 10.3390/s22228748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Revised: 11/06/2022] [Accepted: 11/09/2022] [Indexed: 06/16/2023]
Abstract
Glaucoma has become an important cause of blindness. Although glaucoma cannot yet be cured, early treatment can prevent it from worsening. A reliable way to detect glaucoma is to segment the optic disc and cup and then measure the cup-to-disc ratio (CDR). Many deep neural network models have been developed to segment the optic disc and optic cup automatically to aid diagnosis. However, their performance degrades under domain shift. While many domain-adaptation methods have been explored to address this problem, they are apt to produce malformed segmentation results. This study suggests adjusting the segmentation network with a constrained formulation that embeds domain-invariant prior knowledge about the shape of the segmentation regions. Building on IOSUDA (Input and Output Space Unsupervised Domain Adaptation), a novel unsupervised framework for joint optic cup and disc segmentation with shape constraints is proposed, called SCUDA (Shape-Constrained Unsupervised Domain Adaptation). A novel shape-constrained loss function is proposed that uses domain-invariant prior knowledge about the joint optic cup and optic disc regions of fundus images to constrain the segmentation results during network training. In addition, a convolutional triple attention module is designed to improve the segmentation network; it captures cross-dimensional interactions and provides a rich feature representation that improves segmentation accuracy. Experiments on the RIM-ONE_r3 and Drishti-GS datasets demonstrate that the algorithm outperforms existing approaches for segmenting optic discs and cups.
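The abstract does not spell out the loss, so the following is only one plausible instantiation of a domain-invariant shape prior for joint OD/OC segmentation: a containment term (the cup should lie inside the disc) plus an isoperimetric compactness term for each region. Both terms, the weight lam, and the (B, 1, H, W) tensor layout are assumptions for illustration, not the exact SCUDA formulation.

    import torch

    def containment_loss(p_cup, p_disc):
        # Penalize cup probability mass that falls outside the disc.
        return torch.relu(p_cup - p_disc).mean()

    def compactness_loss(p, eps=1e-6):
        # Isoperimetric ratio perimeter^2 / (4*pi*area): 1 for a perfect disc,
        # larger for irregular shapes. Perimeter is a total-variation estimate.
        dy = torch.abs(p[:, :, 1:, :] - p[:, :, :-1, :]).sum(dim=(1, 2, 3))
        dx = torch.abs(p[:, :, :, 1:] - p[:, :, :, :-1]).sum(dim=(1, 2, 3))
        area = p.sum(dim=(1, 2, 3))
        return (((dx + dy) ** 2) / (4 * torch.pi * area + eps)).mean()

    def shape_constrained_loss(p_cup, p_disc, lam=0.1):
        # p_cup, p_disc: predicted probability maps of shape (B, 1, H, W).
        return containment_loss(p_cup, p_disc) + lam * (
            compactness_loss(p_cup) + compactness_loss(p_disc))

A penalty of this kind would be added to the usual segmentation loss, steering the adapted network away from the malformed predictions mentioned above.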
|
99
|
Guo X, Li J, Lin Q, Tu Z, Hu X, Che S. Joint optic disc and cup segmentation using feature fusion and attention. Comput Biol Med 2022; 150:106094. [PMID: 36122442 DOI: 10.1016/j.compbiomed.2022.106094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Revised: 08/18/2022] [Accepted: 09/03/2022] [Indexed: 11/23/2022]
Abstract
Currently, glaucoma is one of the leading causes of irreversible vision loss. It remains incurable, but early treatment can halt the progression of the condition and limit the speed and extent of vision loss. Early detection and treatment are therefore crucial to prevent glaucoma from progressing to blindness. Measuring the Cup-to-Disc Ratio (CDR) via segmentation of the Optic Disc (OD) and Optic Cup (OC) is an effective method for glaucoma diagnosis. Compared with OD segmentation, OC segmentation still faces difficulties in achieving high accuracy. In this paper, a deep learning architecture named FAU-Net (feature fusion and attention U-Net) is proposed for the joint segmentation of the OD and OC. It is an improved architecture based on U-Net. Adding a feature fusion module to U-Net reduces the information lost during feature extraction. Channel and spatial attention mechanisms are combined to highlight features important to the segmentation task and to suppress irrelevant regional features. Finally, a multi-label loss is used to generate the final joint segmentation of the OD and OC. Experimental results show that the proposed FAU-Net outperforms state-of-the-art OD and OC segmentation methods on the Drishti-GS1, REFUGE, RIM-ONE, and ODIR datasets.
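As a rough illustration of combining channel and spatial attention to re-weight U-Net features, the sketch below uses a CBAM-style gate: a shared MLP over globally pooled channel statistics, followed by a 7x7 convolution over pooled spatial maps. The exact FAU-Net blocks may differ; the module name, reduction ratio, and kernel size are assumptions.

    import torch
    import torch.nn as nn

    class ChannelSpatialAttention(nn.Module):
        def __init__(self, channels, reduction=8):
            super().__init__()
            # Shared MLP over pooled channel statistics (channel gate).
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels))
            # 7x7 convolution over pooled spatial maps (spatial gate).
            self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

        def forward(self, x):
            b, c, _, _ = x.shape
            gate = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                                 self.mlp(x.amax(dim=(2, 3))))
            x = x * gate.view(b, c, 1, 1)      # re-weight channels
            s = torch.cat([x.mean(dim=1, keepdim=True),
                           x.amax(dim=1, keepdim=True)], dim=1)
            return x * torch.sigmoid(self.spatial(s))  # re-weight positions

    # e.g. ChannelSpatialAttention(64)(torch.randn(2, 64, 128, 128))
    # returns a re-weighted feature map of the same shape.

For the multi-label output described above, the OD and OC would typically be predicted as two sigmoid channels rather than mutually exclusive classes, since every cup pixel is also a disc pixel.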
Affiliation(s)
- Xiaoxin Guo
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China.
| | - Jiahui Li
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Qifeng Lin
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Zhenchuan Tu
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Xiaoying Hu
- Ophthalmology Department, Bethune First Hospital of Jilin University, Changchun 130021, China
| | - Songtian Che
- Ophthalmology Department, Bethune Second Hospital of Jilin University, Changchun 130041, China
| |
|
100
|
Fang H, Li F, Fu H, Sun X, Cao X, Lin F, Son J, Kim S, Quellec G, Matta S, Shankaranarayana SM, Chen YT, Wang CH, Shah NA, Lee CY, Hsu CC, Xie H, Lei B, Baid U, Innani S, Dang K, Shi W, Kamble R, Singhal N, Wang CW, Lo SC, Orlando JI, Bogunovic H, Zhang X, Xu Y. ADAM Challenge: Detecting Age-Related Macular Degeneration From Fundus Images. IEEE Trans Med Imaging 2022; 41:2828-2847. [PMID: 35507621 DOI: 10.1109/tmi.2022.3172773] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Age-related macular degeneration (AMD) is the leading cause of visual impairment among the elderly worldwide. Early detection of AMD is of great importance, as the vision loss caused by this disease is irreversible and permanent. Color fundus photography is the most cost-effective imaging modality for screening for retinal disorders. Cutting-edge deep-learning-based algorithms have recently been developed to detect AMD automatically from fundus images. However, a comprehensive annotated dataset and standard evaluation benchmarks are still lacking. To address this, we set up the Automatic Detection challenge on Age-related Macular degeneration (ADAM), held as a satellite event of the ISBI 2020 conference. The ADAM challenge consisted of four tasks covering the main aspects of detecting and characterizing AMD from fundus images: detection of AMD, detection and segmentation of the optic disc, localization of the fovea, and detection and segmentation of lesions. As part of the challenge, we released a comprehensive dataset of 1200 fundus images with AMD diagnostic labels, pixel-wise segmentation masks for both the optic disc and AMD-related lesions (drusen, exudates, hemorrhages, and scars, among others), and the coordinates of the macular fovea. A uniform evaluation framework was built to enable a fair comparison of different models on this dataset. During the ADAM challenge, 610 results were submitted for online evaluation, and 11 teams ultimately participated in the onsite challenge. This paper introduces the challenge, the dataset, and the evaluation methods, summarizes the participating methods, and analyzes their results for each task. In particular, we observed that ensembling strategies and the incorporation of clinical domain knowledge were key to improving the performance of the deep learning models.
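A uniform evaluation framework of the kind described would rest on simple per-task metrics. The sketch below shows plausible ones, namely Dice overlap for the segmentation tasks and Euclidean error for fovea localization; the exact challenge formulas and ranking rules may differ, so treat these helpers as assumptions.

    import numpy as np

    def dice_score(pred, gt, eps=1e-7):
        # Dice overlap between two binary segmentation masks.
        pred, gt = pred.astype(bool), gt.astype(bool)
        inter = np.logical_and(pred, gt).sum()
        return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

    def fovea_error(pred_xy, gt_xy):
        # Euclidean distance between predicted and annotated fovea coordinates.
        return float(np.linalg.norm(np.asarray(pred_xy, float) -
                                    np.asarray(gt_xy, float)))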
|