1 | A review of deep learning-based information fusion techniques for multimodal medical image classification. Comput Biol Med 2024; 177:108635. PMID: 38796881. DOI: 10.1016/j.compbiomed.2024.108635.
Abstract
Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of the developments in deep learning-based multimodal fusion for medical classification tasks. We explore the complementary relationships among prevalent clinical modalities and outline three main fusion schemes for multimodal classification networks: input fusion, intermediate fusion (encompassing single-level fusion, hierarchical fusion, and attention-based fusion), and output fusion. By evaluating the performance of these fusion techniques, we provide insight into the suitability of different network architectures for various multimodal fusion scenarios and application domains. Furthermore, we delve into challenges related to network architecture selection, the management of incomplete multimodal data, and the potential limitations of multimodal fusion. Finally, we spotlight the promising future of Transformer-based multimodal fusion techniques and give recommendations for future research in this rapidly evolving field.
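The three fusion schemes differ mainly in where the modality streams merge. As a point of reference, here is a minimal sketch of single-level intermediate fusion for two modalities in PyTorch; the two-branch layout, layer widths and class count are illustrative assumptions, not an architecture from the review.

```python
import torch
import torch.nn as nn

class IntermediateFusionNet(nn.Module):
    """Single-level intermediate fusion: each modality is encoded separately,
    the feature vectors are concatenated, and a shared head classifies."""
    def __init__(self, feat_dim=64, n_classes=2):
        super().__init__()
        def encoder():
            return nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, feat_dim), nn.ReLU())
        self.enc_a = encoder()   # e.g. a CT branch (hypothetical)
        self.enc_b = encoder()   # e.g. an MRI branch (hypothetical)
        self.head = nn.Linear(2 * feat_dim, n_classes)

    def forward(self, xa, xb):
        # Input fusion would instead concatenate xa and xb channel-wise before
        # a single encoder; output fusion would average per-modality predictions.
        return self.head(torch.cat([self.enc_a(xa), self.enc_b(xb)], dim=1))

logits = IntermediateFusionNet()(torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64))
```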
2 | Automated tear film break-up time measurement for dry eye diagnosis using deep learning. Sci Rep 2024; 14:11723. PMID: 38778145. PMCID: PMC11111799. DOI: 10.1038/s41598-024-62636-5.
Abstract
In the realm of ophthalmology, precise measurement of tear film break-up time (TBUT) plays a crucial role in diagnosing dry eye disease (DED). This study introduces an automated approach utilizing artificial intelligence (AI) to mitigate subjectivity and enhance the reliability of TBUT measurement. We employed a dataset of 47 slit lamp videos for development, while a test dataset of 20 slit lamp videos was used to evaluate the proposed approach. The multistep approach to TBUT estimation uses a Dual-Task Siamese Network to classify video frames into tear film breakup or non-breakup categories. A postprocessing step then applies a Gaussian filter to smooth the per-frame breakup/non-breakup predictions. Applying a threshold to the smoothed predictions identifies the initiation of tear film breakup. On the evaluation dataset, the proposed method achieved precise breakup/non-breakup classification of video frames, with an Area Under the Curve of 0.870. At the video level, we observed a strong Pearson correlation coefficient (r) of 0.81 between TBUT assessments conducted using our approach and the ground truth. These findings underscore the potential of AI-based approaches in quantifying TBUT, presenting a promising avenue for advancing diagnostic methodologies in ophthalmology.
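The smoothing-and-thresholding step is simple to reproduce. A minimal NumPy/SciPy sketch, assuming hypothetical per-frame breakup probabilities from the frame classifier; the frame rate, filter width and threshold here are illustrative, not the study's values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def estimate_tbut(breakup_probs, fps=30.0, sigma=5.0, threshold=0.5):
    """Smooth per-frame breakup predictions with a Gaussian filter, then
    return the time (in seconds) of the first frame above the threshold."""
    smoothed = gaussian_filter1d(np.asarray(breakup_probs, dtype=float), sigma)
    above = np.flatnonzero(smoothed >= threshold)
    return above[0] / fps if above.size else None   # None: no breakup detected

probs = [0.1] * 120 + [0.9] * 60   # toy classifier output for a 6 s clip
print(estimate_tbut(probs))        # ~4.0 s
```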
3 | DISCOVER: 2-D multiview summarization of Optical Coherence Tomography Angiography for automatic diabetic retinopathy diagnosis. Artif Intell Med 2024; 149:102803. PMID: 38462293. DOI: 10.1016/j.artmed.2024.102803.
Abstract
Diabetic Retinopathy (DR), an ocular complication of diabetes, is a leading cause of blindness worldwide. Traditionally, DR is monitored using Color Fundus Photography (CFP), a widespread 2-D imaging modality. However, DR classifications based on CFP have poor predictive power, resulting in suboptimal DR management. Optical Coherence Tomography Angiography (OCTA) is a recent 3-D imaging modality offering enhanced structural and functional information (blood flow) with a wider field of view. This paper investigates automatic DR severity assessment using 3-D OCTA. A straightforward solution to this task is a 3-D neural network classifier. However, 3-D architectures have numerous parameters and typically require many training samples. A lighter solution consists of using 2-D neural network classifiers processing 2-D en-face (or frontal) projections and/or 2-D cross-sectional slices. Such an approach mimics the way ophthalmologists analyze OCTA acquisitions: (1) en-face flow maps are often used to detect avascular zones and neovascularization, and (2) cross-sectional slices are commonly analyzed to detect macular edemas, for instance. However, arbitrary data reduction or selection might result in information loss. Two complementary strategies are thus proposed to optimally summarize OCTA volumes with 2-D images: (1) a parametric en-face projection optimized through deep learning and (2) a cross-sectional slice selection process controlled through gradient-based attribution. The full summarization and DR classification pipeline is trained from end to end. The automatic 2-D summary can be displayed in a viewer or printed in a report to support the decision. We show that the proposed 2-D summarization and classification pipeline outperforms direct 3-D classification with the advantage of improved interpretability.
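The first summarization strategy, a parametric en-face projection learned end to end, can be sketched as a depth-wise convex combination of slices. A minimal PyTorch sketch of that idea only; the paper's actual parameterization is not reproduced here.

```python
import torch
import torch.nn as nn

class ParametricEnFaceProjection(nn.Module):
    """Collapses an OCTA volume (B, D, H, W) into a 2-D en-face image via a
    learnable convex combination of depth slices, trainable end to end with
    the downstream classifier."""
    def __init__(self, depth):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(depth))  # one weight per slice

    def forward(self, volume):
        w = torch.softmax(self.logits, dim=0)           # convex weights over depth
        return torch.einsum('d,bdhw->bhw', w, volume)   # 2-D summary image

proj = ParametricEnFaceProjection(depth=64)
summary = proj(torch.randn(2, 64, 128, 128))            # -> (2, 128, 128)
```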
4 | Quantitative gait analysis and prediction using artificial intelligence for patients with gait disorders. Sci Rep 2023; 13:23099. PMID: 38155189. PMCID: PMC10754876. DOI: 10.1038/s41598-023-49883-8.
Abstract
Quantitative Gait Analysis (QGA) is considered an objective measure of gait performance. In this study, we aim to design an artificial intelligence model that can efficiently predict the progression of gait quality using kinematic data obtained from QGA. For this purpose, a gait database collected from 734 patients with gait disorders is used. As the patient walks, kinematic data are collected during the gait session. These data are processed to generate the Gait Profile Score (GPS) for each gait cycle. Tracking potential GPS variations enables detecting changes in gait quality. Our work is thus driven by the prediction of such future variations. Two approaches were considered: signal-based and image-based. The signal-based approach uses raw gait cycles, while the image-based one employs a two-dimensional Fast Fourier Transform (2D FFT) representation of gait cycles. Several architectures were developed, and the obtained Area Under the Curve (AUC) was above 0.72 for both approaches. To the best of our knowledge, our study is the first to apply neural networks to gait prediction tasks.
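The image-based representation can be illustrated by turning a stack of gait cycles into a 2D spectrum image. A minimal NumPy sketch, assuming cycles are resampled to a fixed length and stacked row-wise; the study's exact preprocessing may differ.

```python
import numpy as np

def cycles_to_fft_image(cycles):
    """cycles: (n_cycles, n_samples) array of resampled gait signals.
    Returns the log-magnitude of the centered 2-D FFT, usable as a
    single-channel image input for a CNN."""
    spectrum = np.fft.fftshift(np.fft.fft2(np.asarray(cycles, dtype=float)))
    return np.log1p(np.abs(spectrum))

image = cycles_to_fft_image(np.random.randn(20, 101))  # e.g. 20 cycles, 101 samples
print(image.shape)                                     # (20, 101)
```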
5 | Hybrid Fusion of High-Resolution and Ultra-Widefield OCTA Acquisitions for the Automatic Diagnosis of Diabetic Retinopathy. Diagnostics (Basel) 2023; 13:2770. PMID: 37685306. PMCID: PMC10486731. DOI: 10.3390/diagnostics13172770.
Abstract
Optical coherence tomography angiography (OCTA) can deliver enhanced diagnosis for diabetic retinopathy (DR). This study evaluated a deep learning (DL) algorithm for automatic DR severity assessment using high-resolution and ultra-widefield (UWF) OCTA. Diabetic patients were examined with 6×6 mm² high-resolution OCTA and 15×15 mm² UWF-OCTA using the PLEX® Elite 9000. A novel DL algorithm was trained for automatic DR severity inference using both OCTA acquisitions. The algorithm employed a unique hybrid fusion framework, integrating structural and flow information from both acquisitions. It was trained on data from 875 eyes of 444 patients. Tested on 53 patients (97 eyes), the algorithm achieved a good area under the receiver operating characteristic curve (AUC) for detecting DR (0.8868), moderate non-proliferative DR (0.8276), severe non-proliferative DR (0.8376), and proliferative/treated DR (0.9070). These results significantly outperformed detection with the 6×6 mm² (AUC = 0.8462, 0.7793, 0.7889, and 0.8104, respectively) or 15×15 mm² (AUC = 0.8251, 0.7745, 0.7967, and 0.8786, respectively) acquisitions alone. Thus, combining high-resolution and UWF-OCTA acquisitions holds the potential for improved early and late-stage DR detection, offering a foundation for enhancing DR management and a clear path for future works involving expanded datasets and integrating additional imaging modalities.
6 | Towards population-independent, multi-disease detection in fundus photographs. Sci Rep 2023; 13:11493. PMID: 37460629. DOI: 10.1038/s41598-023-38610-y.
Abstract
Independent validation studies of automatic diabetic retinopathy screening systems have recently shown a drop of screening performance on external data. Beyond diabetic retinopathy, this study investigates the generalizability of deep learning (DL) algorithms for screening various ocular anomalies in fundus photographs, across heterogeneous populations and imaging protocols. The following datasets are considered: OPHDIAT (France, diabetic population), OphtaMaine (France, general population), RIADD (India, general population) and ODIR (China, general population). Two multi-disease DL algorithms were developed: a Single-Dataset (SD) network, trained on the largest dataset (OPHDIAT), and a Multiple-Dataset (MD) network, trained on multiple datasets simultaneously. To assess their generalizability, both algorithms were evaluated both when training and test data originated from overlapping datasets and when they originated from disjoint datasets. The SD network achieved a mean per-disease area under the receiver operating characteristic curve (mAUC) of 0.9571 on OPHDIAT. However, it generalized poorly to the other three datasets (mAUC < 0.9). When all four datasets were involved in training, the MD network significantly outperformed the SD network (p = 0.0058), indicating improved generalizability. However, in leave-one-dataset-out experiments, performance of the MD network was significantly lower on populations unseen during training than on populations involved in training (p < 0.0001), indicating imperfect generalizability.
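The headline metric, mean per-disease AUC (mAUC), averages one-vs-rest AUCs over disease labels. A minimal scikit-learn sketch with hypothetical multi-label arrays:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def mean_per_disease_auc(y_true, y_score):
    """y_true, y_score: (n_examinations, n_diseases) arrays.
    Averages one-vs-rest AUCs over diseases, skipping any disease
    absent from (or always present in) the test set."""
    aucs = [roc_auc_score(y_true[:, d], y_score[:, d])
            for d in range(y_true.shape[1])
            if len(np.unique(y_true[:, d])) == 2]
    return float(np.mean(aucs))

y_true = np.random.randint(0, 2, size=(100, 5))   # toy multi-label ground truth
y_score = np.random.rand(100, 5)                  # toy model scores
print(mean_per_disease_auc(y_true, y_score))
```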
7 | Federated Learning for Diabetic Retinopathy Detection in a Multi-center Fundus Screening Network. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-4. PMID: 38082571. DOI: 10.1109/embc40787.2023.10340772.
Abstract
Federated learning (FL) is a machine learning framework that allows remote clients to collaboratively learn a global model while keeping their training data localized. It has emerged as an effective tool to solve the problem of data privacy protection. In particular, in the medical field, it is gaining relevance for achieving collaborative learning while protecting sensitive data. In this work, we demonstrate the feasibility of FL in the development of a deep learning model for screening diabetic retinopathy (DR) in fundus photographs. To this end, we conduct a simulated FL framework using nearly 700,000 fundus photographs collected from OPHDIAT, a French multi-center screening network for detecting DR. We develop two FL algorithms: 1) a cross-center FL algorithm using data distributed across the OPHDIAT centers and 2) a cross-grader FL algorithm using data distributed across the OPHDIAT graders. We explore and assess different FL strategies and compare them to a conventional learning algorithm, namely centralized learning (CL), where all the data is stored in a centralized repository. For the task of referable DR detection, our simulated FL algorithms achieved similar performance to CL in terms of area under the ROC curve (AUC): AUC = 0.9482 for CL, AUC = 0.9317 for cross-center FL and AUC = 0.9522 for cross-grader FL. Our work indicates that the FL algorithm is a viable and reliable framework that can be applied in a screening network.
Clinical relevance: Given that data sharing is regarded as an essential component of modern medical research, achieving collaborative learning while protecting sensitive data is key.
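The aggregation step at the heart of cross-center training can be illustrated with classical FedAvg, where the server averages client parameters weighted by local dataset size. A minimal PyTorch sketch under that assumption; the specific FL strategies compared in the paper are not detailed here.

```python
import torch

def fedavg(state_dicts, n_samples):
    """Average model parameters from several clients (e.g. screening centers),
    weighting each client by its number of training samples. All parameters
    are averaged as floats, which is acceptable for this sketch."""
    weights = torch.tensor(n_samples, dtype=torch.float)
    weights = weights / weights.sum()
    return {key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
            for key in state_dicts[0]}

# Usage (hypothetical): after one local round on each center,
# global_state = fedavg([m.state_dict() for m in local_models], [12000, 8500, 4200])
```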
8 | Automation of dry eye disease quantitative assessment: A review. Clin Exp Ophthalmol 2022; 50:653-666. PMID: 35656580. PMCID: PMC9542292. DOI: 10.1111/ceo.14119.
Abstract
Dry eye disease (DED) is a common eye condition worldwide and a primary reason for visits to the ophthalmologist. DED diagnosis is performed through a combination of tests, some of which are unfortunately invasive, non-reproducible and inaccurate. The following review describes methods that diagnose and measure the extent of eye dryness, enabling clinicians to quantify its severity. Our aim with this paper is to review classical methods as well as those that incorporate automation. For four ways of quantifying DED, we take a deeper look at which elements can benefit from automation and at the different ways studies have incorporated it. As in numerous other medical fields, Artificial Intelligence (AI) appears to be the path towards quality DED diagnosis. This review categorises diagnostic methods into the following: classical, semi-automated and promising AI-based automated methods.
9 | Automatic Screening for Ocular Anomalies Using Fundus Photographs. Optom Vis Sci 2022; 99:281-291. PMID: 34897234. DOI: 10.1097/opx.0000000000001845.
Abstract
SIGNIFICANCE Screening for ocular anomalies using fundus photography is key to prevent vision impairment and blindness. With the growing and aging population, automated algorithms that can triage fundus photographs and provide instant referral decisions are relevant to scale up screening and face the shortage of ophthalmic expertise.
PURPOSE This study aimed to develop a deep learning algorithm that detects any ocular anomaly in fundus photographs and to evaluate this algorithm for "normal versus anomalous" eye examination classification in the diabetic and general populations.
METHODS The deep learning algorithm was developed and evaluated in two populations: the diabetic and general populations. Our patient cohorts consist of 37,129 diabetic patients from the OPHDIAT diabetic retinopathy screening network in Paris, France, and 7356 general patients from the OphtaMaine private screening network in Le Mans, France. Each dataset was divided into a development subset and a test subset of more than 4000 examinations each. For ophthalmologist/algorithm comparison, a subset of 2014 examinations from the OphtaMaine test subset was labeled by a second ophthalmologist. First, the algorithm was trained on the OPHDIAT development subset. Then, it was fine-tuned on the OphtaMaine development subset.
RESULTS On the OPHDIAT test subset, the area under the receiver operating characteristic curve for normal versus anomalous classification was 0.9592. On the OphtaMaine test subset, the area under the receiver operating characteristic curve was 0.8347 before fine-tuning and 0.9108 after fine-tuning. On the ophthalmologist/algorithm comparison subset, the second ophthalmologist achieved a specificity of 0.8648 and a sensitivity of 0.6682. For the same specificity, the fine-tuned algorithm achieved a sensitivity of 0.8248.
CONCLUSIONS The proposed algorithm compares favorably with human performance for normal versus anomalous eye examination classification using fundus photography. Artificial intelligence, which previously targeted a few retinal pathologies, can be used to screen for ocular anomalies comprehensively.
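The human/algorithm comparison above fixes the operating point at the ophthalmologist's specificity and reads the algorithm's sensitivity off the ROC curve. A minimal scikit-learn sketch of that operating-point selection, with hypothetical labels and scores:

```python
import numpy as np
from sklearn.metrics import roc_curve

def sensitivity_at_specificity(y_true, y_score, target_specificity):
    """Pick the ROC operating point whose specificity is at least the target
    (e.g. a human grader's), and return the matching sensitivity."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    ok = fpr <= 1.0 - target_specificity   # specificity = 1 - false positive rate
    return tpr[ok].max() if ok.any() else 0.0

y_true = np.random.randint(0, 2, 1000)     # toy anomaly labels
y_score = np.random.rand(1000)             # toy algorithm scores
print(sensitivity_at_specificity(y_true, y_score, 0.8648))
```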
10 | ExplAIn: Explanatory artificial intelligence for diabetic retinopathy diagnosis. Med Image Anal 2021; 72:102118. PMID: 34126549. DOI: 10.1016/j.media.2021.102118.
Abstract
In recent years, Artificial Intelligence (AI) has proven its relevance for medical decision support. However, the "black-box" nature of successful AI algorithms still holds back their widespread deployment. In this paper, we describe an eXplanatory Artificial Intelligence (XAI) that reaches the same level of performance as black-box AI, for the task of classifying Diabetic Retinopathy (DR) severity using Color Fundus Photography (CFP). This algorithm, called ExplAIn, learns to segment and categorize lesions in images; the final image-level classification directly derives from these multivariate lesion segmentations. The novelty of this explanatory framework is that it is trained from end to end, with image supervision only, just like black-box AI algorithms: the concepts of lesions and lesion categories emerge by themselves. For improved lesion localization, foreground/background separation is trained through self-supervision, in such a way that occluding foreground pixels transforms the input image into a healthy-looking image. The advantage of such an architecture is that automatic diagnoses can be explained simply by an image and/or a few sentences. ExplAIn is evaluated at the image level and at the pixel level on various CFP image datasets. We expect this new framework, which jointly offers high classification performance and explainability, to facilitate AI deployment.
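The framework's defining property, an image-level decision derived directly from lesion segmentations, can be caricatured by global pooling over per-pixel lesion maps. A minimal PyTorch sketch of that idea only; it is not the published architecture, and the 1×1-convolution segmenter is a placeholder.

```python
import torch
import torch.nn as nn

class SegmentThenClassify(nn.Module):
    """Per-pixel lesion-category scores are pooled into an image-level
    prediction, so the segmentation maps themselves explain the diagnosis."""
    def __init__(self, n_lesion_types=4, n_grades=5):
        super().__init__()
        self.segmenter = nn.Conv2d(3, n_lesion_types, 1)   # stand-in segmenter
        self.grader = nn.Linear(n_lesion_types, n_grades)

    def forward(self, image):
        lesion_maps = torch.sigmoid(self.segmenter(image))  # (B, L, H, W)
        evidence = lesion_maps.amax(dim=(2, 3))             # strongest evidence per lesion type
        return self.grader(evidence), lesion_maps           # diagnosis + its explanation

logits, maps = SegmentThenClassify()(torch.randn(1, 3, 128, 128))
```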
11 | Towards improved breast mass detection using dual-view mammogram matching. Med Image Anal 2021; 71:102083. PMID: 33979759. DOI: 10.1016/j.media.2021.102083.
Abstract
Breast cancer screening benefits from the visual analysis of multiple views of routine mammograms. As in clinical practice, computer-aided diagnosis (CAD) systems could be enhanced by integrating multi-view information. In this work, we propose a new multi-tasking framework that combines craniocaudal (CC) and mediolateral-oblique (MLO) mammograms for automatic breast mass detection. Rather than addressing mass recognition only, we exploit the multi-tasking properties of deep networks to jointly learn mass matching and classification, towards better detection performance. Specifically, we propose a unified Siamese network that combines patch-level mass/non-mass classification and dual-view mass matching to take full advantage of multi-view information. This model is exploited in a full image detection pipeline based on You-Only-Look-Once (YOLO) region proposals. We carry out exhaustive experiments to highlight the contribution of dual-view matching for both patch-level classification and examination-level detection scenarios. Results demonstrate that mass matching highly improves the full-pipeline detection performance by outperforming conventional single-task schemes, with an Area Under the Curve (AUC) of 94.78% and a classification accuracy of 0.8791. Interestingly, mass classification also improves the performance of mass matching, which proves the complementarity of both tasks. Our method further guides clinicians by providing accurate dual-view mass correspondences, which suggests that it could act as a relevant second opinion for mammogram interpretation and breast cancer diagnosis.
12 | Cascaded multi-scale convolutional encoder-decoders for breast mass segmentation in high-resolution mammograms. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2019:6738-6741. PMID: 31947387. DOI: 10.1109/embc.2019.8857167.
Abstract
This paper addresses breast mass segmentation from high-resolution mammograms. To cope with strong class imbalance, a huge diversity of size, shape, texture and contour, as well as a limited receptive field, mass segmentation is achieved through a multi-scale cascade of deep convolutional encoder-decoders without any pre-detection scheme. Multi-scale information is integrated using auto-context to make long-range spatial context arising from lower scales impact training at higher resolution. The pipeline is trained end-to-end to benefit from the simultaneous segmentation refinement performed at each level. It incorporates transfer learning and fine-tuning from DDSM-CBIS to INbreast datasets to further improve mass delineations. The comprehensive evaluation provided for high-resolution INbreast images highlights promising model generalizability against standard encoder-decoder strategies.
13 | Automatic detection of rare pathologies in fundus photographs using few-shot learning. Med Image Anal 2020; 61:101660. PMID: 32028213. DOI: 10.1016/j.media.2020.101660.
Abstract
In the last decades, large datasets of fundus photographs have been collected in diabetic retinopathy (DR) screening networks. Through deep learning, these datasets were used to train automatic detectors for DR and a few other frequent pathologies, with the goal to automate screening. One challenge limits the adoption of such systems so far: automatic detectors ignore rare conditions that ophthalmologists currently detect, such as papilledema or anterior ischemic optic neuropathy. The reason is that standard deep learning requires too many examples of these conditions. However, this limitation can be addressed with few-shot learning, a machine learning paradigm where a classifier has to generalize to a new category not seen in training, given only a few examples of this category. This paper presents a new few-shot learning framework that extends convolutional neural networks (CNNs), trained for frequent conditions, with an unsupervised probabilistic model for rare condition detection. It is based on the observation that CNNs often perceive photographs containing the same anomalies as similar, even though these CNNs were trained to detect unrelated conditions. This observation was based on the t-SNE visualization tool, which we decided to incorporate in our probabilistic model. Experiments on a dataset of 164,660 screening examinations from the OPHDIAT screening network show that 37 conditions, out of 41, can be detected with an area under the ROC curve (AUC) greater than 0.8 (average AUC: 0.938). In particular, this framework significantly outperforms other frameworks for detecting rare conditions, including multitask learning, transfer learning and Siamese networks, another few-shot learning solution. We expect these richer predictions to trigger the adoption of automated eye pathology screening, which will revolutionize clinical practice in ophthalmology.
14 | CATARACTS: Challenge on automatic tool annotation for cataRACT surgery. Med Image Anal 2018; 52:24-41. PMID: 30468970. DOI: 10.1016/j.media.2018.11.008.
Abstract
Surgical tool detection is attracting increasing attention from the medical image analysis community. The goal generally is not to precisely locate tools in images, but rather to indicate which tools are being used by the surgeon at each instant. The main motivation for annotating tool usage is to design efficient solutions for surgical workflow analysis, with potential applications in report generation, surgical training and even real-time decision support. Most existing tool annotation algorithms focus on laparoscopic surgeries. However, with 19 million interventions per year, the most common surgical procedure in the world is cataract surgery. The CATARACTS challenge was organized in 2017 to evaluate tool annotation algorithms in the specific context of cataract surgery. It relies on more than nine hours of videos, from 50 cataract surgeries, in which the presence of 21 surgical tools was manually annotated by two experts. With 14 participating teams, this challenge can be considered a success. As might be expected, the submitted solutions are based on deep learning. This paper thoroughly evaluates these solutions: in particular, the quality of their annotations is compared to that of human interpretations. Next, lessons learnt from the differential analysis of these solutions are discussed. We expect that they will guide the design of efficient surgery monitoring tools in the near future.
15 | A Comparative Evaluation of a New Generation of Diffractive Trifocal and Extended Depth of Focus Intraocular Lenses. J Refract Surg 2018; 34:507-514. PMID: 30089179. DOI: 10.3928/1081597x-20180530-02.
Abstract
PURPOSE To evaluate and compare the performance of two diffractive trifocal and one extended depth of focus (EDOF) intraocular lenses (IOLs).
METHODS In this 6-month, single-center, prospective, randomized, comparative study, patients undergoing routine cataract surgery were randomized to receive one of two trifocal IOLs (AcrySof IQ PanOptix; Alcon Laboratories, Inc., Fort Worth, TX, or FineVision Micro F; PhysIOL SA, Liège, Belgium) or an EDOF IOL (TECNIS Symfony; Abbott Medical Optics, Inc., Abbott Park, IL). There were 20 patients in each group. The primary outcome was binocular and monocular uncorrected distance (UDVA), intermediate (UIVA), and near (UNVA) visual acuity. The secondary outcomes were quality of vision and aberrometry.
RESULTS There was no statistically significant difference between groups in either monocular (P = .717) or binocular (P = .837) UDVA. Monocular and binocular UNVA were statistically significantly better for both trifocal lenses than for the EDOF IOL (P = .002). The percentage of patients with J2 UNVA was 52.5% monocularly and 70% binocularly for the TECNIS Symfony IOL, 81.5% monocularly and 100% binocularly for the AcrySof IQ PanOptix IOL, and 82.5% monocularly and 95% binocularly for the FineVision Micro F IOL. There was no significant difference in binocular UIVA between groups; VA was better than 0.6 in 55%, 53%, and 35% of patients with the TECNIS Symfony, AcrySof IQ PanOptix, and FineVision Micro F IOLs, respectively. Overall, 90% of patients achieved spectacle independence. There were no differences in visual symptoms and aberrometry among groups.
CONCLUSIONS All three IOLs provided good visual acuity at all distances, a high percentage of spectacle independence, and little or no impact of visual symptoms on the patients' daily functioning. Near vision was statistically better for both trifocal IOLs compared to the EDOF IOL.
16 | Smart data augmentation for surgical tool detection on the surgical tray. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017:4407-4410. PMID: 29060874. DOI: 10.1109/embc.2017.8037833.
Abstract
In recent years, several algorithms were proposed to monitor a surgery through the automatic analysis of endoscope or microscope videos. This paper aims to improve existing solutions for the automated analysis of cataract surgery, the most common ophthalmic surgery, which is performed under a microscope. Through the analysis of a video recording the surgical tray, it is possible to know which tools are put on or taken from the surgical tray, and therefore which ones are likely being used by the surgeon. Combining these observations with observations from the microscope video should enhance the overall performance of the system. Our contribution is twofold: first, datasets of artificial surgery videos are generated in order to train convolutional neural networks (CNNs) and, second, two classification methods are evaluated to detect the presence of tools in videos. We also assess how the manner of building the artificial datasets impacts tool recognition performance. By design, the proposed artificial datasets greatly reduce the need for fully annotated real datasets and should also produce better performance. Experiments show that one of the proposed classification methods was able to detect most of the targeted tools well.
17 | Surgical tool detection in cataract surgery videos through multi-image fusion inside a convolutional neural network. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2017:2002-2005. PMID: 29060288. DOI: 10.1109/embc.2017.8037244.
Abstract
The automatic detection of surgical tools in surgery videos is a promising solution for surgical workflow analysis. It paves the way to various applications, including surgical workflow optimization, surgical skill evaluation and real-time warning generation. A solution based on convolutional neural networks (CNNs) is proposed in this paper. Unlike existing solutions, the proposed CNN does not analyze images independently: it analyzes sequences of consecutive images. Features extracted from each image by the CNN are fused inside the network using the optical flow. For improved performance, this multi-image fusion strategy is also applied while training the CNN. The proposed framework was evaluated on a dataset of 30 cataract surgery videos (6 hours of videos). Ten tool categories were defined by surgeons. The proposed system was able to detect each of these categories with a high area under the ROC curve (0.953 ≤ Az ≤ 0.987). The proposed detector, based on multi-image fusion, was significantly more sensitive and specific than a similar system analyzing images independently (p = 2.98 × 10⁻⁶ and p = 2.07 × 10⁻³, respectively).
18 | Deep image mining for diabetic retinopathy screening. Med Image Anal 2017; 39:178-193. PMID: 28511066. DOI: 10.1016/j.media.2017.04.012.
Abstract
Deep learning is quickly becoming the leading methodology for medical image analysis. Given a large medical archive, where each image is associated with a diagnosis, efficient pathology detectors or classifiers can be trained with virtually no expert knowledge about the target pathologies. However, deep learning algorithms, including the popular ConvNets, are black boxes: little is known about the local patterns analyzed by ConvNets to make a decision at the image level. A solution is proposed in this paper to create heatmaps showing which pixels in images play a role in the image-level predictions. In other words, a ConvNet trained for image-level classification can be used to detect lesions as well. A generalization of the backpropagation method is proposed in order to train ConvNets that produce high-quality heatmaps. The proposed solution is applied to diabetic retinopathy (DR) screening in a dataset of almost 90,000 fundus photographs from the 2015 Kaggle Diabetic Retinopathy competition and a private dataset of almost 110,000 photographs (e-ophtha). For the task of detecting referable DR, very good detection performance was achieved: Az=0.954 in Kaggle's dataset and Az=0.949 in e-ophtha. Performance was also evaluated at the image level and at the lesion level in the DiaretDB1 dataset, where four types of lesions are manually segmented: microaneurysms, hemorrhages, exudates and cotton-wool spots. For the task of detecting images containing these four lesion types, the proposed detector, which was trained to detect referable DR, outperforms recent algorithms trained to detect those lesions specifically, with pixel-level supervision. At the lesion level, the proposed detector outperforms heatmap generation algorithms for ConvNets. This detector is part of the Messidor® system for mobile eye pathology screening. Because it does not rely on expert knowledge or manual segmentation for detecting relevant patterns, the proposed solution is a promising image mining tool, which has the potential to discover new biomarkers in images.
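The heatmap idea, propagating an image-level prediction back to the pixels that produced it, can be approximated with plain input gradients; the paper's generalization of backpropagation produces higher-quality maps than this minimal PyTorch sketch (the ResNet here is just a stand-in classifier).

```python
import torch
import torchvision.models as models

def saliency_heatmap(model, image, class_idx):
    """Backpropagate an image-level score to the input pixels and keep the
    gradient magnitude as a coarse lesion heatmap."""
    model.eval()
    x = image.clone().requires_grad_(True)
    score = model(x.unsqueeze(0))[0, class_idx]
    score.backward()
    return x.grad.abs().max(dim=0).values   # (H, W) heatmap over channels

model = models.resnet18(weights=None)        # untrained stand-in classifier
heatmap = saliency_heatmap(model, torch.randn(3, 224, 224), class_idx=1)
```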
19 | Mapping the retinas of a patient using a mixed set of fundus photographs from both eyes. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2016:3239-3242. PMID: 28268998. DOI: 10.1109/embc.2016.7591419.
Abstract
With the increased prevalence of retinal pathologies, automating the detection and progression measurement of these pathologies is becoming more and more relevant. Color fundus photography is the leading modality for assessing retinal pathologies. Because eye fundus cameras have a limited field of view, multiple photographs are taken from each retina during an eye fundus examination. However, operators usually do not indicate which photographs are from the left retina and which ones are from the right retina. This paper presents a novel algorithm that automatically assigns each photograph to one retina and builds a composite image (or "mosaic") per retina, which is expected to push the performance of automated diagnosis forward. The algorithm starts by jointly forming two mosaics, one per retina, using a novel graph theoretic approach. Then, in order to determine which mosaic corresponds to the left retina and which one corresponds to the right retina, two retinal landmarks are detected robustly in each mosaic: the main vessel arch surrounding the macula and the optic disc. The laterality of each mosaic derives from their relative location. Experiments on 2790 manually annotated images validate the very good performance of the proposed framework even for highly pathological images.
20 |
21 | Target Properties Effects on Central versus Peripheral Vertical Fusion Interaction Tested on a 3D Platform. Curr Eye Res 2016; 42:476-483. PMID: 27419270. DOI: 10.1080/02713683.2016.1196704.
Abstract
PURPOSE We investigated the impact of target properties on vertical fusion amplitude (VFA) using a 3D display platform; the performance of the subjects allowed us to assess how central and peripheral retina regions interact during the fusion process.
MATERIAL AND METHODS Fourteen subjects were involved in the test. VFA was recorded while varying the viewing distance, target complexity, disparity velocity, lighting condition and background luminance. Base-up prisms were introduced to create vertical disparity in the peripheral retinal area, whereas an offset compensation was added in the central area. Data were analyzed in JMP software using t-tests and repeated-measures ANOVA tests.
RESULTS VFA is significantly affected by target properties including viewing distance, target complexity and disparity velocity; the impact of lighting condition and background luminance is not significant. Although the central retina plays a crucial role in the fusion process, peripheral regions also affect fusion performance when the stimulus size on the retina and content disparity values are modified between central and peripheral vision.
CONCLUSION Vertical fusion is affected by various target properties. For the first time, the effects of peripheral vertical disparity direction on central fusion and eye motion response have been explored. Moreover, a quantitative interaction of central and peripheral fusion is observed, which could be applied in clinical measurement of binocular disease concerning central and peripheral vision conflict.
22 | Multiple-Instance Learning for Anomaly Detection in Digital Mammography. IEEE Trans Med Imaging 2016; 35:1604-1614. PMID: 26829783. DOI: 10.1109/tmi.2016.2521442.
Abstract
This paper describes a computer-aided detection and diagnosis system for breast cancer, the most common form of cancer among women, using mammography. The system relies on the Multiple-Instance Learning (MIL) paradigm, which has proven useful for medical decision support in previous works from our team. In the proposed framework, breasts are first partitioned adaptively into regions. Then, features derived from the detection of lesions (masses and microcalcifications), as well as textural features, are extracted from each region and combined in order to classify mammography examinations as "normal" or "abnormal". Whenever an abnormal examination record is detected, the regions that induced that automated diagnosis can be highlighted. Two strategies are evaluated to define this anomaly detector. In the first scenario, manual segmentations of lesions are used to train an SVM that assigns an anomaly index to each region; local anomaly indices are then combined into a global anomaly index. In the second scenario, the local and global anomaly detectors are trained simultaneously, without manual segmentations, using various MIL algorithms (DD, APR, mi-SVM, MI-SVM and MILBoost). Experiments on the DDSM dataset show that the second, weakly-supervised approach surprisingly outperforms the first, strongly-supervised one. This suggests that anomaly detectors can be advantageously trained on large medical image archives without the need for manual segmentation.
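The first scenario, and the MIL assumption underlying both, can be sketched compactly: a per-region anomaly index from an SVM is combined into a global index by max-pooling, so an examination is as anomalous as its most anomalous region. A scikit-learn sketch with hypothetical region features; the actual descriptors and MIL algorithms are those listed above.

```python
import numpy as np
from sklearn.svm import SVC

# Train a local anomaly detector on labeled regions (first scenario).
region_features = np.random.randn(500, 16)      # hypothetical region descriptors
region_labels = np.random.randint(0, 2, 500)    # from manual lesion segmentations
local_detector = SVC().fit(region_features, region_labels)

def examination_anomaly_index(regions):
    """MIL-style combination: the global anomaly index of an examination is
    the maximum of the local anomaly indices of its regions."""
    return local_detector.decision_function(regions).max()

print(examination_anomaly_index(np.random.randn(12, 16)))  # 12 regions in one exam
```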
23 | [Assessment of a 3D digital orthoptic test platform]. J Fr Ophtalmol 2016; 39:441-8. PMID: 27185660. DOI: 10.1016/j.jfo.2016.01.004.
Abstract
PURPOSE To compare standard orthoptic tests with a novel digital 3D orthoptic platform, 3DeltaEasy© from Orthoptica®.
MATERIALS AND METHODS This study tests the 3D digital orthoptics platform 3DeltaEasy© from Orthoptica® and compares it to the corresponding standard orthoptic tests. This platform consists of a computer equipped with dedicated software, a video projector and 3D liquid crystal glasses. Three tests were compared: the Wirt test, the measurement of horizontal and vertical phorias, and the horizontal fusional amplitude in convergence and divergence. A total of 102 subjects, 53 males (52%) and 49 females (48%), aged between 9 and 72 years (mean age 33 ± 16.4 years), were examined at the ophthalmologic department of the Brest Hospital (France) and included in this observational cross-sectional study. Subjects recruited in this study were patients requiring orthoptic screening or therapy. Patients without their optimal visual correction were excluded. All patients underwent both ophthalmological and orthoptic examination, including the Wirt fly stereotest with polarizing spectacles, cover tests to evaluate and measure the horizontal and vertical deviation of the lines of sight, horizontal vergence ranges using a prism bar, and their equivalent tests implemented in the digital 3D orthoptic platform 3DeltaEasy© from Orthoptica®.
RESULTS All data were processed using MedCalc Statistical Software version 14.12.0 (MedCalc Software bvba, Ostend, Belgium). The main result of this study is that 3DeltaEasy© and the classical Wirt test are correlated (Spearman's coefficient of rank correlation: ρ = 0.74; P < 0.0001), cover tests are equivalent for intermediate and far vision (paired t-test; P = 0.46 and P = 0.51), and horizontal and vertical vergence ranges are comparable for distance vision (paired t-test; P = 0.34 and P = 0.94).
CONCLUSION New digital 3D tools could easily substitute for some orthoptic tests with better ergonomics. Eventually, by increasing the number of tests performed, they could substitute for nearly all tests.
24 | Suitability of a Low-Cost, Handheld, Nonmydriatic Retinograph for Diabetic Retinopathy Diagnosis. Transl Vis Sci Technol 2016; 5:16. PMID: 27134775. PMCID: PMC4849542. DOI: 10.1167/tvst.5.2.16.
Abstract
Purpose We assessed the suitability of a low-cost, handheld, nonmydriatic retinograph, namely the Horus DEC 200, for diabetic retinopathy (DR) diagnosis. Two factors were considered: ease of image acquisition and image quality.
Methods One operator acquired fundus photographs from 54 patients using the Horus and AFC-330, a more expensive, nonportable retinograph. Satisfaction surveys were filled out by patients. Then, two retinologists subjectively assessed image quality and graded DR severity in one eye of each patient. Objective image quality indices also were computed.
Results During image acquisitions, patients had difficulty locating the fixation target inside the Horus: by default, 53.7% of them had to fixate external points with the contralateral eye, as opposed to none of them using the AFC-330 (P < 0.0001). This issue impacted the duration of image acquisitions. Images obtained by the Horus were of significantly lower quality according to the experts (P = 0.0002 and P = 0.0004) and to the objective criterion (P < 0.0001). As a result, up to 20.4% of eyes were inadequate for interpretation, as opposed to 9.3% using the AFC-330. However, no significant difference was found in terms of DR severity according to both experts (P = 0.557 and P = 0.156).
Conclusions The Horus can be used to screen DR, but at the cost of longer examination times and higher proportions of patients referred to an ophthalmologist due to inadequate image quality.
Translational Relevance The Horus is adequate to screen DR, for instance in primary care centers or in mobile imaging units.
25 | Automatic detection of referral patients due to retinal pathologies through data mining. Med Image Anal 2015; 29:47-64. PMID: 26774796. DOI: 10.1016/j.media.2015.12.006.
Abstract
With the increased prevalence of retinal pathologies, automating the detection of these pathologies is becoming more and more relevant. In the past few years, many algorithms have been developed for the automated detection of a specific pathology, typically diabetic retinopathy, using eye fundus photography. No matter how good these algorithms are, we believe many clinicians would not use automatic detection tools focusing on a single pathology and ignoring any other pathology present in the patient's retinas. To solve this issue, an algorithm for characterizing the appearance of abnormal retinas, as well as the appearance of the normal ones, is presented. This algorithm does not focus on individual images: it considers examination records consisting of multiple photographs of each retina, together with contextual information about the patient. Specifically, it relies on data mining in order to learn diagnosis rules from characterizations of fundus examination records. The main novelty is that the content of examination records (images and context) is characterized at multiple levels of spatial and lexical granularity: 1) spatial flexibility is ensured by an adaptive decomposition of composite retinal images into a cascade of regions, 2) lexical granularity is ensured by an adaptive decomposition of the feature space into a cascade of visual words. This multigranular representation allows for great flexibility in automatically characterizing normality and abnormality: it is possible to generate diagnosis rules whose precision and generalization ability can be traded off depending on data availability. A variation on usual data mining algorithms, originally designed to mine static data, is proposed so that contextual and visual data at adaptive granularity levels can be mined. This framework was evaluated in e-ophtha, a dataset of 25,702 examination records from the OPHDIAT screening network, as well as in the publicly-available Messidor dataset. It was successfully applied to the detection of patients that should be referred to an ophthalmologist and also to the specific detection of several pathologies.
26 | Automated surgical step recognition in normalized cataract surgery videos. Annu Int Conf IEEE Eng Med Biol Soc 2015; 2014:4647-50. PMID: 25571028. DOI: 10.1109/embc.2014.6944660.
Abstract
Huge amounts of surgical data are recorded during video-monitored surgery. Content-based video retrieval systems intend to reuse these data for computer-aided surgery. In this paper, we focus on real-time recognition of cataract surgery steps: the goal is to retrieve from a database surgery videos that were recorded during the same surgery step. The proposed system relies on motion features for video characterization. Motion features are usually impacted by eye motion or zoom level variations, which are not necessarily relevant for surgery step recognition. Such problems limit the performance of the retrieval system. We therefore propose to refine motion feature extraction by applying pre-processing steps based on a novel pupil center and scale tracking method. These pre-processing steps are evaluated for two different motion features. In this paper, a similarity measure adapted from Piciarelli's video surveillance system is evaluated for the first time on a surgery dataset. This similarity measure provides good results, and for both motion features, the proposed pre-processing steps significantly improved the retrieval performance of the system.
27 | Multimedia data mining for automatic diabetic retinopathy screening. Annu Int Conf IEEE Eng Med Biol Soc 2015; 2013:7144-7. PMID: 24111392. DOI: 10.1109/embc.2013.6611205.
Abstract
This paper presents TeleOphta, an automatic system for screening diabetic retinopathy in teleophthalmology networks. Its goal is to reduce the burden on ophthalmologists by automatically detecting non-referable examination records, i.e. examination records presenting no image quality problems and no pathological signs related to diabetic retinopathy or any other retinal pathology. TeleOphta is an attempt to put into practice years of algorithmic developments from our groups. It combines image quality metrics, specific lesion detectors and a generic pathological pattern miner to process the visual content of eye fundus photographs. This visual information is further combined with contextual data in order to compute an abnormality risk for each examination record. The TeleOphta system was trained and tested on a large dataset of 25,702 examination records from the OPHDIAT screening network in Paris. It was able to automatically detect 68% of the non-referable examination records while achieving the same sensitivity as a second ophthalmologist. This suggests that it could safely reduce the burden on ophthalmologists by 56%.
28 | Real-time task recognition in cataract surgery videos using adaptive spatiotemporal polynomials. IEEE Trans Med Imaging 2015; 34:877-887. PMID: 25373078. DOI: 10.1109/tmi.2014.2366726.
Abstract
This paper introduces a new algorithm for recognizing surgical tasks in real-time in a video stream. The goal is to communicate information to the surgeon in due time during a video-monitored surgery. The proposed algorithm is applied to cataract surgery, which is the most common eye surgery. To compensate for eye motion and zoom level variations, cataract surgery videos are first normalized. Then, the motion content of short video subsequences is characterized with spatiotemporal polynomials: a multiscale motion characterization based on adaptive spatiotemporal polynomials is presented. The proposed solution is particularly suited to characterizing deformable moving objects with fuzzy borders, which are typically found in surgical videos. Given a target surgical task, the system is trained to identify which spatiotemporal polynomials are usually extracted from videos when and only when this task is being performed. These key spatiotemporal polynomials are then searched for in new videos to recognize the target surgical task. For improved performance, the system jointly adapts the spatiotemporal polynomial basis and identifies the key spatiotemporal polynomials using the multiple-instance learning paradigm. The proposed system runs in real-time and outperforms the previous solution from our group, both for surgical task recognition (Az = 0.851 on average, as opposed to Az = 0.794 previously) and for the joint segmentation and recognition of surgical tasks (Az = 0.856 on average, as opposed to Az = 0.832 previously).
29 | Multiple-instance learning for breast cancer detection in mammograms. Annu Int Conf IEEE Eng Med Biol Soc 2015; 2015:7055-7058. PMID: 26737917. DOI: 10.1109/embc.2015.7320017.
Abstract
This paper describes an experimental computer-aided detection and diagnosis system for breast cancer, the most common form of cancer among women, using mammography. The system relies on the Multiple-Instance Learning (MIL) paradigm, which has proven useful for medical decision support in previous works from our team. In the proposed framework, the breasts are first partitioned adaptively into regions. Then, either textural features, or features derived from the detection of masses and microcalcifications, are extracted from each region. Finally, the feature vectors extracted from each region are combined using an MIL algorithm (Citation k-NN or mi-Graph), in order to recognize "normal" mammography examinations or to categorize examinations as "normal", "benign" or "cancer". An accuracy of 91.1% (respectively 62.1%) was achieved for normality recognition (respectively three-class categorization) in a subset of 720 mammograms from the DDSM dataset. The paper also discusses future improvements that will make the most of the MIL paradigm, in order to improve "benign" versus "cancer" discrimination in particular.
30 | Real-time segmentation and recognition of surgical tasks in cataract surgery videos. IEEE Trans Med Imaging 2014; 33:2352-60. PMID: 25055383. DOI: 10.1109/tmi.2014.2340473.
Abstract
In ophthalmology, it is now common practice to record every surgical procedure and to archive the resulting videos for documentation purposes. In this paper, we present a solution to automatically segment and categorize surgical tasks in real-time during the surgery, using the video recording. The goal would be to communicate information to the surgeon in due time, such as recommendations to less experienced surgeons. The proposed solution relies on the content-based video retrieval paradigm: it reuses previously archived videos to automatically analyze the current surgery, by analogy reasoning. Each video is segmented, in real-time, into an alternating sequence of idle phases, during which no clinically-relevant motions are visible, and action phases. As soon as an idle phase is detected, the previous action phase is categorized and the next action phase is predicted. A conditional random field is used for categorization and prediction. The proposed system was applied to the automatic segmentation and categorization of cataract surgery tasks. A dataset of 186 surgeries, performed by ten different surgeons, was manually annotated: ten possibly overlapping surgical tasks were delimited in each surgery. Using the content of action phases and the duration of idle phases as sources of evidence, an average recognition performance of Az = 0.832 ± 0.070 was achieved.
|
31
|
Exudate detection in color retinal images for mass screening of diabetic retinopathy. Med Image Anal 2014; 18:1026-43. [PMID: 24972380 DOI: 10.1016/j.media.2014.05.004] [Citation(s) in RCA: 183] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2013] [Revised: 04/22/2014] [Accepted: 05/07/2014] [Indexed: 11/16/2022]
|
32
|
Real-time recognition of surgical tasks in eye surgery videos. Med Image Anal 2014; 18:579-90. [DOI: 10.1016/j.media.2014.02.007] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2012] [Revised: 02/07/2014] [Accepted: 02/17/2014] [Indexed: 01/23/2023]
|
33
|
Normalizing videos of anterior eye segment surgeries. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2014; 2014:122-125. [PMID: 25569912 DOI: 10.1109/embc.2014.6943544] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
Anterior eye segment surgeries are usually video-recorded. If we are able to efficiently analyze surgical videos in real time, new decision support tools will emerge. The main anatomical landmarks in these videos are the pupil boundaries and the limbus, but segmenting them is challenging due to the variety of colors and textures in the pupil, the iris, the sclera and the lids. In this paper, we present a solution to reliably normalize the center and the scale in videos without explicitly segmenting these landmarks. First, a robust solution to track the pupil center is presented: it uses the fact that the pupil boundary, the limbus and the sclera/lid interface are concentric. Second, a solution to estimate the zoom level is presented: it relies on the illumination pattern reflected on the cornea. The proposed solution was assessed on a dataset of 186 real-life cataract surgery videos. The distance between the true and estimated pupil centers was equal to 8.0 ± 6.9% of the limbus radius. The correlation between the estimated zoom level and the true limbus size in images was high: R = 0.834.
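The sketch below is a deliberately crude proxy for the tracking step: it estimates the pupil center as the centroid of the darkest pixels in a frame. The paper's actual method exploits the concentricity of the pupil boundary, limbus and sclera/lid interface, which this illustration does not implement.

```python
import numpy as np

def rough_pupil_center(gray):
    """Centroid of the darkest 5% of pixels, as a crude pupil-center
    proxy; the paper instead exploits the concentric eye boundaries."""
    dark = gray <= np.percentile(gray, 5)
    ys, xs = np.nonzero(dark)
    return float(xs.mean()), float(ys.mean())

# A synthetic frame with a dark disc (the "pupil") centered at (40, 25).
img = np.full((60, 80), 200.0)
yy, xx = np.mgrid[:60, :80]
img[(xx - 40) ** 2 + (yy - 25) ** 2 < 81] = 10.0
print(rough_pupil_center(img))    # approximately (40.0, 25.0)
```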
|
34
|
Mass segmentation in mammograms by using Bidimensional Empirical Mode Decomposition (BEMD). ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2013; 2013:5441-4. [PMID: 24110967 DOI: 10.1109/embc.2013.6610780] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Breast mass segmentation in mammography plays a crucial role in Computer-Aided Diagnosis (CAD) systems. In this paper, a Bidimensional Empirical Mode Decomposition (BEMD) method is introduced for mass segmentation in mammography images. This method decomposes images into a set of functions named Bidimensional Intrinsic Mode Functions (BIMFs) and a residue. Our approach consists of three steps: 1) regions of interest (ROIs) are identified by iterative thresholding; 2) the ROI contour is extracted from the first BIMF produced by BEMD; 3) the ROI is finally refined using the extracted contour. The proposed approach was tested on the MIAS database, and the results obtained demonstrate its efficacy.
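Step 1 of the pipeline, ROI identification by iterative thresholding, admits a classic formulation; a minimal sketch, assuming the standard ISODATA-style iterative threshold (the paper does not specify its exact variant):

```python
import numpy as np

def iterative_threshold(img, tol=0.5):
    """ISODATA-style iterative threshold: start at the global mean and
    move to the midpoint of the two class means until convergence."""
    t = float(img.mean())
    while True:
        t_new = 0.5 * (img[img <= t].mean() + img[img > t].mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

# Synthetic mammogram-like image: a bright candidate mass on background.
rng = np.random.default_rng(2)
img = rng.normal(60.0, 10.0, (128, 128))
img[40:60, 40:60] += 100.0
t = iterative_threshold(img)
print(round(t, 1), int((img > t).sum()))   # threshold, ROI area (~400 px)
```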
|
35
|
Motion-based video retrieval with application to computer-assisted retinal surgery. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2013; 2012:4962-5. [PMID: 23367041 DOI: 10.1109/embc.2012.6347106] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
In this paper, we address the problem of computer-aided ophthalmic surgery. In particular, a novel Content-Based Video Retrieval (CBVR) system is presented: given a video stream captured by a digital camera monitoring the current surgery, the system retrieves, within digital archives, videos that resemble the current surgery monitoring video. The search results may be used to guide surgeons' decisions, for example to let the surgeon know what a more experienced colleague would do in a similar situation. With this goal, we propose to use motion information contained in the MPEG-4 AVC/H.264 video standard to extract features from videos. We propose two approaches: the first builds a motion histogram for every frame of the compressed video sequence to extract motion direction and intensity statistics; the second combines segmentation and tracking to extract region displacements between consecutive frames and thereby characterize region trajectories. To compare videos, an extension of fast dynamic time warping to multidimensional time series was adopted. The system was applied to a dataset of 69 video-recorded retinal surgery steps. Results are promising: the retrieval efficiency is higher than 69%.
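The comparison step extends dynamic time warping to multidimensional time series. The sketch below shows the plain O(nm) recurrence on multidimensional sequences; the fast approximation used in the paper (e.g., a banded search window) is omitted for clarity.

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two multidimensional
    sequences a (n x d) and b (m x d), plain O(n*m) recurrence."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Compare two motion-histogram sequences of different lengths.
rng = np.random.default_rng(3)
seq_query = rng.random((20, 8))      # 20 frames x 8 motion-direction bins
seq_archive = rng.random((25, 8))
print(round(dtw(seq_query, seq_archive), 3))
```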
|
36
|
Abstract
IMPORTANCE The diagnostic accuracy of computer detection programs has been reported to be comparable to that of specialists and expert readers, but no computer detection programs have been validated in an independent cohort using an internationally recognized diabetic retinopathy (DR) standard. OBJECTIVE To determine the sensitivity and specificity of the Iowa Detection Program (IDP) to detect referable diabetic retinopathy (RDR). DESIGN AND SETTING In primary care DR clinics in France, from January 1, 2005, through December 31, 2010, patients were photographed consecutively, and retinal color images were graded for retinopathy severity according to the International Clinical Diabetic Retinopathy scale and macular edema by 3 masked independent retinal specialists and regraded with adjudication until consensus. The IDP analyzed the same images at a predetermined and fixed set point. We defined RDR as more than mild nonproliferative retinopathy and/or macular edema. PARTICIPANTS A total of 874 people with diabetes at risk for DR. MAIN OUTCOME MEASURES Sensitivity and specificity of the IDP to detect RDR, area under the receiver operating characteristic curve, sensitivity and specificity of the retinal specialists' readings, and mean interobserver difference (κ). RESULTS The RDR prevalence was 21.7% (95% CI, 19.0%-24.5%). The IDP sensitivity was 96.8% (95% CI, 94.4%-99.3%) and specificity was 59.4% (95% CI, 55.7%-63.0%), corresponding to 6 of 874 false-negative results (none met treatment criteria). The area under the receiver operating characteristic curve was 0.937 (95% CI, 0.916-0.959). Before adjudication and consensus, the sensitivity/specificity of the retinal specialists were 0.80/0.98, 0.71/1.00, and 0.91/0.95, and the mean intergrader κ was 0.822. CONCLUSIONS The IDP has high sensitivity and specificity to detect RDR. Computer analysis of retinal photographs for DR and automated detection of RDR can be implemented safely into the DR screening pipeline, potentially improving access to screening and health care productivity and reducing visual loss through early treatment.
|
37
|
Studying disagreements among retinal experts through image analysis. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2013; 2012:5959-62. [PMID: 23367286 DOI: 10.1109/embc.2012.6347351] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
In recent years, many image analysis algorithms have been presented to assist diabetic retinopathy (DR) screening. The goal was usually to detect healthy examination records automatically, in order to reduce the number of records that must be analyzed by retinal experts. In this paper, a novel application is presented: these algorithms are used to 1) discover image characteristics that sometimes cause an expert to disagree with his/her peers and 2) warn the expert whenever these characteristics are detected in an examination record. In a DR screening program, each examination record is analyzed by only one expert; analyzing disagreements among experts is therefore challenging. A statistical framework, based on Parzen windowing and the Patrick-Fischer distance, is presented to solve this problem. Disagreements among eleven experts from the Ophdiat screening program were analyzed, using an archive of 25,702 examination records.
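A minimal sketch of the statistical machinery named above: Parzen-window (kernel) density estimates for the two groups of records, compared with the Patrick-Fischer distance. The one-dimensional feature, the Gaussian kernel and the integration grid are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

def patrick_fischer(x_agree, x_disagree, n_grid=512):
    """Patrick-Fischer distance between Parzen-window density estimates
    of a feature over 'agree' vs 'disagree' examination records."""
    x_agree, x_disagree = np.asarray(x_agree), np.asarray(x_disagree)
    p1 = len(x_agree) / (len(x_agree) + len(x_disagree))  # class prior
    p2 = 1.0 - p1
    k1, k2 = gaussian_kde(x_agree), gaussian_kde(x_disagree)
    lo = min(x_agree.min(), x_disagree.min())
    hi = max(x_agree.max(), x_disagree.max())
    grid = np.linspace(lo, hi, n_grid)
    diff = p1 * k1(grid) - p2 * k2(grid)
    return float(np.sqrt(np.sum(diff ** 2) * (grid[1] - grid[0])))

# A feature (e.g., lesion contrast) measured in both groups of records.
rng = np.random.default_rng(4)
print(round(patrick_fischer(rng.normal(0, 1, 300),
                            rng.normal(1.5, 1, 60)), 4))
```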
|
38
|
Automatic correction of rotating ultrasound biomicroscopy acquisitions for the segmentation of the eye anterior segment. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2013; 2013:1156-1159. [PMID: 24109898 DOI: 10.1109/embc.2013.6609711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
We have developed a rotating 3D probe prototype to acquire the anterior segment of the eye in three dimensions. The acquisition accuracy has to be sufficient to allow automatic segmentation of the acquired data, and thus to generate a 3D structure of the eye from which measurements could be obtained more easily than from 2D images. We have created an image post-processing scheme to compensate for the vibrations and eye movements during acquisition that are associated with increased noise. These tools were applied to 92 volume datasets acquired on 21 patients in pre-operative conditions. Acquisition noise was reduced by 97% in specific conditions with respect to data acquired without correction.
|
39
|
|
40
|
Use of a JPEG-2000 Wavelet Compression Scheme for Content-Based Ophthalmologic Retinal Image Retrieval. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2012; 2005:4010-3. [PMID: 17281111 DOI: 10.1109/iembs.2005.1615341] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
In this paper, we propose a content-based image retrieval method for diagnosis aid in diabetic retinopathy. We characterize images without extracting significant features: histograms obtained from images compressed with the JPEG-2000 wavelet scheme are used to build signatures. The search is carried out by calculating signature distances between the query and database images, using a weighted distance between histograms. Retrieval efficiency is given for different standard types of JPEG-2000 wavelets and for different values of the histogram weights. A classified diabetic retinopathy image database was built to allow algorithm testing. On this image database, results are promising: the retrieval efficiency is higher than 70% for some lesion types.
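A minimal sketch of the signature idea, with PyWavelets standing in for an actual JPEG-2000 codec (bior4.4 is the 9/7 filter used by lossy JPEG-2000 coding): per-subband coefficient histograms are concatenated into a signature and compared with a weighted L1 distance. The bin count, range and weights are illustrative assumptions.

```python
import numpy as np
import pywt

def wavelet_histogram_signature(img, wavelet="bior4.4", levels=3, bins=32):
    """Signature = concatenated per-subband histograms of wavelet
    coefficients (bior4.4 is the 9/7 filter of lossy JPEG-2000)."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=levels)
    sig = []
    for detail in coeffs[1:]:                    # (cH, cV, cD) per level
        for band in detail:
            # Fixed range keeps bins aligned across images (illustrative).
            h, _ = np.histogram(band, bins=bins, range=(-2, 2), density=True)
            sig.append(h)
    return np.concatenate(sig)

def weighted_l1(sig_a, sig_b, weights=None):
    """Weighted histogram distance (uniform weights by default)."""
    w = np.ones_like(sig_a) if weights is None else np.asarray(weights)
    return float(np.sum(w * np.abs(sig_a - sig_b)))

# Rank a reference image against a query by signature distance.
rng = np.random.default_rng(5)
query, ref = rng.random((128, 128)), rng.random((128, 128))
print(round(weighted_l1(wavelet_histogram_signature(query),
                        wavelet_histogram_signature(ref)), 3))
```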
|
41
|
A low-distortion and reversible watermark: application to angiographic images of the retina. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2012; 2005:2224-7. [PMID: 17282674 DOI: 10.1109/iembs.2005.1616905] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Medical image security can be enhanced using watermarking, which embeds protection information as a digital signature by modifying the pixel gray levels of the image. In this paper, we propose a reversible watermarking scheme which guarantees that, once the embedded message has been read, the alterations introduced during the insertion process can be removed from the image, restoring the original pixel gray levels. The proposed approach relies on an estimate of the image signal that is invariant to the insertion process, and makes it possible to introduce a very slight watermark: in fact, the insertion process adds or subtracts at least one gray level to the pixels of the original image. Depending on the image to be watermarked (in our case, angiographic images of the retina), it is expected that such an alteration will not have any impact on diagnosis quality, and consequently that the watermark can be kept within the image while it is being interpreted.
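The scheme below is not the estimation-based method of the paper; it is histogram shifting, a different but classic reversible technique, shown only to illustrate the general idea of embedding by at most one gray level and restoring the image exactly after extraction.

```python
import numpy as np

def hs_embed(img, bits):
    """Embed bits reversibly by histogram shifting: free the bin next to
    the histogram peak, then move 'peak' pixels up by one to encode 1s."""
    hist = np.bincount(img.ravel(), minlength=256)
    peak = int(hist.argmax())
    zero = peak + 1 + int(hist[peak + 1:].argmin())   # empty bin above peak
    assert hist[zero] == 0, "scheme needs an empty histogram bin"
    carriers = np.flatnonzero(img.ravel() == peak)    # capacity = hist[peak]
    assert len(carriers) >= len(bits), "message too long for this image"
    out = img.astype(np.int32)
    out[(out > peak) & (out < zero)] += 1             # free the bin peak + 1
    flat = out.ravel()
    flat[carriers[:len(bits)]] += np.asarray(bits, dtype=np.int32)
    return out.astype(np.uint8), peak, zero

def hs_extract(marked, peak, zero, n_bits):
    """Read the bits back, then undo the shift to restore the original."""
    flat = marked.astype(np.int32).ravel()
    carriers = np.flatnonzero((flat == peak) | (flat == peak + 1))
    bits = (flat[carriers[:n_bits]] == peak + 1).astype(int)
    restored = marked.astype(np.int32)
    restored[(restored > peak) & (restored <= zero)] -= 1
    return bits, restored.astype(np.uint8)

# Round trip on a synthetic 8-bit angiography-like image.
rng = np.random.default_rng(7)
img = rng.integers(20, 80, (64, 64)).astype(np.uint8)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked, peak, zero = hs_embed(img, bits)
read, restored = hs_extract(marked, peak, zero, len(bits))
print(list(read), np.array_equal(restored, img))      # bits, True
```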
|
42
|
Real-time retrieval of similar videos with application to computer-aided retinal surgery. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2012; 2011:4465-8. [PMID: 22255330 DOI: 10.1109/iembs.2011.6091107] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
This paper introduces ongoing research on computer-aided ophthalmic surgery. In particular, a novel Content-Based Video Retrieval (CBVR) system is presented. Its purpose is the following: given a video stream captured by a digital camera monitoring the surgery, the system should retrieve, in real time, similar video subsequences from video archives. In order to retrieve semantically relevant videos, most existing CBVR systems rely on temporally flexible distance measures such as Dynamic Time Warping. These distance measures are slow and therefore do not allow real-time retrieval. In the proposed system, temporal flexibility is introduced in the way video subsequences are characterized, which allows the use of simple and fast distance measures. As a consequence, real-time retrieval of similar video subsequences, among hundreds of thousands of examples, is now possible. Besides, the proposed system is adaptive: a fast training procedure is presented. The system has been successfully applied to automated recognition of retinal surgery steps on a 69-video dataset: areas under the Receiver Operating Characteristic curves range from Az = 0.809 to Az = 0.989.
|
43
|
Fast wavelet-based image characterization for highly adaptive image retrieval. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2012; 21:1613-1623. [PMID: 22194244 DOI: 10.1109/tip.2011.2180915] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
Adaptive wavelet-based image characterizations have been proposed in previous works for content-based image retrieval (CBIR) applications. In those applications, the same wavelet basis was used to characterize each query image: this wavelet basis was tuned to maximize the retrieval performance in a training data set. We take it one step further in this paper: a different wavelet basis is used to characterize each query image. A regression function, tuned to maximize the retrieval performance in the training data set, is used to estimate the best wavelet filter, in terms of expected retrieval performance, for each query image. A simple image characterization, based on the standardized moments of the wavelet coefficient distributions, is presented. An algorithm is proposed to compute this image characterization almost instantly for every possible separable or nonseparable wavelet filter. Therefore, using a different wavelet basis for each query image does not considerably increase computation times. On the other hand, significant retrieval performance increases were obtained on a medical image data set, a texture data set, a face recognition data set, and an object picture data set. This additional flexibility in wavelet adaptation paves the way to relevance feedback on the image characterization itself, and not simply on the way image characterizations are combined.
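The standardized-moment characterization described above is straightforward to sketch: compute the first four standardized moments of each wavelet subband's coefficient distribution. The wavelet, the level count and the use of PyWavelets are illustrative assumptions; the paper's near-instant computation over all candidate filters is not reproduced.

```python
import numpy as np
import pywt
from scipy.stats import skew, kurtosis

def moment_signature(img, wavelet="db2", levels=3):
    """Characterize an image by the standardized moments (mean, std,
    skewness, kurtosis) of each wavelet subband's coefficients."""
    coeffs = pywt.wavedec2(img.astype(float), wavelet, level=levels)
    feats = []
    for detail in coeffs[1:]:                # (cH, cV, cD) per level
        for band in detail:
            c = band.ravel()
            feats += [c.mean(), c.std(), skew(c), kurtosis(c)]
    return np.array(feats)

# A 36-dimensional signature: 3 levels x 3 subbands x 4 moments.
rng = np.random.default_rng(6)
print(moment_signature(rng.random((128, 128))).shape)   # (36,)
```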
|
44
|
Multi-instance and multi-resolution image mining applied to diabetic retinopathy screening [Fouille d'images multi-instance et multi-résolution appliquée au dépistage de la rétinopathie diabétique]. Ing Rech Biomed 2011. [DOI: 10.1016/j.irbm.2011.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/15/2022]
|
45
|
Automated assessment of diabetic retinopathy severity using content-based image retrieval in multimodal fundus photographs. Invest Ophthalmol Vis Sci 2011; 52:8342-8. [PMID: 21896872 DOI: 10.1167/iovs.11-7418] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
PURPOSE Recent studies on diabetic retinopathy (DR) screening in fundus photographs suggest that disagreements between algorithms and clinicians are now comparable to disagreements among clinicians. The purpose of this study is to (1) determine whether this observation also holds for automated DR severity assessment algorithms and (2) show the value of such algorithms in clinical practice. METHODS A dataset of 85 consecutive DR examinations (168 eyes, 1176 multimodal eye fundus photographs) was collected at Brest University Hospital (Brest, France). Two clinicians with different experience levels determined DR severity in each eye, according to the International Clinical Diabetic Retinopathy Disease Severity (ICDRS) scale. Based on Cohen's kappa (κ) measurements, the performance of clinicians at assessing DR severity was compared to that of state-of-the-art content-based image retrieval (CBIR) algorithms from our group. RESULTS At assessing DR severity in each patient, intraobserver agreement was κ = 0.769 for the most experienced clinician. Interobserver agreement between clinicians was κ = 0.526. Interobserver agreement between the most experienced clinician and the most advanced algorithm was κ = 0.592. Moreover, the most advanced algorithm was often able to predict agreements and disagreements between clinicians. CONCLUSIONS Automated DR severity assessment algorithms, trained to imitate experienced clinicians, can be used to predict when young clinicians would agree or disagree with their more experienced colleagues. Such algorithms may thus be used in clinical practice to help validate or invalidate their diagnoses. CBIR algorithms, in particular, may also be used for pooling diagnostic knowledge among peers, with applications in training and in the coordination of clinicians' prescriptions.
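Cohen's kappa, the agreement measure used throughout this study, is simple to compute from a confusion matrix of paired grades; a minimal sketch with illustrative grade values:

```python
import numpy as np

def cohens_kappa(grades_a, grades_b):
    """Cohen's kappa between two graders' DR severity assessments."""
    labels = sorted(set(grades_a) | set(grades_b))
    idx = {l: i for i, l in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)))
    for a, b in zip(grades_a, grades_b):
        cm[idx[a], idx[b]] += 1
    n = cm.sum()
    po = np.trace(cm) / n                        # observed agreement
    pe = (cm.sum(0) * cm.sum(1)).sum() / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)

# Per-eye ICDRS grades from two clinicians (illustrative values).
a = [0, 1, 1, 2, 3, 0, 2, 2, 1, 0]
b = [0, 1, 2, 2, 3, 0, 2, 1, 1, 1]
print(round(cohens_kappa(a, b), 3))              # 0.583
```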
|
46
|
Content Based medical image retrieval based on BEMD: optimization of a similarity metric. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2011; 2010:3069-72. [PMID: 21095736 DOI: 10.1109/iembs.2010.5626134] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Most medical images are now digitized and stored in patient file databases. The challenge is how to use them to acquire knowledge and/or to aid diagnosis. In this paper, we address the challenge of diagnosis aid by Content-Based Image Retrieval (CBIR). We propose to characterize images using the Bidimensional Empirical Mode Decomposition (BEMD): images are decomposed into a set of functions named Bidimensional Intrinsic Mode Functions (BIMFs). Two methods are used to characterize the information content of the BIMFs: Generalized Gaussian density functions (GGD) and the Hilbert-Huang transform (HHT). In order to enhance results, we introduce a similarity metric optimization process: the weighted distances between BIMFs are adapted for each image in the database. Retrieval efficiency is given for different databases, including a diabetic retinopathy database, a mammography database and a faces database. Results are promising: the retrieval efficiency is higher than 95% in some cases.
|
47
|
Case retrieval in medical databases by fusing heterogeneous information. IEEE TRANSACTIONS ON MEDICAL IMAGING 2011; 30:108-118. [PMID: 20693107 DOI: 10.1109/tmi.2010.2063711] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
A novel content-based heterogeneous information retrieval framework, particularly well suited to browsing medical databases and supporting new-generation computer-aided diagnosis (CADx) systems, is presented in this paper. It was designed to retrieve possibly incomplete documents, consisting of several images and semantic information, from a database; more complex data types, such as videos, can also be included in the framework. The proposed retrieval method relies on image processing, in order to characterize each individual image in a document by its digital content, and on information fusion. Once the available images in a query document are characterized, a degree of match between the query document and each reference document stored in the database is defined for each attribute (an image feature or a metadata item). A Bayesian network is used to recover missing information if need be. Finally, two novel information fusion methods are proposed to combine these degrees of match, in order to rank the reference documents by decreasing relevance for the query. In the first method, the degrees of match are fused by the Bayesian network itself. In the second method, they are fused by the Dezert-Smarandache theory: this second approach lets us model our confidence in each source of information (i.e., each attribute) and take it into account in the fusion process, for better retrieval performance. The proposed methods were applied to two heterogeneous medical databases, a diabetic retinopathy database and a mammography screening database, for computer-aided diagnosis. Precisions at five of 0.809 ± 0.158 and 0.821 ± 0.177, respectively, were obtained for these two databases, which is very promising.
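As a deliberately simplified illustration of the fusion step, the sketch below ranks reference documents by a confidence-weighted average of per-attribute degrees of match. It ignores missing attributes and inter-attribute dependencies, which is precisely what the paper's Bayesian-network and Dezert-Smarandache methods are designed to handle.

```python
import numpy as np

def rank_references(match_degrees, confidences):
    """Rank reference documents by a confidence-weighted combination of
    per-attribute degrees of match; a much simpler stand-in for the
    Bayesian-network / Dezert-Smarandache fusion used in the paper."""
    m = np.asarray(match_degrees, dtype=float)   # (n_refs, n_attributes)
    w = np.asarray(confidences, dtype=float)     # confidence per attribute
    scores = (m * w).sum(axis=1) / w.sum()
    return np.argsort(scores)[::-1], scores

# Three reference cases, two image features plus one metadata attribute.
order, scores = rank_references(
    [[0.9, 0.7, 1.0], [0.4, 0.5, 0.0], [0.8, 0.9, 1.0]],
    confidences=[1.0, 0.8, 0.5])
print(order, np.round(scores, 3))                # best match first
```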
|
48
|
Abstract
A novel content-based information retrieval framework, designed to cover several medical applications, is presented in this paper. The presented framework allows the retrieval of possibly incomplete medical cases consisting of several images together with semantic information. It relies on a committee of decision trees, decision support tools well suited to processing this type of information. In our proposed framework, images are characterized by their digital content. The framework was applied to two heterogeneous medical datasets for computer-aided diagnosis: a diabetic retinopathy follow-up dataset (DRD) and a mammography screening dataset (DDSM). Mean precisions among the top five retrieved results of 0.788 ± 0.137 and 0.869 ± 0.161 were obtained on DRD and DDSM, respectively. On DRD, for instance, this improves retrieval performance by half compared with the retrieval of single images.
|
49
|
Wavelet optimization for content-based image retrieval in medical databases. Med Image Anal 2010; 14:227-41. [DOI: 10.1016/j.media.2009.11.004] [Citation(s) in RCA: 140] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2008] [Revised: 11/07/2009] [Accepted: 11/11/2009] [Indexed: 11/26/2022]
|
50
|
Retinopathy online challenge: automatic detection of microaneurysms in digital color fundus photographs. IEEE TRANSACTIONS ON MEDICAL IMAGING 2010; 29:185-195. [PMID: 19822469 DOI: 10.1109/tmi.2009.2033909] [Citation(s) in RCA: 177] [Impact Index Per Article: 12.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
The detection of microaneurysms in digital color fundus photographs is a critical first step in automated screening for diabetic retinopathy (DR), a common complication of diabetes. Numerous methods for this detection task have been published in the past, but none of them have been compared with one another on the same data. In this work, we present the results of the first international microaneurysm detection competition, organized in the context of the Retinopathy Online Challenge (ROC), a multiyear online competition for various aspects of DR detection. For this competition, we compare the results of five different methods, produced by five different teams of researchers, on the same set of data. The evaluation was performed in a uniform manner using an algorithm presented in this work. The set of data used for the competition consisted of 50 training images with an available reference standard and 50 test images for which the reference standard was withheld by the organizers (M. Niemeijer, B. van Ginneken, and M. D. Abràmoff). The results obtained on the test data were submitted through a website, after which standardized evaluation software was used to determine the performance of each of the methods. A human expert detected microaneurysms in the test set to allow comparison with the performance of the automatic methods. The overall results show that microaneurysm detection is a challenging task for both the automatic methods and the human expert. There is room for improvement, as the best-performing system does not reach the performance of the human expert. The data associated with the ROC microaneurysm detection competition will remain publicly available, and the website will continue accepting submissions.
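Lesion detection competitions of this kind are typically evaluated with FROC analysis: lesion sensitivity as a function of the average number of false positives per image. A simplified sketch of that computation, assuming candidates have already been matched to the reference standard (the challenge's exact evaluation software may differ):

```python
import numpy as np

def froc_points(scores, is_tp, n_lesions, n_images):
    """FROC curve: lesion sensitivity vs. average false positives per
    image. `scores` are candidate confidences; `is_tp` marks candidates
    matched to a true microaneurysm (at most one candidate per lesion)."""
    order = np.argsort(scores)[::-1]             # descending confidence
    hits = np.asarray(is_tp, dtype=float)[order]
    sens = np.cumsum(hits) / n_lesions
    fps_per_image = np.cumsum(1.0 - hits) / n_images
    return fps_per_image, sens

# Six candidates over two images that contain three true lesions in total.
fpi, sens = froc_points(scores=[0.9, 0.8, 0.7, 0.6, 0.5, 0.4],
                        is_tp=[1, 0, 1, 0, 0, 1], n_lesions=3, n_images=2)
print(fpi, sens)    # sensitivity 1.0 is reached at 1.5 FPs/image
```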
|