1
Wong CYT, Antaki F, Woodward-Court P, Ong AY, Keane PA. The role of saliency maps in enhancing ophthalmologists' trust in artificial intelligence models. Asia Pac J Ophthalmol (Phila) 2024; 13:100087. PMID: 39069106. DOI: 10.1016/j.apjo.2024.100087.
Abstract
PURPOSE Saliency maps (SMs) allow clinicians to better understand the opaque decision-making process of artificial intelligence (AI) models by visualising the important features responsible for predictions, ultimately improving interpretability and confidence. In this work, we review the use cases for SMs, exploring their impact on clinicians' understanding of and trust in AI models. We use the following ophthalmic conditions as examples: (1) glaucoma, (2) myopia, (3) age-related macular degeneration (AMD), and (4) diabetic retinopathy (DR). METHOD A multi-field search of MEDLINE, Embase, and Web of Science was conducted using specific keywords. Only studies on the use of SMs in glaucoma, myopia, AMD, or DR were considered for inclusion. RESULTS Findings reveal that SMs are often used to validate AI models and advocate for their adoption, potentially leading to biased claims. Studies frequently overlooked the technical limitations of SMs and assessed their quality and relevance only superficially. Uncertainties persist regarding the role of saliency maps in building trust in AI. It is crucial to enhance understanding of SMs' technical constraints and to improve evaluation of their quality, impact, and suitability for specific tasks. Establishing a standardised framework for selecting and assessing SMs, as well as exploring their relationship with other sources of reliability (e.g. safety and generalisability), is essential for enhancing clinicians' trust in AI. CONCLUSION We conclude that SMs, in their current forms, are not beneficial for interpretability and trust-building purposes. Instead, SMs may confer benefits for model debugging, model performance enhancement, and hypothesis testing (e.g. novel biomarkers).
Affiliation(s)
- Fares Antaki
- Institute of Ophthalmology, University College London, London, United Kingdom
- Ariel Yuhan Ong
- Institute of Ophthalmology, University College London, London, United Kingdom
- Pearse A Keane
- Institute of Ophthalmology, University College London, London, United Kingdom
2
Lee DK, Choi YJ, Lee SJ, Kang HG, Park YR. Development of a deep learning model to distinguish the cause of optic disc atrophy using retinal fundus photography. Sci Rep 2024; 14:5079. PMID: 38429319. PMCID: PMC10907364. DOI: 10.1038/s41598-024-55054-0.
Abstract
The differential diagnosis of optic atrophy can be challenging and requires expensive, time-consuming ancillary testing to determine the cause. While Leber's hereditary optic neuropathy (LHON) and optic neuritis (ON) are both clinically significant causes of optic atrophy, both are relatively rare in the general population, which limits the availability of large imaging datasets. This study therefore aimed to develop a deep learning (DL) model based on small datasets that could distinguish the cause of optic disc atrophy using only fundus photography. We retrospectively reviewed fundus photographs of 120 normal eyes, 30 eyes (15 patients) with genetically confirmed LHON, and 30 eyes (26 patients) with ON. Images were split into a training dataset and a test dataset and used for model training with ResNet-18. To visualize the critical regions in retinal photographs that are highly associated with disease prediction, Gradient-weighted Class Activation Mapping (Grad-CAM) was used to generate image-level attention heat maps and to enhance the interpretability of the DL system. In the 3-class classification of normal, LHON, and ON, the area under the receiver operating characteristic curve (AUROC) was 1.0 for normal, 0.988 for LHON, and 0.990 for ON, clearly differentiating each class from the others with an overall accuracy of 0.93. Specifically, when distinguishing between normal and disease cases, the precision, recall, and F1 scores were a perfect 1.0. Furthermore, in differentiating LHON from the other conditions, ON from the others, and LHON from ON, we consistently observed precision, recall, and F1 scores of 0.8. Model performance was maintained even when only the 10% of pixel values identified as important by Grad-CAM were preserved and the rest were masked, followed by retraining and evaluation.
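The Grad-CAM step described in this abstract, weighting each convolutional feature map by the global average of its gradients and applying a ReLU, can be sketched in plain Python. The array values below are illustrative toy data, not from the study:

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heat map from one conv layer's outputs.

    activations: list of K feature maps, each an HxW list of lists.
    gradients:   list of K maps of d(class score)/d(activation),
                 same shapes as `activations`.
    Returns an HxW map: ReLU(sum_k alpha_k * A_k), where alpha_k is
    the global average of the k-th gradient map.
    """
    h, w = len(activations[0]), len(activations[0][0])
    # alpha_k: global-average-pool each channel's gradients
    alphas = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    cam = [[0.0] * w for _ in range(h)]
    for a, alpha in zip(activations, alphas):
        for i in range(h):
            for j in range(w):
                cam[i][j] += alpha * a[i][j]
    # ReLU: keep only regions that positively support the class
    return [[max(0.0, v) for v in row] for row in cam]

# Toy example: two 2x2 channels, one with positive and one with
# negative average gradient
acts = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heat = grad_cam(acts, grads)
```

In a real pipeline the activations and gradients would come from the trained ResNet-18's final convolutional layer, and the heat map would be upsampled to the fundus image resolution.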
Affiliation(s)
- Dong Kyu Lee
- Department of Ophthalmology, Institute of Vision Research, Severance Eye Hospital, Yonsei University College of Medicine, Yonsei-ro 50-1, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Young Jo Choi
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yonsei-ro 50-1, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Seung Jae Lee
- Department of Ophthalmology, Institute of Vision Research, Severance Eye Hospital, Yonsei University College of Medicine, Yonsei-ro 50-1, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Hyun Goo Kang
- Department of Ophthalmology, Institute of Vision Research, Severance Eye Hospital, Yonsei University College of Medicine, Yonsei-ro 50-1, Seodaemun-gu, Seoul, 03722, Republic of Korea
- Yu Rang Park
- Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yonsei-ro 50-1, Seodaemun-gu, Seoul, 03722, Republic of Korea
3
Hasan MM, Phu J, Sowmya A, Meijering E, Kalloniatis M. Artificial intelligence in the diagnosis of glaucoma and neurodegenerative diseases. Clin Exp Optom 2024; 107:130-146. PMID: 37674264. DOI: 10.1080/08164622.2023.2235346.
Abstract
Artificial intelligence is a rapidly expanding field within computer science that encompasses the emulation of human intelligence by machines. Machine learning and deep learning - the two primary data-driven pattern analysis approaches under the umbrella of artificial intelligence - have attracted considerable interest in the last few decades. The evolution of technology has resulted in a substantial amount of artificial intelligence research on ophthalmic and neurodegenerative disease diagnosis using retinal images. Various artificial intelligence-based techniques have been used for diagnostic purposes, including traditional machine learning, deep learning, and their combinations. Presented here is a review of the literature from the last 10 years on this topic, discussing the use of artificial intelligence in analysing data from different modalities, and their combinations, for the diagnosis of glaucoma and neurodegenerative diseases. The performance of published artificial intelligence methods varies due to several factors, yet the results suggest that such methods can potentially facilitate clinical diagnosis. Generally, the accuracy of artificial intelligence-assisted diagnosis ranges from 67% to 98%, and the area under the receiver operating characteristic curve (AUC) ranges from 0.71 to 0.98, which outperforms typical human performance of 71.5% accuracy and 0.86 area under the curve. This indicates that artificial intelligence-based tools can provide clinicians with useful information that would assist in improving diagnosis. The review suggests that there is room for improvement of existing artificial intelligence-based models using retinal imaging modalities before they are incorporated into clinical practice.
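The AUC values quoted throughout these reviews have a useful rank interpretation: AUC equals the probability that a randomly chosen diseased eye receives a higher classifier score than a randomly chosen healthy eye, with ties counted as half. A minimal sketch with made-up scores (not data from any cited study):

```python
def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs the classifier ranks correctly,
    counting ties as half-correct."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical scores for diseased vs. healthy eyes
score = auc([0.9, 0.8, 0.7], [0.6, 0.8, 0.2])
```

An AUC of 1.0 means every diseased eye outscores every healthy eye; 0.5 is chance-level ranking.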
Affiliation(s)
- Md Mahmudul Hasan
- School of Computer Science and Engineering, University of New South Wales, Kensington, New South Wales, Australia
- Jack Phu
- School of Optometry and Vision Science, University of New South Wales, Kensington, Australia
- Centre for Eye Health, University of New South Wales, Sydney, New South Wales, Australia
- School of Medicine (Optometry), Deakin University, Waurn Ponds, Victoria, Australia
- Arcot Sowmya
- School of Computer Science and Engineering, University of New South Wales, Kensington, New South Wales, Australia
- Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Kensington, New South Wales, Australia
- Michael Kalloniatis
- School of Optometry and Vision Science, University of New South Wales, Kensington, Australia
- School of Medicine (Optometry), Deakin University, Waurn Ponds, Victoria, Australia
4
Kurysheva NI, Rodionova OY, Pomerantsev AL, Sharova GA. [Application of artificial intelligence in glaucoma. Part 1. Neural networks and deep learning in glaucoma screening and diagnosis]. Vestn Oftalmol 2024; 140:82-87. PMID: 38962983. DOI: 10.17116/oftalma202414003182.
Abstract
This article reviews the literature on the use of artificial intelligence (AI) for the screening, diagnosis, monitoring, and treatment of glaucoma. The first part of the review describes how AI methods improve the effectiveness of glaucoma screening, and presents technologies that use deep learning, including neural networks, to analyse big data obtained by ocular imaging methods (fundus imaging, optical coherence tomography of the anterior and posterior eye segments, digital gonioscopy, ultrasound biomicroscopy, etc.), including multimodal approaches. The results found in the reviewed literature are contradictory, indicating that improvement of the AI models requires further research and a standardised approach. The use of neural networks for the timely detection of glaucoma based on multimodal imaging will reduce the risk of glaucoma-associated blindness.
Affiliation(s)
- N I Kurysheva
- Medical Biological University of Innovations and Continuing Education of the Federal Biophysical Center named after A.I. Burnazyan, Moscow, Russia
- Ophthalmological Center of the Federal Medical-Biological Agency at the Federal Biophysical Center named after A.I. Burnazyan, Moscow, Russia
- O Ye Rodionova
- N.N. Semenov Federal Research Center for Chemical Physics, Moscow, Russia
- A L Pomerantsev
- N.N. Semenov Federal Research Center for Chemical Physics, Moscow, Russia
- G A Sharova
- Medical Biological University of Innovations and Continuing Education of the Federal Biophysical Center named after A.I. Burnazyan, Moscow, Russia
- OOO Glaznaya Klinika Doktora Belikovoy, Moscow, Russia
5
Jeong J, Yoon W, Lee JG, Kim D, Woo Y, Kim DK, Shin HW. Standardized image-based polysomnography database and deep learning algorithm for sleep-stage classification. Sleep 2023; 46:zsad242. PMID: 37703391. DOI: 10.1093/sleep/zsad242.
Abstract
STUDY OBJECTIVES Polysomnography (PSG) scoring is labor-intensive, subjective, and often ambiguous. Several deep learning (DL) models for automated sleep scoring have recently been developed, but they are tied to a fixed number of input channels and a fixed resolution. In this study, we constructed a standardized image-based PSG dataset to overcome the heterogeneity of raw signal data obtained from various PSG devices and sleep laboratory environments. METHODS All individually exported European Data Format files containing raw signals were converted into images with an annotation file containing demographics, diagnoses, and sleep statistics. An image-based DL model for automatic sleep staging was developed, compared with a signal-based model, and validated on an external dataset. RESULTS We constructed 10,253 image-based PSG datasets using a standardized format. Among these, 7,745 diagnostic PSG recordings were used to develop our DL model. The DL model using the image dataset showed performance similar to the signal-based model for the same subjects. Overall DL accuracy was greater than 80%, even in severe obstructive sleep apnea. Moreover, for the first time in the field of sleep medicine, we demonstrated explainable DL by visualizing key inference regions with Eigen-class activation maps. Furthermore, the DL model achieved relatively good performance on external validation. CONCLUSIONS Our main contribution is demonstrating the availability of a standardized image-based dataset, and highlighting that changing the data sampling rate or number of sensors may not require retraining, although performance decreases slightly as the number of sensors decreases.
Affiliation(s)
- Jaemin Jeong
- Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Jeong-Gun Lee
- Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Dongyoung Kim
- Department of Computer Engineering, School of Software, Hallym University, Chuncheon, Republic of Korea
- Yunhee Woo
- Institute of New Frontier Research, Division of Big Data and Artificial Intelligence, Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Republic of Korea
- Dong-Kyu Kim
- OUaR LaB, Inc, Seoul, Republic of Korea
- Institute of New Frontier Research, Division of Big Data and Artificial Intelligence, Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Republic of Korea
- Department of Otorhinolaryngology-Head and Neck Surgery, Chuncheon Sacred Heart Hospital, Hallym University College of Medicine, Chuncheon, Republic of Korea
- Hyun-Woo Shin
- OUaR LaB, Inc, Seoul, Republic of Korea
- Obstructive Upper Airway Research (OUaR) Laboratory, Department of Pharmacology, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, Republic of Korea
- Cancer Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Sensory Organ Research Institute, Seoul National University College of Medicine, Seoul, Republic of Korea
- Department of Otorhinolaryngology-Head and Neck Surgery, Seoul National University Hospital, Seoul, Republic of Korea
6
Zhang L, Tang L, Xia M, Cao G. The application of artificial intelligence in glaucoma diagnosis and prediction. Front Cell Dev Biol 2023; 11:1173094. PMID: 37215077. PMCID: PMC10192631. DOI: 10.3389/fcell.2023.1173094.
Abstract
Artificial intelligence is a multidisciplinary and collaborative science; deep learning's capacity for image feature extraction and processing gives it a unique advantage in dealing with problems in ophthalmology. Deep learning systems can assist ophthalmologists in diagnosing characteristic fundus lesions in glaucoma, such as retinal nerve fiber layer defects, optic nerve head damage, and optic disc hemorrhage. Early detection of these lesions can help delay structural damage, protect visual function, and reduce visual field loss. The development of deep learning led to the emergence of deep convolutional neural networks, which are pushing the integration of artificial intelligence with testing devices such as visual field meters, fundus imaging, and optical coherence tomography, driving more rapid advances in clinical glaucoma diagnosis and prediction techniques. This article details advances in artificial intelligence combined with visual field testing, fundus photography, and optical coherence tomography in glaucoma diagnosis and prediction, some of which are familiar and some not widely known. It then explores the challenges at this stage and the prospects for future clinical application. In the future, deep cooperation between artificial intelligence and medical technology will make datasets and clinical application rules more standardized, glaucoma diagnosis and prediction tools will be simplified, and multiple ethnic groups will benefit.
Affiliation(s)
- Linyu Zhang
- The Affiliated Eye Hospital of Nanjing Medical University, Nanjing, China
- The Fourth School of Clinical Medicine, Nanjing Medical University, Nanjing, China
- Li Tang
- The Affiliated Eye Hospital of Nanjing Medical University, Nanjing, China
- Min Xia
- The Affiliated Eye Hospital of Nanjing Medical University, Nanjing, China
- The Fourth School of Clinical Medicine, Nanjing Medical University, Nanjing, China
- Guofan Cao
- The Affiliated Eye Hospital of Nanjing Medical University, Nanjing, China
- The Fourth School of Clinical Medicine, Nanjing Medical University, Nanjing, China
7
Cao J, You K, Zhou J, Xu M, Xu P, Wen L, Wang S, Jin K, Lou L, Wang Y, Ye J. A cascade eye diseases screening system with interpretability and expandability in ultra-wide field fundus images: A multicentre diagnostic accuracy study. EClinicalMedicine 2022; 53:101633. PMID: 36110868. PMCID: PMC9468501. DOI: 10.1016/j.eclinm.2022.101633.
Abstract
BACKGROUND Clinical application of artificial intelligence is limited by a lack of interpretability and expandability in complex clinical settings. We aimed to develop an eye disease screening system with improved interpretability and expandability based on lesion-level dissection, and tested the clinical expandability and auxiliary ability of the system. METHODS The four-hierarchical interpretable eye diseases screening system (IEDSS), based on a novel structural pattern named the lesion atlas, was developed to identify 30 eye diseases and conditions using a total of 32,026 ultra-wide field images collected from the Second Affiliated Hospital of Zhejiang University, School of Medicine (SAHZU), the First Affiliated Hospital of University of Science and Technology of China (FAHUSTC), and the Affiliated People's Hospital of Ningbo University (APHNU) in China between November 1, 2016 and February 28, 2022. The performance of IEDSS was compared with that of ophthalmologists and of classic models trained with image-level labels. We further evaluated IEDSS on two external datasets, and tested it in a real-world scenario and on an extended dataset with new phenotypes beyond the training categories. Accuracy (ACC), F1 score, and confusion matrices were calculated to assess the performance of IEDSS. FINDINGS IEDSS reached average ACCs (aACC) of 0.9781 (95% CI 0.9739-0.9824), 0.9660 (95% CI 0.9591-0.9730), and 0.9709 (95% CI 0.9655-0.9763), and frequency-weighted average F1 scores of 0.9042 (95% CI 0.8957-0.9127), 0.8837 (95% CI 0.8714-0.8960), and 0.8874 (95% CI 0.8772-0.8972) on the SAHZU, APHNU, and FAHUSTC datasets, respectively. IEDSS reached a higher aACC (0.9781, 95% CI 0.9739-0.9824) than a multi-class image-level model (0.9398, 95% CI 0.9329-0.9467), a classic multi-label image-level model (0.9278, 95% CI 0.9189-0.9366), a novel multi-label image-level model (0.9241, 95% CI 0.9151-0.9331), and a lesion-level model without AdaBoost (0.9381, 95% CI 0.9299-0.9463). In the real-world scenario, the aACC of IEDSS (0.9872, 95% CI 0.9828-0.9915) was higher than that of a senior ophthalmologist (SO) (0.9413, 95% CI 0.9321-0.9504, p = 0.000) and a junior ophthalmologist (JO) (0.8846, 95% CI 0.8722-0.8971, p = 0.000). IEDSS maintained strong performance (ACC = 0.8560, 95% CI 0.8252-0.8868) compared with the JO (ACC = 0.784, 95% CI 0.7479-0.8201, p = 0.003) and the SO (ACC = 0.8500, 95% CI 0.8187-0.8813, p = 0.789) on the extended dataset. INTERPRETATION IEDSS showed excellent and stable performance in identifying common eye conditions and conditions beyond the training categories. The transparency and expandability of IEDSS could greatly increase its range of clinical application and its practical clinical value, enhancing the efficiency and reliability of clinical practice, especially in remote areas that lack experienced specialists. FUNDING National Natural Science Foundation Regional Innovation and Development Joint Fund (U20A20386); Key Research and Development Program of Zhejiang Province (2019C03020); Clinical Medical Research Centre for Eye Diseases of Zhejiang Province (2021E50007).
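The paper does not state how its 95% confidence intervals on accuracy were computed; for illustration only, a textbook normal-approximation (Wald) interval on a binomial proportion can be sketched as follows, with hypothetical counts:

```python
import math

def accuracy_ci(correct, total, z=1.96):
    """95% normal-approximation (Wald) confidence interval for an
    accuracy treated as a binomial proportion. Illustrative only;
    the cited study's exact CI method is not specified."""
    p = correct / total
    half = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half), min(1.0, p + half)

# Hypothetical counts: 978 of 1000 images classified correctly
lo, hi = accuracy_ci(978, 1000)
```

For proportions near 0 or 1, or small samples, a Wilson or exact (Clopper-Pearson) interval is usually preferred over this approximation.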
Affiliation(s)
- Jing Cao
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Kun You
- Zhejiang Feitu Medical Imaging Co., Ltd, Hangzhou, Zhejiang, China
- Jingxin Zhou
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Mingyu Xu
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Peifang Xu
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Lei Wen
- The First Affiliated Hospital of University of Science and Technology of China, Hefei, Anhui, China
- Shengzhan Wang
- The Affiliated People's Hospital of Ningbo University, Ningbo, Zhejiang, China
- Kai Jin
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Lixia Lou
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Yao Wang
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Juan Ye
- Department of Ophthalmology, the Second Affiliated Hospital of Zhejiang University, College of Medicine, Hangzhou, Zhejiang, China
- Corresponding author at: No. 1 West Lake Avenue, Hangzhou, Zhejiang Province, China, 310009
8
Widen the Applicability of a Convolutional Neural-Network-Assisted Glaucoma Detection Algorithm of Limited Training Images across Different Datasets. Biomedicines 2022; 10:biomedicines10061314. PMID: 35740336. PMCID: PMC9219722. DOI: 10.3390/biomedicines10061314.
Abstract
Automated glaucoma detection using deep learning may increase the diagnostic rate of glaucoma to prevent blindness, but generalizable models are currently unavailable despite the use of huge training datasets. This study aims to evaluate the performance of a convolutional neural network (CNN) classifier trained with a limited number of high-quality fundus images in detecting glaucoma and methods to improve its performance across different datasets. A CNN classifier was constructed using EfficientNet B3 and 944 images collected from one medical center (core model) and externally validated using three datasets. The performance of the core model was compared with (1) the integrated model constructed by using all training images from the four datasets and (2) the dataset-specific model built by fine-tuning the core model with training images from the external datasets. The diagnostic accuracy of the core model was 95.62% but dropped to ranges of 52.5–80.0% on the external datasets. Dataset-specific models exhibited superior diagnostic performance on the external datasets compared to other models, with a diagnostic accuracy of 87.50–92.5%. The findings suggest that dataset-specific tuning of the core CNN classifier effectively improves its applicability across different datasets when increasing training images fails to achieve generalization.
9
10
Wu JH, Nishida T, Weinreb RN, Lin JW. Performances of Machine Learning in Detecting Glaucoma Using Fundus and Retinal Optical Coherence Tomography Images: A Meta-Analysis. Am J Ophthalmol 2022; 237:1-12. PMID: 34942113. DOI: 10.1016/j.ajo.2021.12.008.
Abstract
PURPOSE To evaluate the performance of machine learning (ML) in detecting glaucoma using fundus and retinal optical coherence tomography (OCT) images. DESIGN Meta-analysis. METHODS PubMed and EMBASE were searched on August 11, 2021. A bivariate random-effects model was used to pool ML's diagnostic sensitivity, specificity, and area under the curve (AUC). Subgroup analyses were performed based on ML classifier categories and dataset types. RESULTS One hundred and five studies (3.3%) were retrieved. Seventy-three (69.5%), 30 (28.6%), and 2 (1.9%) studies tested ML using fundus, OCT, and both image types, respectively. Total testing data numbers were 197,174 for fundus and 16,039 for OCT. Overall, ML showed excellent performance for both fundus (pooled sensitivity = 0.92 [95% CI, 0.91-0.93]; specificity = 0.93 [95% CI, 0.91-0.94]; and AUC = 0.97 [95% CI, 0.95-0.98]) and OCT (pooled sensitivity = 0.90 [95% CI, 0.86-0.92]; specificity = 0.91 [95% CI, 0.89-0.92]; and AUC = 0.96 [95% CI, 0.93-0.97]). For fundus, ML performed similarly using all data and external data, whereas the external test result for OCT was less robust (AUC = 0.87). When comparing classifier categories, although support vector machines showed the highest performance (pooled sensitivity, specificity, and AUC ranges, 0.92-0.96, 0.95-0.97, and 0.96-0.99, respectively), results by neural networks and others were still good (pooled sensitivity, specificity, and AUC ranges, 0.88-0.93, 0.90-0.93, and 0.95-0.97, respectively). When analyzed by dataset type, ML demonstrated consistent performance on clinical datasets (fundus AUC = 0.98 [95% CI, 0.97-0.99] and OCT AUC = 0.95 [95% CI, 0.93-0.97]). CONCLUSIONS The performance of ML in detecting glaucoma compares favorably to that of experts and is promising for clinical application. Future prospective studies are needed to better evaluate its real-world utility.
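The meta-analysis pooled estimates with a bivariate random-effects model. As a simplified illustration of the underlying idea, a fixed-effect inverse-variance pooling of per-study sensitivities on the logit scale can be sketched as follows; the study values below are hypothetical, and this sketch omits the between-study and cross-parameter correlation the bivariate model handles:

```python
import math

def pool_logit(props, ns):
    """Fixed-effect inverse-variance pooling of proportions on the
    logit scale: each study is weighted by the inverse of the
    delta-method variance of its logit-transformed proportion."""
    num = den = 0.0
    for p, n in zip(props, ns):
        logit = math.log(p / (1 - p))
        var = 1.0 / (n * p * (1 - p))  # approx. variance of the logit
        weight = 1.0 / var
        num += weight * logit
        den += weight
    pooled_logit = num / den
    return 1.0 / (1.0 + math.exp(-pooled_logit))  # back-transform

# Hypothetical per-study sensitivities and test-set sizes
pooled_sens = pool_logit([0.90, 0.94, 0.92], [200, 150, 400])
```

The pooled estimate always lies between the smallest and largest study values, pulled toward the larger, more precise studies.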
11
Chaurasia AK, Greatbatch CJ, Hewitt AW. Diagnostic Accuracy of Artificial Intelligence in Glaucoma Screening and Clinical Practice. J Glaucoma 2022; 31:285-299. PMID: 35302538. DOI: 10.1097/ijg.0000000000002015.
Abstract
PURPOSE Artificial intelligence (AI) has shown promise as a diagnostic tool for glaucoma detection using imaging modalities, but these tools have yet to be deployed in clinical practice. This meta-analysis determined overall AI performance for glaucoma diagnosis and identified potential factors affecting implementation. METHODS We searched databases (Embase, Medline, Web of Science, and Scopus) for studies that developed or investigated the use of AI for glaucoma detection using fundus and optical coherence tomography (OCT) images. A bivariate random-effects model was used to determine the summary estimates for diagnostic outcomes. The Preferred Reporting Items for Systematic Reviews and Meta-Analysis of Diagnostic Test Accuracy (PRISMA-DTA) extension was followed, and the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool was used for bias and applicability assessment. RESULTS Seventy-nine articles met the inclusion criteria, with a subset of 66 containing adequate data for quantitative analysis. The pooled area under the receiver operating characteristic curve across all studies for glaucoma detection was 96.3%, with a sensitivity of 92.0% (95% confidence interval: 89.0-94.0) and specificity of 94.0% (95% confidence interval: 92.0-95.0). The pooled area under the receiver operating characteristic curve on fundus and OCT images was 96.2% and 96.0%, respectively. Mixed datasets and external data validation had unsatisfactory diagnostic outcomes. CONCLUSION Although AI has the potential to revolutionize glaucoma care, this meta-analysis highlights a number of issues that need to be addressed before such algorithms can be implemented in clinical care. With substantial heterogeneity across studies, many factors were found to affect diagnostic performance. We recommend implementing a standard diagnostic protocol for grading, performing external data validation, and analysing performance across different ethnic groups.
Affiliation(s)
- Abadh K Chaurasia
- Menzies Institute for Medical Research, School of Medicine, University of Tasmania, Tasmania
- Connor J Greatbatch
- Menzies Institute for Medical Research, School of Medicine, University of Tasmania, Tasmania
- Alex W Hewitt
- Menzies Institute for Medical Research, School of Medicine, University of Tasmania, Tasmania
- Centre for Eye Research Australia, University of Melbourne, Melbourne, Australia
12
Li M, Wan C. The use of deep learning technology for the detection of optic neuropathy. Quant Imaging Med Surg 2022; 12:2129-2143. PMID: 35284277. PMCID: PMC8899937. DOI: 10.21037/qims-21-728.
Abstract
The emergence of computer graphics processing units (GPUs), improvements in mathematical models, and the availability of big data have allowed artificial intelligence (AI) to use machine learning and deep learning (DL) technology to achieve robust performance in various fields of medicine. DL systems provide improved capabilities, especially in image recognition and image processing. Recent progress in the curation of AI datasets has stimulated great interest in the development of DL algorithms. Compared with subjective evaluation and other traditional methods, DL algorithms can identify diseases faster and more accurately in diagnostic tests. Medical imaging is of great significance in the clinical diagnosis and individualized treatment of ophthalmic diseases. Based on morphological datasets comprising millions of data points, various image-related diagnostic techniques can now impart high-resolution information on anatomical and functional changes, thereby providing unprecedented insights in ophthalmic clinical practice. As ophthalmology relies heavily on imaging examinations, it was one of the first medical fields to apply DL algorithms in clinical practice. Such algorithms can assist in the analysis of the large amounts of data acquired from auxiliary imaging examinations. In recent years, rapid advancements in imaging technology have facilitated the application of DL in the automatic identification and classification of pathologies characteristic of ophthalmic diseases, thereby providing high-quality diagnostic information. This paper reviews the origins, development, and application of DL technology. The technical and clinical problems associated with building DL systems to meet clinical needs, and the potential challenges of clinical application, are discussed, especially in relation to optic nerve diseases.
Affiliation(s)
- Mei Li
- Department of Ophthalmology, Yanan People’s Hospital, Yanan, China
- Chao Wan
- Department of Ophthalmology, the First Hospital of China Medical University, Shenyang, China
13
Akbar S, Hassan SA, Shoukat A, Alyami J, Bahaj SA. Detection of microscopic glaucoma through fundus images using deep transfer learning approach. Microsc Res Tech 2022; 85:2259-2276. [PMID: 35170136 DOI: 10.1002/jemt.24083] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2021] [Revised: 01/05/2022] [Accepted: 01/27/2022] [Indexed: 11/07/2022]
Abstract
Glaucoma can lead to blindness if it progresses to the point where it damages the eye's optic nerve head. It is not easily detected because early stages are often asymptomatic, but it can be diagnosed using tonometry, ophthalmoscopy, and perimetry. Advances in artificial intelligence have enabled machine learning techniques to support diagnosis at an early stage. Numerous machine learning methods for glaucoma diagnosis have been proposed using different data sets and techniques, but these methods are complex. Although several medical imaging modalities are used for glaucoma screening, fundus imaging is the most widely used screening technique for glaucoma detection. This study presents a novel combination of DenseNet and DarkNet to classify normal and glaucoma-affected fundus images. These frameworks were trained and tested on three data sets: high-resolution fundus (HRF), RIM 1, and ACRIMA. A total of 658 images of healthy eyes and 612 images of glaucoma-affected eyes were used for classification. The fusion of DenseNet and DarkNet outperformed the two individual CNNs, achieving 99.7% accuracy, 98.9% sensitivity, and 100% specificity on the HRF database. For the RIM 1 database, 89.3% accuracy, 93.3% sensitivity, and 88.46% specificity were attained, and for the ACRIMA database, 99% accuracy, 100% sensitivity, and 99% specificity were achieved. The proposed method is therefore robust and efficient, with less computational time and complexity than methods available in the literature.
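The abstract above combines the outputs of two CNNs (DenseNet and DarkNet) to classify fundus images. The paper's exact fusion scheme is not detailed here; a minimal sketch, assuming score-level fusion by averaging the two networks' class probabilities (one common choice), looks like this. The probability vectors `p1`/`p2` are hypothetical placeholders for real network outputs.

```python
import numpy as np

# Sketch of score-level fusion of two classifiers, in the spirit of the
# DenseNet + DarkNet combination described above. Averaging the class
# probabilities of the two networks is an assumed fusion scheme; the
# paper may use a different one.

def fuse_predictions(p_densenet, p_darknet):
    """Average the class-probability vectors of the two networks."""
    return (np.asarray(p_densenet) + np.asarray(p_darknet)) / 2.0

def classify(fused, labels=("normal", "glaucoma")):
    """Pick the class with the highest fused probability."""
    return labels[int(np.argmax(fused))]

# Toy usage: the two (hypothetical) networks disagree in confidence;
# fusion still yields a clear decision.
p1 = [0.30, 0.70]   # hypothetical DenseNet output (normal, glaucoma)
p2 = [0.45, 0.55]   # hypothetical DarkNet output
print(classify(fuse_predictions(p1, p2)))  # prints: glaucoma
```

Averaging probabilities rather than hard labels lets a confident network outvote an uncertain one, which is one plausible reason such fusions can outperform either network alone.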
Affiliation(s)
- Shahzad Akbar
- Riphah College of Computing, Riphah International University, Faisalabad Campus, Faisalabad, Pakistan
- Syed Ale Hassan
- Riphah College of Computing, Riphah International University, Faisalabad Campus, Faisalabad, Pakistan
- Ayesha Shoukat
- Riphah College of Computing, Riphah International University, Faisalabad Campus, Faisalabad, Pakistan
- Jaber Alyami
- Department of Diagnostic Radiology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, 21589, Saudi Arabia; Imaging Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Saeed Ali Bahaj
- MIS Department, College of Business Administration, Prince Sattam Bin Abdulaziz University, Alkharj, 11942, Saudi Arabia
14
Wang Z, Keane PA, Chiang M, Cheung CY, Wong TY, Ting DSW. Artificial Intelligence and Deep Learning in Ophthalmology. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
15
Artificial Intelligence and Deep Learning in Ophthalmology. Artif Intell Med 2021. [DOI: 10.1007/978-3-030-58080-3_200-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
16
Park J, Hwang Y, Nam JH, Oh DJ, Kim KB, Song HJ, Kim SH, Kang SH, Jung MK, Jeong Lim Y. Artificial intelligence that determines the clinical significance of capsule endoscopy images can increase the efficiency of reading. PLoS One 2020; 15:e0241474. [PMID: 33119718 PMCID: PMC7595411 DOI: 10.1371/journal.pone.0241474] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2020] [Accepted: 10/14/2020] [Indexed: 02/07/2023] Open
Abstract
Artificial intelligence (AI), which has demonstrated outstanding achievements in image recognition, can be useful for the tedious task of capsule endoscopy (CE) reading. We aimed to develop a practical AI-based method that can identify various types of lesions and evaluated its effectiveness in clinical settings. A total of 203,244 CE images were collected from multiple centers selected with regional distribution in mind. The AI, based on the Inception-ResNet-V2 model, was trained with images classified into two categories according to their clinical significance. Its performance was evaluated in a comparative test involving two groups of reviewers with different levels of experience. From 210,100 frames of 20 selected CE videos, the AI summarized 67,008 (31.89%) images as containing lesions with a probability of more than 0.8. Using the AI-assisted reading model, reviewers in both groups exhibited higher lesion detection rates than with the conventional reading model (experts: 34.3% to 73.0%, p = 0.029; trainees: 24.7% to 53.1%, p = 0.029). The improved result for trainees was comparable to that of the experts (p = 0.057). Furthermore, the AI-assisted reading model significantly shortened the reading time for trainees (1621.0 to 746.8 min; p = 0.029). We have thus developed an AI-assisted reading model that can detect various lesions and summarize CE images according to clinical significance. AI assistance can increase reviewers' lesion detection rates; trainees in particular improved their reading efficiency through reduced reading time.
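The triage step described above, flagging frames whose lesion probability exceeds 0.8 so that reviewers see a short summary rather than the full video, can be sketched as follows. The `lesion_probability` scorer here is a hypothetical stand-in for the trained Inception-ResNet-V2 classifier, which is not reproduced.

```python
# Sketch of AI-assisted frame triage for capsule endoscopy videos.
# Assumption: lesion_probability() stands in for a trained classifier
# (the paper uses Inception-ResNet-V2); here it is a dummy scorer that
# just reads a precomputed score off each frame record.

def lesion_probability(frame):
    """Hypothetical per-frame lesion score in [0, 1]."""
    return frame["score"]  # a real system would run a CNN on the pixels

def summarize_video(frames, threshold=0.8):
    """Keep only frames whose lesion probability exceeds the threshold."""
    return [f for f in frames if lesion_probability(f) > threshold]

# Toy usage: 5 frames with precomputed scores.
video = [{"id": i, "score": s}
         for i, s in enumerate([0.1, 0.95, 0.4, 0.85, 0.2])]
summary = summarize_video(video)
print([f["id"] for f in summary])  # prints: [1, 3]
```

The threshold trades sensitivity against reading workload: lowering it surfaces more frames for review, which is consistent with the paper's finding that the summary (about 32% of frames) shortened reading time most for trainees.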
Affiliation(s)
- Junseok Park
- Department of Internal Medicine, Digestive Disease Center, Institute for Digestive Research, Soonchunhyang University College of Medicine, Seoul, Republic of Korea
- Youngbae Hwang
- Department of Electronics Engineering, Chungbuk National University, Cheongju, Republic of Korea
- Ji Hyung Nam
- Division of Gastroenterology, Department of Internal Medicine, Dongguk University Ilsan Hospital, Dongguk University College of Medicine, Goyang, Republic of Korea
- Dong Jun Oh
- Division of Gastroenterology, Department of Internal Medicine, Dongguk University Ilsan Hospital, Dongguk University College of Medicine, Goyang, Republic of Korea
- Ki Bae Kim
- Department of Internal Medicine, Chungbuk National University College of Medicine, Cheongju, Republic of Korea
- Hyun Joo Song
- Department of Internal Medicine, Jeju National University School of Medicine, Jeju, Republic of Korea
- Su Hwan Kim
- Department of Internal Medicine, Seoul Metropolitan Government Seoul National University Boramae Medical Center, Seoul, Republic of Korea
- Sun Hyung Kang
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Chungnam National University School of Medicine, Daejeon, Republic of Korea
- Min Kyu Jung
- Division of Gastroenterology and Hepatology, Department of Internal Medicine, Kyungpook National University Hospital, Daegu, Republic of Korea
- Yun Jeong Lim
- Division of Gastroenterology, Department of Internal Medicine, Dongguk University Ilsan Hospital, Dongguk University College of Medicine, Goyang, Republic of Korea