1. Amin J, Shazadi I, Sharif M, Yasmin M, Almujally NA, Nam Y. Localization and grading of NPDR lesions using ResNet-18-YOLOv8 model and informative features selection for DR classification based on transfer learning. Heliyon 2024; 10:e30954. PMID: 38779022; PMCID: PMC11109848; DOI: 10.1016/j.heliyon.2024.e30954.
Abstract
Complications of diabetes lead to diabetic retinopathy (DR), which affects vision. Computerized methods play a significant role in detecting DR at an early stage, before vision is lost. Therefore, a method is proposed in this study that consists of three models for localization, segmentation, and classification. A novel technique is designed by combining pre-trained ResNet-18 and YOLOv8 models, based on the selection of optimal layers, for the localization of DR lesions. The localized images are passed to the designed semantic segmentation model on selected layers and trained with optimized learning hyperparameters. The segmentation model's performance is evaluated on the Grand Challenge IDRiD segmentation dataset. The achieved results, in terms of mean IoU, are 0.95, 0.94, 0.96, 0.94, and 0.95 on the optic disc (OD), soft exudates, hard exudates, haemorrhages, and microaneurysms, respectively. Another classification model is developed in which deep features are derived from the pre-trained EfficientNet-b0 model and optimized using a genetic algorithm (GA) based on the selected parameters for grading of NPDR lesions. The proposed model achieved greater than 98% accuracy, which is superior to previous methods.
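The GA-based feature selection this abstract describes can be sketched as a search over binary feature masks; the bitmask encoding, truncation selection, one-point crossover, and toy fitness function below are illustrative assumptions, not the authors' actual configuration:

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Evolve a binary feature mask that maximizes `fitness` with a simple GA."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]              # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g ^ (rng.random() < p_mut) for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy fitness: reward masks that keep the first 3 features and penalize the rest.
def toy_fitness(mask):
    return sum(mask[:3]) - 0.1 * sum(mask[3:])

best = ga_select(10, toy_fitness)
```

Because the top half of each generation is carried over unchanged, the best mask never regresses from one generation to the next.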
Affiliation(s)
- Javaria Amin
- Department of Computer Science, University of Wah, Wah Cantt, Pakistan
- Irum Shazadi
- Department of Computer Science, University of Wah, Wah Cantt, Pakistan
- Muhammad Sharif
- Department of Computer Science, COMSATS University Islamabad, Wah Cantt, Pakistan
- Mussarat Yasmin
- Department of Computer Science, COMSATS University Islamabad, Wah Cantt, Pakistan
- Nouf Abdullah Almujally
- Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
- Yunyoung Nam
- Department of ICT Convergence, Soonchunhyang University, Asan, 31538, South Korea
2. Alshahrani M, Al-Jabbar M, Senan EM, Ahmed IA, Saif JAM. Hybrid Methods for Fundus Image Analysis for Diagnosis of Diabetic Retinopathy Development Stages Based on Fusion Features. Diagnostics (Basel) 2023; 13:2783. PMID: 37685321; PMCID: PMC10486790; DOI: 10.3390/diagnostics13172783.
Abstract
Diabetic retinopathy (DR) is a complication of diabetes that damages the delicate blood vessels of the retina and can lead to blindness. Ophthalmologists diagnose the retina by imaging the fundus, a process that takes a long time and requires skilled doctors to diagnose the disease and determine its stage. Automatic techniques using artificial intelligence therefore play an important role in analyzing fundus images to detect the stages of DR development. However, diagnosis using artificial intelligence techniques is a difficult task that passes through many stages, and the extraction of representative features is key to reaching satisfactory results. Convolutional Neural Network (CNN) models play an important and distinct role in extracting features with high accuracy. In this study, fundus images were used to detect the developmental stages of DR by two proposed methods, each with two systems. The first proposed method uses GoogLeNet with SVM and ResNet-18 with SVM. The second method uses Feed-Forward Neural Networks (FFNN) based on hybrid features extracted first by GoogLeNet together with a Fuzzy Color Histogram (FCH), Gray-Level Co-occurrence Matrix (GLCM), and Local Binary Pattern (LBP), and then by ResNet-18 with FCH, GLCM, and LBP. All the proposed methods obtained superior results. The FFNN with hybrid features of ResNet-18, FCH, GLCM, and LBP obtained 99.7% accuracy, 99.6% precision, 99.6% sensitivity, 100% specificity, and 99.86% AUC.
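The hybrid-feature idea above (deep CNN embeddings fused with handcrafted texture statistics such as GLCM) can be illustrated roughly as follows; the horizontal-offset co-occurrence matrix, the two derived statistics, and the 16-dimensional stand-in for a CNN embedding are simplifying assumptions, not the paper's exact feature set:

```python
import numpy as np

def glcm_features(img, levels=8):
    """Horizontal gray-level co-occurrence matrix, summarized as contrast + energy."""
    q = (img * (levels - 1)).astype(int)           # quantize intensities to `levels` bins
    glcm = np.zeros((levels, levels))
    for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[i, j] += 1                            # count horizontal neighbor pairs
    glcm /= glcm.sum()                             # normalize to joint probabilities
    idx = np.arange(levels)
    contrast = ((idx[:, None] - idx[None, :]) ** 2 * glcm).sum()
    energy = (glcm ** 2).sum()
    return np.array([contrast, energy])

def fuse(cnn_embedding, img):
    """Concatenate deep and handcrafted features into one vector for a classifier."""
    return np.concatenate([cnn_embedding, glcm_features(img)])

rng = np.random.default_rng(0)
vec = fuse(rng.standard_normal(16), rng.random((32, 32)))
```

The fused vector would then feed an FFNN or SVM, as in the study's second method.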
Affiliation(s)
- Mohammed Alshahrani
- Computer Department, Applied College, Najran University, Najran 66462, Saudi Arabia
- Mohammed Al-Jabbar
- Computer Department, Applied College, Najran University, Najran 66462, Saudi Arabia
- Ebrahim Mohammed Senan
- Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana’a, Yemen
3. Selvachandran G, Quek SG, Paramesran R, Ding W, Son LH. Developments in the detection of diabetic retinopathy: a state-of-the-art review of computer-aided diagnosis and machine learning methods. Artif Intell Rev 2023; 56:915-964. PMID: 35498558; PMCID: PMC9038999; DOI: 10.1007/s10462-022-10185-6.
Abstract
The exponential increase in the number of diabetics around the world has led to an equally large increase in the number of diabetic retinopathy (DR) cases, one of the major complications caused by diabetes. Left unattended, DR worsens vision and can lead to partial or complete blindness. As the number of diabetics continues to increase in the coming years, the number of qualified ophthalmologists needs to grow in tandem to meet the screening demand of the growing diabetic population. This makes it pertinent to automate the DR detection process; a computer-aided diagnosis system has the potential to significantly reduce the burden currently placed on ophthalmologists. Hence, this review paper aims to summarize, classify, and analyze the recent developments in automated DR detection using fundus images from 2015 to date. It offers a thorough review of recent work on DR, which will potentially increase the understanding of recent studies on automated DR detection, particularly those that deploy machine learning algorithms. First, a comprehensive state-of-the-art review of the methods introduced for DR detection is presented, with a focus on machine learning models such as convolutional neural networks (CNN), artificial neural networks (ANN), and various hybrid models. Each model is then classified according to its type (e.g., CNN, ANN, SVM) and its specific task(s) in performing DR detection. In particular, the models that deploy CNNs are further analyzed and classified according to important properties of their respective CNN architectures.
A total of 150 research articles related to these areas, published in the last five years, have been utilized in this review to provide a comprehensive overview of the latest developments in DR detection. Supplementary information: the online version contains supplementary material available at 10.1007/s10462-022-10185-6.
Affiliation(s)
- Ganeshsree Selvachandran
- Department of Actuarial Science and Applied Statistics, Faculty of Business & Management, UCSI University, Jalan Menara Gading, Cheras, 56000 Kuala Lumpur, Malaysia
- Shio Gai Quek
- Department of Actuarial Science and Applied Statistics, Faculty of Business & Management, UCSI University, Jalan Menara Gading, Cheras, 56000 Kuala Lumpur, Malaysia
- Raveendran Paramesran
- Institute of Computer Science and Digital Innovation, UCSI University, Jalan Menara Gading, Cheras, 56000 Kuala Lumpur, Malaysia
- Weiping Ding
- School of Information Science and Technology, Nantong University, Nantong, 226019, People’s Republic of China
- Le Hoang Son
- VNU Information Technology Institute, Vietnam National University, Hanoi, Vietnam
4. Wang Z, Xia H, Yin W, Yang B. An improved generative adversarial network for fault diagnosis of rotating machine in nuclear power plant. Ann Nucl Energy 2023. DOI: 10.1016/j.anucene.2022.109434.
5. EDLDR: An Ensemble Deep Learning Technique for Detection and Classification of Diabetic Retinopathy. Diagnostics (Basel) 2022; 13:124. PMID: 36611416; PMCID: PMC9818466; DOI: 10.3390/diagnostics13010124.
Abstract
Diabetic retinopathy (DR) is an ophthalmological disease that damages the blood vessels of the eye, causing clotting, lesions, or haemorrhage in the light-sensitive region of the retina. People suffering from DR face vision loss due to the formation of exudates or lesions in the retina, so detecting DR is critical to successful treatment. Retinal fundus images can be used to detect the abnormalities leading to DR. In this paper, an automated ensemble deep learning model is proposed for the detection and classification of DR; ensembling enables better predictions and achieves better performance than any single contributing model. Two deep learning models, a modified DenseNet101 and ResNeXt, are ensembled for the detection of diabetic retinopathy. ResNeXt improves over existing ResNet models: it includes a shortcut from the previous block to the next block, stacks layers, and adopts a split-transform-merge strategy, with a cardinality parameter that specifies the number of transformations. The DenseNet model gives better feature-use efficiency, as its dense blocks perform concatenation. The two models are ensembled by normalizing over the classes and then applying a maximum a posteriori rule over the class outputs to compute the final class label. Experiments are conducted on two datasets, APTOS19 and DIARETDB1, with classification carried out for both two classes and five classes. The images are pre-processed using the CLAHE method for histogram equalization. Because the dataset has a high class imbalance and very few images of the non-proliferative type, a GAN-based augmentation technique is used for data augmentation. The results obtained from the proposed method are compared with other existing methods.
The comparison shows that the proposed method has higher accuracy, precision, and recall for both two classes and five classes: an accuracy of 86.08% for five classes and 96.98% for two classes, precision and recall of 0.97 for two classes, and, for five classes, 0.76 and 0.82, respectively.
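The fusion step described above (normalization over the classes, then a maximum over the combined class outputs) can be approximated with a softmax-average sketch; the softmax normalization and simple averaging below are assumptions standing in for the paper's exact procedure:

```python
import math

def softmax(logits):
    """Normalize raw class scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ensemble_predict(per_model_logits):
    """Normalize each model's class scores, average them, return the argmax class."""
    probs = [softmax(l) for l in per_model_logits]
    n_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=avg.__getitem__)

# Two hypothetical models scoring three DR classes; both lean toward class 0.
pred = ensemble_predict([[2.0, 0.5, 0.1], [1.5, 1.4, 0.2]])
```

Averaging normalized outputs lets a confident model outvote an uncertain one without any single model dominating by raw score scale.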
6. Integrating Transfer Learning and Feature Aggregation into Self-defined Convolutional Neural Network for Automated Detection of Lung Cancer Bone Metastasis. J Med Biol Eng 2022. DOI: 10.1007/s40846-022-00770-z.
7. Srinivasan V, Strodthoff N, Ma J, Binder A, Müller KR, Samek W. To pretrain or not? A systematic analysis of the benefits of pretraining in diabetic retinopathy. PLoS One 2022; 17:e0274291. PMID: 36256665; PMCID: PMC9578637; DOI: 10.1371/journal.pone.0274291.
Abstract
There is an increasing number of medical use cases where classification algorithms based on deep neural networks reach performance levels competitive with human medical experts. To alleviate the challenges of small dataset sizes, these systems often rely on pretraining. In this work, we assess the broader implications of these approaches in order to better understand what type of pretraining works reliably in practice (with respect to performance, robustness, learned representations, etc.) and what type of pretraining dataset is best suited to achieving good performance when the target dataset is small. Considering diabetic retinopathy grading as an exemplary use case, we compare the impact of different training procedures, including recently established self-supervised pretraining methods based on contrastive learning. To this end, we investigate quantitative performance, statistics of the learned feature representations, interpretability, and robustness to image distortions. Our results indicate that models initialized from ImageNet pretraining show a significant increase in performance, generalization, and robustness to image distortions, and that self-supervised models show further benefits over supervised models. In particular, self-supervised models initialized from ImageNet pretraining not only report higher performance; they also reduce overfitting to large lesions while better taking into account the minute lesions indicative of disease progression. Understanding the effects of pretraining in a broader sense that goes beyond simple performance comparisons is of crucial importance for the medical imaging community beyond the use case considered in this work.
Affiliation(s)
- Vignesh Srinivasan
- Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
- Nils Strodthoff
- School of Medicine and Health Services, Oldenburg University, Oldenburg, Germany
- Jackie Ma
- Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
- Alexander Binder
- Singapore Institute of Technology, ICT Cluster, Singapore, Singapore
- Department of Informatics, Oslo University, Oslo, Norway
- Klaus-Robert Müller
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Seoul, South Korea
- Max Planck Institute for Informatics, Saarbrücken, Germany
- Wojciech Samek
- Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany
8
|
OLTU B, KARACA BK, ERDEM H, ÖZGÜR A. A systematic review of transfer learning-based approaches for diabetic retinopathy detection. GAZI UNIVERSITY JOURNAL OF SCIENCE 2022. [DOI: 10.35378/gujs.1081546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Cases of diabetes and the related diabetic retinopathy (DR) have been increasing at an alarming rate. Early detection of DR is an important problem, since the disease may cause permanent blindness in its late stages. In the last two decades, many different approaches have been applied to DR detection. A review of the academic literature shows that deep neural networks (DNNs) have become the most preferred approach for DR detection, and among DNN approaches, Convolutional Neural Network (CNN) models are the most used in medical image classification. Designing a new CNN architecture is tedious and time-consuming, and training an enormous number of parameters is also a difficult task. For this reason, instead of training CNNs from scratch, using pre-trained models has been suggested in recent years as a transfer-learning approach. Accordingly, this review focuses on DNN- and transfer-learning-based applications for DR detection, considering 43 publications between 2015 and 2021. The published papers are summarized using 3 figures and 10 tables, giving information about 29 pre-trained CNN models, 13 DR datasets, and standard performance metrics.
Affiliation(s)
- Burcu Oltu
- Başkent University, Faculty of Engineering
9
|
GÜRCAN ÖF, ATICI U, BEYCA ÖF. A Hybrid Deep Learning-Metaheuristic Model for Diagnosis of Diabetic Retinopathy. GAZI UNIVERSITY JOURNAL OF SCIENCE 2022. [DOI: 10.35378/gujs.919572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
The International Diabetes Federation (IDF) reports that diabetes is one of the most rapidly growing illnesses: about 463 million adults between 20 and 79 years of age have diabetes, millions of patients remain undiagnosed, and it is estimated that there will be about 578 million diabetics by 2030 [1]. Diabetes causes various eye diseases. Diabetic retinopathy (DR) is one of them and is among the most common causes of vision loss or blindness worldwide. DR progresses slowly and has few indicators in the early stages, which makes its diagnosis a difficult task. Automated systems promise to support the diagnosis of DR, and many deep learning-based models have been developed for DR classification. This study aims to support ophthalmologists in the diagnosis process and to increase diagnostic performance for DR through a hybrid model. The publicly available Messidor-2 dataset of retinal images was used. In the proposed model, images are first pre-processed, and a deep learning model, InceptionV3, is used for feature extraction in a transfer-learning approach. Next, the number of features in the obtained feature vectors is reduced by feature selection with Simulated Annealing (SA). Lastly, the best representative features are fed to an XGBoost model, which gives an accuracy of 92.26% on a binary classification task. This study shows that a pre-trained ConvNet combined with a metaheuristic algorithm for feature selection gives satisfactory results in the diagnosis of DR.
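The Simulated Annealing feature-selection step can be sketched as a bit-flip search over binary feature masks that sometimes accepts worse solutions to escape local optima; the cooling schedule, iteration budget, and toy scoring function below are placeholders, not the study's settings:

```python
import math
import random

def sa_select(n_features, score, iters=500, t0=1.0, cooling=0.995, seed=0):
    """Simulated-annealing search over binary feature masks (maximizes `score`)."""
    rng = random.Random(seed)
    mask = [rng.randint(0, 1) for _ in range(n_features)]
    cur_s = score(mask)
    best, best_s = mask[:], cur_s
    t = t0
    for _ in range(iters):
        cand = mask[:]
        cand[rng.randrange(n_features)] ^= 1       # flip one feature in/out
        s = score(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if s > cur_s or rng.random() < math.exp((s - cur_s) / t):
            mask, cur_s = cand, s
            if s > best_s:
                best, best_s = cand[:], s
        t *= cooling                               # geometric cooling schedule
    return best

# Toy score: reward keeping the first 4 features, penalize the rest.
toy = lambda m: sum(m[:4]) - 0.2 * sum(m[4:])
best = sa_select(12, toy)
```

In the study's pipeline, `score` would be cross-validated classifier accuracy on the selected InceptionV3 features rather than this toy function.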
Affiliation(s)
- Ömer Faruk Beyca
- İstanbul Technical University, Department of Industrial Engineering
10. Balancing Data through Data Augmentation Improves the Generality of Transfer Learning for Diabetic Retinopathy Classification. Applied Sciences (Basel) 2022. DOI: 10.3390/app12115363.
Abstract
The incidence of diabetes in Mauritius is amongst the highest in the world. Diabetic retinopathy (DR), a complication resulting from the disease, can lead to blindness if not detected early. The aim of this work was to investigate the use of transfer learning and data augmentation for the classification of fundus images into five stages of diabetic retinopathy: No DR, Mild non-proliferative DR, Moderate non-proliferative DR, Severe non-proliferative DR, and Proliferative DR. To this end, deep transfer learning and three pre-trained models, VGG16, ResNet50, and DenseNet169, were used to classify the APTOS dataset. Preliminary experiments resulted in low training and validation accuracies, so the APTOS dataset was augmented while ensuring a balance between the five classes. This balanced dataset was then used to train the three models, and the best models were used to classify a blind Mauritian test set. The ResNet50 model produced the best results of the three and achieved very good accuracies for the five classes. The classification of class-4 (severe) Mauritian fundus images produced some unexpected results, with some images being classified as mild, and therefore needs to be investigated further.
11
|
Okuwobi IP, Ding Z, Wan J, Ding S. Artificial intelligence model driven by transfer learning for image-based medical diagnosis. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2022. [DOI: 10.3233/jifs-220066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Artificial intelligence (AI) systems for clinical decision support are an important tool in clinical routine and have become crucial diagnostic tools, with adequate reliability and interpretability, in disease diagnosis and monitoring. Undoubtedly, these models face the challenge of insufficient training data, which often directly determines model performance; in other words, insufficient data for model training leads to an inefficient model. To overcome this problem, we propose an AI model driven by transfer learning for accurate diagnosis in medical decision support. Our approach addresses the shortage of data by starting from a pretrained model and training the neural network with only a fraction of the new dataset. For this purpose, we utilized the VGG19 network as the backbone to integrate known features with newly learned features for accurate diagnosis and decision making. Integrating the pretrained model speeds up the training phase and improves the performance of the proposed model. Experimental results show that the proposed model is effective and efficient in diagnosing different medical diseases. As such, we anticipate that this diagnostic tool will ultimately aid in facilitating early treatment of these treatable diseases, improving clinical outcomes.
Affiliation(s)
- Idowu Paul Okuwobi
- School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin, China
- Zhixiang Ding
- Department of Ophthalmology, Affiliated Hospital of Guilin Medical University, Guilin, China
- Jifeng Wan
- Department of Ophthalmology, Affiliated Hospital of Guilin Medical University, Guilin, China
- Shuxue Ding
- School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin, China
12. Shaik NS, Cherukuri TK. Hinge attention network: A joint model for diabetic retinopathy severity grading. Appl Intell 2022. DOI: 10.1007/s10489-021-03043-5.
13. Multi-Classification of Chest X-rays for COVID-19 Diagnosis Using Deep Learning Algorithms. Applied Sciences (Basel) 2022. DOI: 10.3390/app12042080.
Abstract
Accurate detection of COVID-19 is of immense importance to help physicians intervene with appropriate treatments. Although RT-PCR is routinely used for COVID-19 detection, it is expensive, takes a long time, and is prone to inaccurate results. Medical-imaging-based detection systems have therefore been explored as an alternative for more accurate diagnosis. In this work, we propose a multi-level diagnostic framework for the accurate detection of COVID-19 from X-ray scans based on transfer learning. The framework consists of three stages: a pre-processing step that removes noise and resizes the images, a deep learning architecture utilizing a pre-trained Xception model for feature extraction from the pre-processed images, and final classification through a softmax layer. The design uses a global average pooling (GAP) layer to avoid over-fitting, and an activation layer is added to reduce the losses. The system is evaluated using different activation functions and thresholds with different optimizers on a benchmark dataset from Kaggle. The proposed model was evaluated on 7395 images spanning three classes (COVID-19, normal, and pneumonia). Additionally, we compared our framework with traditional pre-trained deep learning models and with other studies in the literature. Across various metrics, our framework achieved a high test accuracy of 99.3% with a minimum loss of 0.02, using the LeakyReLU activation function at a threshold of 0.1 with the RMSprop optimizer. We also achieved a sensitivity and specificity of 99% and an F1-score of 99.3% with only 10 epochs and a learning rate of 10^-4.
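Why a GAP layer helps against over-fitting is easy to see from parameter counts: pooling each feature map to one value before the dense head shrinks the head by a factor of H x W. The 7 x 7 x 2048 shape below is an assumed Xception-like output used only for illustration, not the paper's reported dimensions:

```python
import numpy as np

def global_average_pool(fmap):
    """Collapse each H x W feature map to a single value: (H, W, C) -> (C,)."""
    return fmap.mean(axis=(0, 1))

H, W, C, n_classes = 7, 7, 2048, 3
pooled = global_average_pool(np.full((H, W, C), 0.5))

dense_on_flatten = H * W * C * n_classes   # dense-head weights on flattened maps
dense_on_gap = C * n_classes               # dense-head weights after GAP
```

With these assumed shapes, the dense head after GAP has 49 times fewer weights than one on flattened feature maps, which is the regularizing effect the abstract refers to.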
14. Detection of Diabetic Retinopathy (DR) Severity from Fundus Photographs: An Ensemble Approach Using Weighted Average. Arabian Journal for Science and Engineering 2022. DOI: 10.1007/s13369-021-06381-1.
15. Mujeeb Rahman KK, Subashini MM. Identification of Autism in Children Using Static Facial Features and Deep Neural Networks. Brain Sci 2022; 12:94. PMID: 35053837; PMCID: PMC8773918; DOI: 10.3390/brainsci12010094.
Abstract
Autism spectrum disorder (ASD) is a complicated neurological developmental disorder that manifests itself in a variety of ways. The daily lives of children diagnosed with ASD, and of their parents, can be dramatically improved by early diagnosis and appropriate medical intervention. This study investigates the applicability of static features extracted from photographs of autistic children's faces as a biomarker to distinguish them from typically developing children. We used five pre-trained CNN models, MobileNet, Xception, EfficientNetB0, EfficientNetB1, and EfficientNetB2, as feature extractors, and a DNN model as a binary classifier to accurately identify autism in children. The models were trained on a publicly available dataset of face pictures of children labeled as autistic or non-autistic. The Xception model outperformed the others, with an AUC of 96.63%, a sensitivity of 88.46%, and an NPV of 88%. EfficientNetB0 produced a consistent prediction score of 59% for the autistic and non-autistic groups at a 95% confidence level.
Affiliation(s)
- K. K. Mujeeb Rahman
- School of Electronics Engineering, Vellore Institute of Technology, Vellore 632014, India
- Department of Biomedical Engineering, Ajman University, Ajman P.O. Box 346, United Arab Emirates
- M. Monica Subashini
- School of Electrical Engineering, Vellore Institute of Technology, Vellore 632014, India
16. GSV-NET: A Multi-Modal Deep Learning Network for 3D Point Cloud Classification. Applied Sciences (Basel) 2022. DOI: 10.3390/app12010483.
Abstract
Light Detection and Ranging (LiDAR), which uses pulsed laser light to estimate the distance between the sensor and objects, is an effective remote sensing technology with many applications, including autonomous vehicles, robotics, and virtual and augmented reality (VR/AR). With the evolution of LiDAR technology, 3D point cloud classification has become a hot research topic. This research aims to provide a high-performance 3D point cloud classification method compatible with real-world data. More specifically, we introduce a novel framework, GSV-NET, which uses Gaussian Supervectors and enhanced region representation. GSV-NET extracts and combines both global and regional features of the 3D point cloud to further enrich the feature information used for classification. First, we input the Gaussian Supervector description into a 3D wide-inception convolutional neural network (CNN) to define the global feature. Second, we convert regions of the 3D point cloud into a color representation and capture region features with a 2D wide-inception network. These extracted features are then input to a 1D CNN architecture. We evaluate the proposed framework on the ModelNet point cloud dataset, developed by Princeton University (New Jersey, United States), and on the Sydney LiDAR dataset, created by the University of Sydney (Sydney, Australia). Based on our numerical results, our framework achieves higher accuracy than state-of-the-art approaches.
17. Classification of diabetic retinopathy using unlabeled data and knowledge distillation. Artif Intell Med 2021; 121:102176. PMID: 34763798; DOI: 10.1016/j.artmed.2021.102176.
Abstract
Over the last decade, advances in machine learning and artificial intelligence have highlighted their potential as diagnostic tools in the healthcare domain. Despite the widespread availability of medical images, their usefulness is severely hampered by a lack of access to labeled data. For example, while Convolutional Neural Networks (CNNs) have emerged as an essential analytical tool in image processing, their impact is curtailed by training limitations due to insufficient labeled data. Transfer learning enables models developed for one task to be reused for a second task, and knowledge distillation enables transferring knowledge from a pre-trained model to another; however, these techniques have limitations, such as the constraint that the two models be architecturally similar. Knowledge distillation addresses some of the shortcomings of transfer learning by generalizing a complex model to a lighter one, but some parts of the knowledge may not be distilled sufficiently. In this paper, a novel knowledge distillation approach using transfer learning is proposed, which transfers the complete knowledge of a model to a new, smaller one. Unlabeled data are used in an unsupervised manner to transfer the maximum amount of knowledge to the new smaller model. The proposed method can be beneficial in medical image analysis, where labeled data are typically scarce. The approach is evaluated on classifying images for diagnosing diabetic retinopathy on two publicly available datasets, Messidor and EyePACS. Simulation results demonstrate that the approach effectively transfers knowledge from a complex model to a lighter one, and experimental results illustrate that the performance of different small models is improved significantly using unlabeled data and knowledge distillation.
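The core of distilling on unlabeled data is a loss that matches the student's softened predictions to the teacher's, with no ground-truth label needed. A minimal sketch follows; the temperature value and example logits are illustrative, and this is not necessarily the paper's exact objective:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-softened softmax; larger T flattens the distribution."""
    m = max(logits)
    exps = [math.exp((x - m) / T) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=3.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    On unlabeled images this is the entire training signal: the student
    mimics the teacher's soft predictions, so no labels are required."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * T * T

same = distillation_loss([2.0, 0.1], [2.0, 0.1])   # student matches teacher
diff = distillation_loss([0.1, 2.0], [2.0, 0.1])   # student contradicts teacher
```

A student that reproduces the teacher's logits incurs zero loss, while a contradicting student is penalized, which is exactly the gradient signal that drives the smaller model toward the larger one's behavior.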
Collapse
|
18
|
Liu Q, Gao Y, Xu B. Transferable, Deep-Learning-Driven Fast Prediction and Design of Thermal Transport in Mechanically Stretched Graphene Flakes. ACS NANO 2021; 15:16597-16606. [PMID: 34648261 DOI: 10.1021/acsnano.1c06340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Piling graphene sheets into a bulk form is essential for achieving massive applications of graphene in flexible structures and devices, yet the arbitrary shapes, random distributions, and adjacent overlaps of graphene sheets still challenge the prediction of its fundamental properties, which strongly couple mechanical strength with thermal or electronic transport. Here, we present a deep neural network (DNN)-based machine learning (ML) approach that enables the prediction of the thermal conductivity of piled graphene structures with a broad range of geometric configurations and dimensions in response to external mechanical loading. A physics-informed pixel value matrix is developed to capture the key geometric features of piled graphene structures and is incorporated into the DNN to train the ML model with a training data ratio of only 12.5%, yet a prediction accuracy of 94%. The ML model is further extended, with knowledge transferred from the primitive training data sets, to predict the thermal transport of piled graphene in a custom data set. Extensive demonstrations in search of piled graphene structures with desirable thermal conductivity, and of their response to mechanical loading, illustrate the capability and accuracy of the DNN-ML model for establishing a mechanically adaptive structure-responsive thermal property paradigm in piled graphene. This work lays a foundation for quantitatively evaluating the thermal conductivity of piled graphene in response to mechanical loading through an ML model and offers a rational route for exploring mechanically tunable thermal properties of nanomaterial-based bulk forms, potentially useful in the design of flexible thermal structures and devices with controllable thermal management performance.
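The abstract does not specify how the "physics-informed pixel value matrix" is built; as a rough illustration of the general idea, one might rasterize flake geometries into an occupancy grid in which adjacent overlaps accumulate. Everything below (the grid size, the axis-aligned rectangular flakes, the function name) is an assumption for illustration, not the authors' encoding.

```python
import numpy as np

def pixel_value_matrix(flakes, grid=(32, 32)):
    # Rasterize axis-aligned rectangular flakes (x0, y0, x1, y1 in [0, 1])
    # into an occupancy grid; overlapping flakes accumulate, giving a
    # crude image-like encoding of adjacent graphene overlaps that a DNN
    # could take as input.
    m = np.zeros(grid)
    for x0, y0, x1, y1 in flakes:
        i0 = int(y0 * grid[0])
        i1 = max(int(y1 * grid[0]), i0 + 1)
        j0 = int(x0 * grid[1])
        j1 = max(int(x1 * grid[1]), j0 + 1)
        m[i0:i1, j0:j1] += 1.0
    return m
```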
Collapse
Affiliation(s)
- Qingchang Liu
- Department of Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
| | - Yuan Gao
- Department of Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
| | - Baoxing Xu
- Department of Mechanical and Aerospace Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
| |
Collapse
|
19
|
Abbas Q, Qureshi I, Ibrahim MEA. An Automatic Detection and Classification System of Five Stages for Hypertensive Retinopathy Using Semantic and Instance Segmentation in DenseNet Architecture. SENSORS (BASEL, SWITZERLAND) 2021; 21:6936. [PMID: 34696149 PMCID: PMC8538561 DOI: 10.3390/s21206936] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/20/2021] [Revised: 10/13/2021] [Accepted: 10/15/2021] [Indexed: 12/23/2022]
Abstract
The stage and duration of hypertension are connected to the occurrence of Hypertensive Retinopathy (HR) eye disease. Currently, a few computerized systems have been developed that recognize HR using only two stages. It is difficult to define specialized features to recognize five grades of HR. In addition, deep features have been used in the past, but the classification accuracy is not up to the mark. In this research, a new hypertensive retinopathy (HYPER-RETINO) framework is developed to grade HR into five grades. The HYPER-RETINO system is implemented based on pre-trained HR-related lesions. To develop this HYPER-RETINO system, several steps are implemented, such as preprocessing, detection of HR-related lesions by semantic and instance-based segmentation, and a DenseNet architecture to classify the stages of HR. Overall, the HYPER-RETINO system determines the local regions within input retinal fundus images to recognize five grades of HR. On average, a 10-fold cross-validation test obtained sensitivity (SE) of 90.5%, specificity (SP) of 91.5%, accuracy (ACC) of 92.6%, precision (PR) of 91.7%, Matthews correlation coefficient (MCC) of 61%, F1-score of 92%, and area under the curve (AUC) of 0.915 on 1400 HR images. Thus, the applicability of the HYPER-RETINO method for reliably diagnosing stages of HR is verified by these experimental findings.
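The reported scores follow standard definitions and can be sketched from binary confusion counts; the paper's multi-grade averaging over five HR stages is not reproduced here.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    # Standard definitions of the scores reported for HYPER-RETINO,
    # computed from a binary confusion matrix.
    se = tp / (tp + fn)                       # sensitivity (recall)
    sp = tn / (tn + fp)                       # specificity
    acc = (tp + tn) / (tp + fp + tn + fn)     # accuracy
    pr = tp / (tp + fp)                       # precision
    f1 = 2 * pr * se / (pr + se)              # F1-score
    mcc = (tp * tn - fp * fn) / math.sqrt(    # Matthews corr. coefficient
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return dict(SE=se, SP=sp, ACC=acc, PR=pr, F1=f1, MCC=mcc)
```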
Collapse
Affiliation(s)
- Qaisar Abbas
- College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia;
| | - Imran Qureshi
- Department of Computer Software Engineering, Military College of Signals, National University of Sciences and Technology (MCS-NUST), Islamabad 44000, Pakistan;
| | - Mostafa E. A. Ibrahim
- College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia;
- Department of Electrical Engineering, Benha Faculty of Engineering, Benha University, Qalubia, Benha 13518, Egypt
| |
Collapse
|
20
|
A Fusion-Based Hybrid-Feature Approach for Recognition of Unconstrained Offline Handwritten Hindi Characters. FUTURE INTERNET 2021. [DOI: 10.3390/fi13090239] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open
Abstract
Hindi is the official language of India and is used by a large population for several public services like postal, bank, judiciary, and public surveys. Efficient management of these services needs language-based automation. The proposed model addresses the problem of handwritten Hindi character recognition using a machine learning approach. The pre-trained DCNN models InceptionV3-Net, VGG19-Net, and ResNet50 were used for the extraction of salient features from the character images. A novel fusion approach is adopted in the proposed work: the DCNN-based features are fused with handcrafted features obtained from a biorthogonal discrete wavelet transform. The feature size was reduced by the Principal Component Analysis method. The hybrid features were examined with popular classifiers, namely the Multi-Layer Perceptron (MLP) and the Support Vector Machine (SVM). The recognition cost was reduced by 84.37%. The model achieved significant precision, recall, and F1-measure scores of 98.78%, 98.67%, and 98.69%, respectively, with an overall recognition accuracy of 98.73%.
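The fusion-plus-reduction step can be sketched as follows, assuming the deep and wavelet descriptors arrive as row-aligned matrices; PCA is done here via SVD in plain NumPy, which is a stand-in for the authors' exact pipeline, not a reproduction of it.

```python
import numpy as np

def fuse_and_reduce(deep_feats, wavelet_feats, k=16):
    # Concatenate CNN-derived and handcrafted descriptors per sample
    # (one row = one character image), then project onto the top-k
    # principal components (PCA via SVD) to shrink the hybrid feature.
    X = np.hstack([deep_feats, wavelet_feats])
    Xc = X - X.mean(axis=0)                    # center before PCA
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                       # (n_samples, k) features
```

The reduced matrix would then be fed to an MLP or SVM classifier.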
Collapse
|
21
|
Differentiation of River Sediments Fractions in UAV Aerial Images by Convolution Neural Network. REMOTE SENSING 2021. [DOI: 10.3390/rs13163188] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
Riverbed material has multiple functions in river ecosystems, such as habitats, feeding grounds, spawning grounds, and shelters for aquatic organisms, and the particle size of riverbed material reflects the tractive force of the channel flow. Therefore, regular surveys of riverbed material are conducted for environmental protection and river flood control projects. The field method is the most conventional riverbed material survey; however, conventional surveys of the particle size of riverbed material require much labor, time, and cost to collect material on site. Furthermore, their spatial representativeness is also a problem because of the limited survey area relative to a wide riverbank. As a solution to these problems, in this study we tried automatic classification of riverbed conditions using aerial photography with an unmanned aerial vehicle (UAV) and image recognition with artificial intelligence (AI) to improve survey efficiency. Because AI is used for image processing, a large number of images can be handled regardless of whether they contain fine or coarse particles. We classified aerial riverbed images that differ in particle size characteristics with a convolutional neural network (CNN). GoogLeNet, AlexNet, VGG-16, and ResNet, common pre-trained networks, were retrained to perform the new task with 70 riverbed images using transfer learning. Among the networks tested, GoogLeNet showed the best performance for this study, with an overall image classification accuracy of 95.4%. On the other hand, shadows of the gravel appeared to cause classification errors. The network retrained with images taken in a uniform temporal period gives higher accuracy for classifying images taken in the same period as the training data. The results suggest the potential of evaluating riverbed materials using aerial photography with a UAV and image recognition with a CNN.
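Retraining a pre-trained network on only 70 images works because most parameters stay frozen and just a new head is fit. The sketch below shows that idea in its simplest form, a ridge-regularized linear probe on frozen features with synthetic data; the actual study retrains full CNNs such as GoogLeNet, which this does not reproduce.

```python
import numpy as np

def linear_probe(features, labels, n_classes, reg=1e-3):
    # Fit only a new final layer (ridge-regularized least squares) on
    # frozen features -- the essence of transfer learning with few images.
    Y = np.eye(n_classes)[labels]                            # one-hot targets
    X = np.hstack([features, np.ones((len(features), 1))])   # bias column
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ Y)

def predict(W, features):
    X = np.hstack([features, np.ones((len(features), 1))])
    return (X @ W).argmax(axis=1)
```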
Collapse
|
22
|
Iglesias LL, Bellón PS, Del Barrio AP, Fernández-Miranda PM, González DR, Vega JA, Mandly AAG, Blanco JAP. A primer on deep learning and convolutional neural networks for clinicians. Insights Imaging 2021; 12:117. [PMID: 34383173 PMCID: PMC8360246 DOI: 10.1186/s13244-021-01052-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2021] [Accepted: 07/01/2021] [Indexed: 11/25/2022] Open
Abstract
Deep learning is nowadays at the forefront of artificial intelligence. More precisely, the use of convolutional neural networks has drastically improved the learning capabilities of computer vision applications, which can directly consider raw data without any prior feature extraction. Advanced methods in the machine learning field, such as adaptive momentum algorithms or dropout regularization, have dramatically improved the predictive ability of convolutional neural networks, outperforming that of conventional fully connected neural networks. This work summarizes, in an intentionally didactic way, the main aspects of these cutting-edge techniques from a medical imaging perspective.
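As a concrete example of the "adaptive momentum" methods the primer covers, here is a NumPy sketch of the Adam update applied to a toy quadratic; the hyperparameter values are the commonly used defaults, not anything prescribed by this paper.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    # One update of the adaptive-momentum (Adam) optimizer.
    m = b1 * m + (1 - b1) * g          # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment (scale) estimate
    m_hat = m / (1 - b1 ** t)          # bias corrections for zero init
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(w) = ||w||^2 starting from w = [3, -2]; the gradient is 2w.
w, m, v = np.array([3.0, -2.0]), 0.0, 0.0
for t in range(1, 201):
    w, m, v = adam_step(w, 2 * w, m, v, t)
```

After 200 steps the iterate sits close to the minimum at the origin, because each coordinate's step size is rescaled by its own gradient history.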
Collapse
Affiliation(s)
- Lara Lloret Iglesias
- Advanced Computation and e-Science, Instituto de Física de Cantabria - CSIC, Santander, Spain.
| | - Pablo Sanz Bellón
- Servicio de Radiodiagnóstico, Hospital Universitario Marqués de Valdecilla, Santander, Spain
- Instituto de Investigación Sanitaria Valdecilla (IDIVAL), Santander, Spain
| | - Amaia Pérez Del Barrio
- Servicio de Radiodiagnóstico, Hospital Universitario Marqués de Valdecilla, Santander, Spain
- Instituto de Investigación Sanitaria Valdecilla (IDIVAL), Santander, Spain
| | - Pablo Menéndez Fernández-Miranda
- Servicio de Radiodiagnóstico, Hospital Universitario Marqués de Valdecilla, Santander, Spain
- Instituto de Investigación Sanitaria Valdecilla (IDIVAL), Santander, Spain
| | | | - José A Vega
- Departamento de Morfología y Biología Celular, Universidad de Oviedo, Oviedo, Spain
- Facultad de Ciencias de la Salud, Universidad Autónoma de Chile, Santiago de Chile, Chile
| | - Andrés A González Mandly
- Servicio de Radiodiagnóstico, Hospital Universitario Marqués de Valdecilla, Santander, Spain
- Instituto de Investigación Sanitaria Valdecilla (IDIVAL), Santander, Spain
| | - José A Parra Blanco
- Servicio de Radiodiagnóstico, Hospital Universitario Marqués de Valdecilla, Santander, Spain
- Instituto de Investigación Sanitaria Valdecilla (IDIVAL), Santander, Spain
| |
Collapse
|
23
|
Bhardwaj C, Jain S, Sood M. Deep Learning-Based Diabetic Retinopathy Severity Grading System Employing Quadrant Ensemble Model. J Digit Imaging 2021; 34:440-457. [PMID: 33686525 DOI: 10.1007/s10278-021-00418-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2020] [Revised: 12/23/2020] [Accepted: 01/03/2021] [Indexed: 12/23/2022] Open
Abstract
Diabetic retinopathy involves the deterioration of retinal blood vessels, leading to serious complications affecting the eyes. Automated DR diagnosis frameworks are critically important for the early identification and detection of these eye-related problems, helping ophthalmic experts by providing a second opinion for effectual treatment. Deep learning techniques have evolved as an improvement over conventional approaches, which depend on handcrafted feature extraction. To address the issue of proficient DR discrimination, the authors have proposed a quadrant ensemble automated DR grading approach implementing the InceptionResnet-V2 deep neural network framework. The presented model incorporates histogram equalization, optic disc localization, and quadrant cropping, along with a data augmentation step, to improve network performance. A superior accuracy of 93.33% is observed for the proposed framework, and a significant reduction of 0.325 is noticed in the cross-entropy loss function for the MESSIDOR benchmark dataset, while its validation on the newer IDRiD dataset establishes its generalization ability. An accuracy improvement of 13.58% is observed when the proposed QEIRV-2 model is compared with the classical Inception-V3 CNN model. To justify the viability of the proposed framework, its performance is compared with existing state-of-the-art approaches, and a 25.23% accuracy improvement is observed.
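Two of the preprocessing steps, histogram equalization and quadrant cropping, can be sketched in NumPy as follows. The grid values and the assumption that quadrants are cut around the optic-disc center are illustrative; this is not the authors' code.

```python
import numpy as np

def hist_equalize(img):
    # Map intensities through the normalized cumulative histogram so the
    # output spreads over the full 0-255 range (contrast enhancement).
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())
    return (cdf * 255).astype(np.uint8)[img]

def quadrants(img, cy, cx):
    # Crop four quadrants around the localized optic-disc center (cy, cx);
    # each crop is graded separately and the results are ensembled.
    return [img[:cy, :cx], img[:cy, cx:], img[cy:, :cx], img[cy:, cx:]]
```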
Collapse
Affiliation(s)
- Charu Bhardwaj
- Department of Electronics and Communication Engineering, JUIT Waknaghat, Solan, HP, India.
| | - Shruti Jain
- Department of Electronics and Communication Engineering, JUIT Waknaghat, Solan, HP, India
| | | |
Collapse
|
24
|
Abstract
This study is about the manufacturing of a personified automatic robotic lawn mower with image recognition. In the system structure, the platform above the crawler tracks is combined with the lawn mower, steering motor, slide rail, and webcam to achieve the purpose of personification. Crawler tracks with a strong grip and a good ability to adapt to terrain are selected as the moving vehicle to simulate human feet. In addition, a lawn mower mechanism is designed to simulate the left and right swing of human mowing to promote efficiency and innovation, and a webcam replaces human eyes to identify obstacles. A human-machine interface is added so that, through remote operation from a mobile phone, users can choose a slow mode, inching mode, or obstacle avoidance mode on the human-machine interface. When the lengths of both sides of the rectangular area are input to the program, the automatic robotic lawn mower completes the instruction according to the specified path. A Digital Signal Processor (DSP) chip, the TMS320F2808, is used as the core controller, and a Raspberry Pi is used for image recognition and human-machine interface design. This robot can reduce labor costs and improve mowing efficiency through remote control. In addition to use as an automatic mower on farms, this study's concept can also be applied to lawn maintenance of golf courses and school playgrounds.
Collapse
|
25
|
Gu YH, Yin H, Jin D, Park JH, Yoo SJ. Image-Based Hot Pepper Disease and Pest Diagnosis Using Transfer Learning and Fine-Tuning. FRONTIERS IN PLANT SCIENCE 2021; 12:724487. [PMID: 34975933 PMCID: PMC8716927 DOI: 10.3389/fpls.2021.724487] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/13/2021] [Accepted: 11/19/2021] [Indexed: 05/20/2023]
Abstract
Past studies of plant disease and pest recognition used classification methods that presented a single recognition result to the user. Unfortunately, incorrect recognition results may be output, which may lead to further crop damage. To address this issue, there is a need for a system that suggests several candidate results and allows the user to make the final decision. In this study, we propose a method for diagnosing plant diseases and identifying pests using deep features based on transfer learning. To extract deep features, we employ pre-trained VGG and ResNet 50 architectures based on the ImageNet dataset, and output disease and pest images similar to a query image via a k-nearest-neighbor algorithm. In this study, we use a total of 23,868 images of 19 types of hot-pepper diseases and pests, for which the proposed model achieves accuracies of 96.02% and 99.61%, respectively. We also measure the effects of fine-tuning and distance metrics. The results show that the use of fine-tuning-based deep features increases accuracy by approximately 0.7-7.38%, and the Bray-Curtis distance achieves approximately 0.65-1.51% higher accuracy than the Euclidean distance.
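The retrieval step, ranking gallery images by Bray-Curtis dissimilarity of their deep features and returning the top candidates for the user to choose among, can be sketched in NumPy; the feature values and function names below are illustrative, not the authors' pipeline.

```python
import numpy as np

def bray_curtis(a, b):
    # Bray-Curtis dissimilarity between two non-negative feature vectors:
    # sum of absolute differences normalized by the total magnitude.
    return np.abs(a - b).sum() / (np.abs(a) + np.abs(b)).sum()

def top_k(query, gallery, k=3):
    # Indices of the k gallery feature vectors closest to the query,
    # i.e. the candidate disease/pest diagnoses shown to the user.
    d = np.array([bray_curtis(query, g) for g in gallery])
    return np.argsort(d)[:k]
```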
Collapse
Affiliation(s)
- Yeong Hyeon Gu
- Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
| | - Helin Yin
- Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
| | - Dong Jin
- Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
| | - Jong-Han Park
- Horticultural and Herbal Crop Environment Division, National Institute of Horticultural and Herbal Science, Rural Development Administration, Wanju, South Korea
| | - Seong Joon Yoo
- Department of Computer Science and Engineering, Sejong University, Seoul, South Korea
- *Correspondence: Seong Joon Yoo,
| |
Collapse
|
26
|
Two-Stage Mask-RCNN Approach for Detecting and Segmenting the Optic Nerve Head, Optic Disc, and Optic Cup in Fundus Images. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10113833] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
In this paper, we propose a method for localizing the optic nerve head and segmenting the optic disc/cup in retinal fundus images. The approach is based on a simple two-stage Mask-RCNN, in contrast to the sophisticated methods that represent the state of the art in the literature. In the first stage, we detect and crop around the optic nerve head, then feed the cropped image as input to the second stage. The second-stage network is trained using a weighted loss to produce the final segmentation. To further improve the detection in the first stage, we propose a new fine-tuning strategy that combines the cropping output of the first stage with the original training image to train a new detection network, using different scales for the region proposal network anchors. We evaluate the method on the Retinal Fundus Images for Glaucoma Analysis (REFUGE), Magrabi, and MESSIDOR datasets. We used the REFUGE training subset to train the models in the proposed method. Our method achieved a 0.0430 mean absolute error in the vertical cup-to-disc ratio (MAE vCDR) on the REFUGE test set, compared to 0.0414 obtained using complex methods with multiple ensemble networks. The models trained with the proposed method transfer well to datasets outside REFUGE, achieving MAE vCDR of 0.0785 and 0.077 on the MESSIDOR and Magrabi datasets, respectively, without being retrained. In terms of detection accuracy, the proposed fine-tuning strategy improved the detection rate from 96.7% to 98.04% on MESSIDOR and from 93.6% to 100% on Magrabi, compared to the detection rates reported in the literature.
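The vertical cup-to-disc ratio used for evaluation can be computed directly from the two segmentation masks; below is a sketch under the assumption of binary masks, not the authors' evaluation code.

```python
import numpy as np

def vertical_cdr(disc_mask, cup_mask):
    # Vertical cup-to-disc ratio: the ratio of the vertical extents
    # (in pixel rows) of the segmented cup and disc masks.
    def height(m):
        rows = np.where(m.any(axis=1))[0]
        return rows.max() - rows.min() + 1
    return height(cup_mask) / height(disc_mask)
```

The MAE vCDR reported above would then be the mean absolute difference between this value and the ground-truth ratio over a test set.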
Collapse
|