1. Rajinikanth V, Biju R, Mittal N, Mittal V, Askar S, Abouhawwash M. COVID-19 detection in lung CT slices using Brownian-butterfly-algorithm optimized lightweight deep features. Heliyon 2024; 10:e27509. PMID: 38468955; PMCID: PMC10926136; DOI: 10.1016/j.heliyon.2024.e27509.
Abstract
Several deep-learning-assisted disease assessment schemes (DAS) have been proposed to enhance the detection of COVID-19, a critical medical emergency, through the analysis of clinical data. Lung imaging, particularly CT, plays a pivotal role in identifying and assessing the severity of COVID-19 infection, and existing automated methods leveraging deep learning significantly reduce the associated diagnostic burden. This research aims to develop a simple DAS for COVID-19 detection by applying pre-trained lightweight deep learning methods (LDMs) to lung CT slices; the use of LDMs yields a less complex yet highly accurate detection system. The key stages of the developed DAS are image collection and initial processing using Shannon's thresholding, deep-feature mining supported by the LDMs, feature optimization using the Brownian Butterfly Algorithm (BBA), and binary classification with three-fold cross-validation. The performance evaluation covers individual, fused, and ensemble features. The investigation reveals that the developed DAS achieves a detection accuracy of 93.80% with individual features, 96% with fused features, and 99.10% with ensemble features, affirming the effectiveness of the proposed scheme on the chosen lung CT database.
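The pre-processing stage relies on Shannon's thresholding. A minimal numpy sketch, assuming the common entropy-maximising (Kapur-style) reading of that step; the helper name `shannon_threshold` is hypothetical, not the authors' code:

```python
import numpy as np

def shannon_threshold(image, bins=256):
    """Pick the threshold maximising the summed Shannon entropy of the
    background and foreground histograms (Kapur's criterion)."""
    hist, _ = np.histogram(image, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, bins - 1):
        p0, p1 = p[:t].sum(), p[t:].sum()
        if p0 == 0 or p1 == 0:
            continue
        q0, q1 = p[:t] / p0, p[t:] / p1          # per-class distributions
        h0 = -np.sum(q0[q0 > 0] * np.log(q0[q0 > 0]))  # background entropy
        h1 = -np.sum(q1[q1 > 0] * np.log(q1[q1 > 0]))  # foreground entropy
        if h0 + h1 > best_h:
            best_h, best_t = h0 + h1, t
    return best_t
```

A binary lung mask then follows from `image > shannon_threshold(image)`.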
Affiliation(s)
- Venkatesan Rajinikanth: Department of Computer Science and Engineering, Division of Research and Innovation, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai 602105, Tamil Nadu, India
- Roshima Biju: Department of Computer Science Engineering, Parul University, Vadodara 391760, Gujarat, India
- Nitin Mittal: Skill Faculty of Engineering and Technology, Shri Vishwakarma Skill University, Palwal 121102, Haryana, India
- Vikas Mittal: Department of Electronics and Communication Engineering, Chandigarh University, Mohali 140413, India
- S.S. Askar: Department of Statistics and Operations Research, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
- Mohamed Abouhawwash: Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
2. Zhu Y, Chen S, Yin H, Han X, Xu M, Wang W, Zhang Y, Feng X, Liu Y. Classification of oolong tea varieties based on computer vision and convolutional neural networks. J Sci Food Agric 2024; 104:1630-1637. PMID: 37842747; DOI: 10.1002/jsfa.13049.
Abstract
BACKGROUND: In the contemporary food industry, accurate and rapid differentiation of oolong tea varieties holds paramount importance for traceability and quality control, yet remains a formidable challenge. This study addresses this gap by employing machine learning algorithms, namely support vector machines (SVMs) and convolutional neural networks (CNNs), alongside computer vision techniques for the automated classification of oolong tea leaves based on visual attributes. RESULTS: An array of 13 distinct characteristics, encompassing color and texture, was identified from five unique oolong tea varieties. To fortify the robustness of the predictive models, data augmentation and image cropping were employed. A comparative analysis of the SVM- and CNN-based models revealed that the ResNet50 model achieved a Top-1 accuracy exceeding 93%, substantiating the efficacy of the methodology for rapid and precise oolong tea classification. CONCLUSION: The study shows that integrating computer vision with machine learning algorithms constitutes a promising, non-invasive approach for the quick and accurate categorization of oolong tea varieties. The findings have significant ramifications for process monitoring, quality assurance, authenticity validation, and adulteration detection within the tea industry. © 2023 Society of Chemical Industry.
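Colour descriptors of the kind fed to the SVM baseline can be sketched as below. This is an illustrative stand-in (per-channel mean/std plus a nearest-centroid rule) for the paper's 13 colour/texture features and SVM classifier, not the authors' code:

```python
import numpy as np

def colour_features(img):
    """Per-channel mean and standard deviation of an RGB image (H, W, 3):
    a tiny stand-in for richer colour/texture descriptors."""
    return np.concatenate([img.mean(axis=(0, 1)), img.std(axis=(0, 1))])

def nearest_centroid(train_X, train_y, x):
    """Classify x by the closest class centroid in feature space."""
    labels = np.unique(train_y)
    cents = np.array([train_X[train_y == c].mean(axis=0) for c in labels])
    return labels[np.argmin(np.linalg.norm(cents - x, axis=1))]
```

In practice the features would feed an SVM or CNN as in the study; nearest-centroid just keeps the sketch self-contained.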
Affiliation(s)
- Yiwen Zhu: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Siyuan Chen: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Hanzhe Yin: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Xihao Han: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Menghan Xu: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Wenli Wang: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Yin Zhang: Key Laboratory of Meat Processing of Sichuan, Chengdu University, Chengdu, China
- Xiaoxiao Feng: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
- Yuan Liu: Department of Food Science & Technology, School of Agriculture & Biology, Shanghai Jiao Tong University, Shanghai, China
3. Alharthi AG, Alzahrani SM. Multi-Slice Generation sMRI and fMRI for Autism Spectrum Disorder Diagnosis Using 3D-CNN and Vision Transformers. Brain Sci 2023; 13:1578. PMID: 38002538; PMCID: PMC10670036; DOI: 10.3390/brainsci13111578.
Abstract
Researchers have explored various potential indicators of ASD, including changes in brain structure and activity, genetics, and immune system abnormalities, but no definitive indicator has yet been found. This study therefore investigates ASD indicators using two types of magnetic resonance imaging (MRI), structural (sMRI) and functional (fMRI), while addressing limited data availability. Transfer learning is valuable when data are scarce, as it reuses knowledge from a model pre-trained in a data-rich domain. The study applied four pre-trained architectures, namely ConvNeXt, MobileNet, Swin, and ViT, to sMRI, and also investigated a 3D-CNN model with both sMRI and fMRI. The experiments covered different methods of generating data and extracting slices from raw 3D sMRI and 4D fMRI scans along the axial, coronal, and sagittal brain planes. The methods were evaluated on the NYU dataset from the ABIDE repository, classifying ASD subjects against typical controls, and compared with several baselines, including studies using VGG and ResNet transfer learning models. The results validate the proposed multi-slice generation with the 3D-CNN and transfer learning methods, which achieved state-of-the-art performance. In particular, the 50 middle fMRI slices with the 3D-CNN showed strong promise for ASD classification, reaching a maximum accuracy of 0.8710 and an F1-score of 0.8261 when the 4D images were averaged over time across the axial, coronal, and sagittal planes. Using all fMRI slices except those at the beginnings and ends of the brain views helped to reduce irrelevant information and achieved 0.8387 accuracy and a 0.7727 F1-score. Lastly, among the transfer learning models, ConvNeXt achieved the best results when using the 50 middle sMRI slices along the axial, coronal, and sagittal planes.
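The 50-middle-slice extraction from time-averaged 4D scans can be sketched as below; the array layout (X, Y, Z, T) and the helper name `middle_slices` are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def middle_slices(vol4d, n=50, plane="axial"):
    """Average a 4D scan (X, Y, Z, T) over time, then keep the n middle
    slices along the chosen brain plane."""
    mean3d = vol4d.mean(axis=3)                       # collapse time axis
    axis = {"sagittal": 0, "coronal": 1, "axial": 2}[plane]
    size = mean3d.shape[axis]
    start = max((size - n) // 2, 0)
    idx = np.arange(start, min(start + n, size))
    return np.take(mean3d, idx, axis=axis)
```

Each returned slice stack can then be rendered as 2D images for the transformer or CNN input pipeline.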
Affiliation(s)
- Salha M. Alzahrani: Department of Computer Science, College of Computers and Information Technology, Taif University, Taif 21944, Saudi Arabia
4. Shim J, Koo J, Park Y. A Methodology of Condition Monitoring System Utilizing Supervised and Semi-Supervised Learning in Railway. Sensors (Basel) 2023; 23:9075. PMID: 38005464; PMCID: PMC10674533; DOI: 10.3390/s23229075.
Abstract
This paper investigates anomaly detection of wheel flats. In the railway sector, testing with actual vehicles is challenging because of passenger safety concerns and maintenance constraints, so dynamics software was used to generate the data, and the short-time Fourier transform (STFT) was then applied to create spectrogram images. On railway vehicles, control, monitoring, and communication run through the train control and management system (TCMS), which lacks devices such as GPUs and has limited memory, making complex analysis and data processing difficult. The relatively lightweight LeNet-5, ResNet-20, and MobileNet-V3 models were therefore selected for the deep learning experiments, with the LeNet-5 and MobileNet-V3 architectures modified from their basic forms. Because railway vehicles undergo preventive maintenance, fault data are difficult to obtain, so semi-supervised learning was also performed, following the Deep One-Class Classification approach. The modified LeNet-5 and MobileNet-V3 models achieved approximately 97% and 96% accuracy, respectively, with LeNet-5 training 12 min faster than MobileNet-V3. The semi-supervised learning results reached approximately 94% accuracy, a significant outcome given the railway maintenance environment. In conclusion, considering the maintenance environment and device specifications, the relatively simple and lightweight LeNet-5 model can be used effectively with small images.
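A spectrogram of the kind fed to the CNNs can be produced with a plain numpy STFT; the window and hop sizes here are illustrative defaults, not the paper's settings:

```python
import numpy as np

def stft_spectrogram(signal, win=256, hop=128):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform.
    Returns an array of shape (n_frames, win // 2 + 1)."""
    window = np.hanning(win)
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop : i * hop + win] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))
```

The resulting 2D magnitude array is what gets saved as a spectrogram image for the classifier.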
Affiliation(s)
- Jaeseok Shim: Complex Research Center for Materials & Components of Railway, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
- Jeongseo Koo: Department of Railway Safety Engineering, Seoul National University of Science and Technology, Seoul 01811, Republic of Korea
- Yongwoon Park: A2Mind, 213 Toegye-ro, Jung-gu, Seoul 04557, Republic of Korea
5. Islam MM, Alam KMR, Uddin J, Ashraf I, Samad MA. Benign and Malignant Oral Lesion Image Classification Using Fine-Tuned Transfer Learning Techniques. Diagnostics (Basel) 2023; 13:3360. PMID: 37958257; PMCID: PMC10650377; DOI: 10.3390/diagnostics13213360.
Abstract
Oral lesions are a prevalent manifestation of oral disease, and their timely identification is imperative for effective intervention. Deep learning algorithms have shown great potential for automated lesion detection. The primary aim of this study was to identify oral lesions using deep learning-based image classification, assessing three models: VGG19, DeiT, and MobileNet. To evaluate the accuracy and reliability of the models, we employed a dataset of oral images spanning two distinct categories: benign and malignant lesions. The experimental findings indicate that VGG19 and MobileNet attained an accuracy of 100%, while DeiT achieved a slightly lower 98.73%. These results indicate that deep learning image classifiers are highly effective at detecting oral lesions, with the VGG19 and MobileNet models notably well suited to the task.
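Fine-tuned transfer learning of this kind typically freezes the convolutional base and trains only a small classification head. A numpy sketch of such a head (logistic regression on frozen features) is shown below; it is purely illustrative of the fine-tuning step, with hypothetical helper names:

```python
import numpy as np

def train_head(feats, labels, lr=0.5, epochs=200):
    """Train a logistic-regression head on frozen backbone features --
    the only part updated when the convolutional base is kept frozen."""
    w, b = np.zeros(feats.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))   # sigmoid probabilities
        grad = p - labels                            # dLoss/dlogits (cross-entropy)
        w -= lr * feats.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

def predict(feats, w, b):
    return (feats @ w + b > 0).astype(int)
```

In a real pipeline `feats` would come from the penultimate layer of VGG19, DeiT, or MobileNet.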
Affiliation(s)
- Md. Monirul Islam: Department of Software Engineering, Daffodil International University, Daffodil Smart City (DSC), Birulia, Savar, Dhaka 1216, Bangladesh
- K. M. Rafiqul Alam: Department of Statistics, Jahangirnagar University, Dhaka 1342, Bangladesh
- Jia Uddin: AI and Big Data Department, Endicott College, Woosong University, Daejeon 34606, Republic of Korea
- Imran Ashraf: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si 38541, Republic of Korea
- Md Abdus Samad: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si 38541, Republic of Korea
6. Saidani O, Aljrees T, Umer M, Alturki N, Alshardan A, Khan SW, Alsubai S, Ashraf I. Enhancing Prediction of Brain Tumor Classification Using Images and Numerical Data Features. Diagnostics (Basel) 2023; 13:2544. PMID: 37568907; PMCID: PMC10417332; DOI: 10.3390/diagnostics13152544.
Abstract
Brain tumors, along with other diseases that harm the neurological system, are a significant contributor to global mortality, and early diagnosis plays a crucial role in effective treatment. To distinguish individuals with tumors from those without, this study combines image-based and numerical features. In the first phase, the image dataset is enhanced and a UNet transfer-learning-based model classifies patients as having a tumor or being normal. In the second phase, 13 features are used in conjunction with a voting classifier, which incorporates features extracted from deep convolutional layers and combines stochastic gradient descent with logistic regression to achieve better classification results. Both proposed models reach an accuracy score of 0.99, and comparison with other supervised learning algorithms and state-of-the-art models validates their superior performance.
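The voting step can be illustrated with the standard soft-voting rule, which averages class-probability estimates from the constituent classifiers (here, e.g., an SGD-trained model and logistic regression). The sketch assumes probability outputs are already available:

```python
import numpy as np

def soft_vote(prob_lists):
    """Soft voting: average each classifier's (n_samples, n_classes)
    probability matrix, then pick the argmax class per sample."""
    avg = np.mean(prob_lists, axis=0)
    return avg.argmax(axis=1)
```

Hard voting (majority over predicted labels) is the other common variant; the abstract does not say which the authors used.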
Affiliation(s)
- Oumaima Saidani: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Turki Aljrees: College of Computer Science and Engineering, University of Hafr Al-Batin, Hafar Al-Batin 39524, Saudi Arabia
- Muhammad Umer: Department of Computer Science & Information Technology, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
- Nazik Alturki: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Amal Alshardan: Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia
- Sardar Waqar Khan: Department of Computer Science & Information Technology, The University of Lahore, Lahore 54000, Pakistan
- Shtwai Alsubai: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
- Imran Ashraf: Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
7. Khader A, Alquran H. Automated Prediction of Osteoarthritis Level in Human Osteochondral Tissue Using Histopathological Images. Bioengineering (Basel) 2023; 10:764. PMID: 37508791; PMCID: PMC10376879; DOI: 10.3390/bioengineering10070764.
Abstract
Osteoarthritis (OA) is the most common arthritis and the leading cause of lower-extremity disability in older adults. Understanding OA progression is important for developing patient-specific therapeutic techniques at the early stage of OA rather than at the end stage. Histopathology scoring systems are usually used to evaluate OA progress and the mechanisms involved in its development. This study aims to classify histopathological images of cartilage specimens automatically, using artificial intelligence algorithms. Hematoxylin and eosin (HE)- and safranin O and fast green (SafO)-stained images of human cartilage specimens were divided into early, mild, moderate, and severe OA. Pre-trained convolutional networks (DarkNet-19, MobileNet, ResNet-101, and NasNet) were utilized to extract twenty features from the last fully connected layers for both the SafO and HE scenarios. Principal component analysis (PCA) and ant lion optimization (ALO) were utilized to obtain the best-weighted features. A support vector machine classifier was trained and tested on the selected descriptors, achieving the highest accuracies of 98.04% and 97.03% on HE and SafO, respectively. Using the ALO algorithm, the F1 scores were 0.97, 0.991, 1, and 1 for the HE images and 1, 0.991, 0.97, and 1 for the SafO images for the early, mild, moderate, and severe classes, respectively. This algorithm may be a useful tool for evaluating histopathological images of OA without requiring experts in histopathology scoring systems or the training of new experts. Incorporating automated deep features could help improve the characterization and understanding of OA progression and development.
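The PCA step that precedes feature selection can be sketched via the singular value decomposition; this is a generic implementation for illustration, not tied to the paper's twenty-feature setup:

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature vectors (n_samples, n_features) onto the top-k
    principal components; returns the projections and the components."""
    Xc = X - X.mean(axis=0)                   # center the data
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]              # rows of Vt are directions
```

The reduced vectors would then go to the ALO-weighted SVM stage described in the abstract.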
Affiliation(s)
- Ateka Khader: Department of Biomedical Systems and Informatics Engineering, Hijjawi Faculty for Engineering Technology, Yarmouk University, Irbid 21163, Jordan
- Hiam Alquran: Department of Biomedical Systems and Informatics Engineering, Hijjawi Faculty for Engineering Technology, Yarmouk University, Irbid 21163, Jordan
8. Srinivas K, Gagana Sri R, Pravallika K, Nishitha K, Polamuri SR. COVID-19 prediction based on hybrid Inception V3 with VGG16 using chest X-ray images. Multimed Tools Appl 2023:1-18. PMID: 37362699; PMCID: PMC10240113; DOI: 10.1007/s11042-023-15903-y.
Abstract
The coronavirus first emerged in Wuhan, China, in December 2019. It belongs to the family Coronaviridae and can infect both animals and humans. COVID-19 is typically diagnosed by serology, real-time reverse transcription polymerase chain reaction (RT-PCR), or antigen testing, but these methods have limitations such as limited sensitivity, high cost, and long turnaround time, so an automatic detection system for COVID-19 prediction is needed. Chest X-ray is lower-cost than chest computed tomography (CT), and deep learning, a highly fruitful branch of machine learning, can learn from and screen large numbers of COVID-19 and normal chest X-ray images. Many existing deep learning methods, however, suffer from overfitting, misclassification, and false predictions on poor-quality chest X-rays. To overcome these limitations, a novel hybrid model, "Inception V3 with VGG16 (Visual Geometry Group)" (IV3-VGG), is proposed for COVID-19 prediction from chest X-rays. To build the hybrid model, 243 images were collected from the COVID-19 Radiography Database: 121 COVID-19-positive and 122 normal. The model comprises two modules, pre-processing and IV3-VGG. In pre-processing, images of differing sizes and color intensities are identified and normalized. The IV3-VGG module consists of four blocks: the first is VGG-16, blocks 2 and 3 are Inception V3 networks, and block 4 consists of average pooling, dropout, fully connected, and softmax layers. The experimental results show that the IV3-VGG model achieves the highest accuracy of 98% compared with five prominent deep learning models: Inception V3, VGG16, ResNet50, DenseNet121, and MobileNet.
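The paper stacks whole VGG16 and Inception V3 blocks; a lighter-weight alternative often used for such hybrids is serial feature fusion of the two backbones' embeddings, sketched here purely as an analogy (the function name and normalisation choice are assumptions):

```python
import numpy as np

def fuse_features(feats_a, feats_b):
    """L2-normalise each backbone's feature matrix (n_samples, d_i),
    then concatenate along the feature axis for a shared classifier."""
    na = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    nb = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return np.concatenate([na, nb], axis=1)
```

Normalising before concatenation keeps one backbone's larger activations from dominating the fused vector.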
Affiliation(s)
- K. Srinivas: Department of CSE, VR Siddhartha Engineering College, Vijayawada 520007, India
- R. Gagana Sri: Department of CSE, VR Siddhartha Engineering College, Vijayawada 520007, India
- K. Pravallika: Department of CSE, Sir C. R. Reddy Engineering College, Eluru 534007, India
- K. Nishitha: Department of CSE, VR Siddhartha Engineering College, Vijayawada 520007, India
- Subba Rao Polamuri: Department of CSE, Bonam Venkata Chalamayya Engineering College (Autonomous), Odalarevu 533210, India
9. Rahman MH, Jannat MKA, Islam MS, Grossi G, Bursic S, Aktaruzzaman M. Real-time face mask position recognition system based on MobileNet model. Smart Health (Amst) 2023; 28:100382. PMID: 36743719; PMCID: PMC9886393; DOI: 10.1016/j.smhl.2023.100382.
Abstract
COVID-19 is a highly contagious disease that was first identified in 2019 and has since taken more than six million lives worldwide, while also causing considerable economic, social, cultural, and political turmoil. To limit its spread, the World Health Organization and medical experts have advised properly wearing face masks, social distancing, and hand sanitization, besides vaccination. However, people sometimes wear masks with their mouths and/or noses uncovered, consciously or unconsciously, lessening the protection masks provide. A system capable of automatically recognizing face mask position could ensure that an individual is wearing a mask properly before entering a crowded public area, rather than putting themselves and others at risk. We first develop and publicly release a dataset of face mask images collected from 391 individuals of different age groups and genders. We then study six architectures of pre-trained deep learning models and propose a model obtained by fine-tuning the pre-trained state-of-the-art MobileNet model. We evaluate its performance (accuracy, F1-score, and Cohen's kappa) on the proposed dataset and on MaskedFace-Net, a publicly available synthetic dataset created by image editing, and compare it with existing methods. The proposed MobileNet proves the best model, with an accuracy, F1-score, and Cohen's kappa of 99.23%, 99.22%, and 99.19%, respectively, for face mask position recognition, outperforming the accuracy of the best existing model by about 2%. Finally, an automatic face mask position recognition system has been developed that can recognize whether an individual is wearing a mask correctly or incorrectly, with no drop in recognition accuracy on real images captured by a camera.
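Cohen's kappa, one of the reported metrics, corrects raw agreement for the agreement expected by chance. A self-contained implementation of the standard definition:

```python
import numpy as np

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is chance agreement from the marginal frequencies."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    po = (y_true == y_pred).mean()
    pe = sum((y_true == c).mean() * (y_pred == c).mean() for c in classes)
    return (po - pe) / (1.0 - pe)
```

Kappa of 1 means perfect agreement; 0 means agreement no better than chance, which is why it complements plain accuracy on imbalanced mask-position classes.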
Affiliation(s)
- Md Hafizur Rahman: Department of Electrical and Electronic Engineering, Islamic University, Kushtia 7003, Bangladesh
- Md Shafiqul Islam: Department of Computer Science and Engineering, The People's University of Bangladesh, Dhaka 1207, Bangladesh
- Giuliano Grossi: Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy
- Sathya Bursic: Dipartimento di Informatica, Università degli Studi di Milano, Milan, Italy
- Md Aktaruzzaman: Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
10. Shamsan A, Senan EM, Shatnawi HSA. Automatic Classification of Colour Fundus Images for Prediction Eye Disease Types Based on Hybrid Features. Diagnostics (Basel) 2023; 13:1706. PMID: 37238190; DOI: 10.3390/diagnostics13101706.
Abstract
Early detection of eye diseases is essential for receiving timely treatment and preventing blindness. Colour fundus photography (CFP) is an effective fundus examination technique, but because eye diseases show similar symptoms in their early stages and the disease type is difficult to distinguish, computer-assisted automated diagnostic techniques are needed. This study classifies an eye disease dataset using hybrid techniques based on feature extraction with fusion methods, with three strategies designed to classify CFP images. The first classifies the dataset with an Artificial Neural Network (ANN) using features from the MobileNet and DenseNet121 models separately, after reducing high-dimensional and repetitive features with Principal Component Analysis (PCA). The second classifies it with an ANN on fused MobileNet and DenseNet121 features, both before and after feature reduction. The third classifies it with an ANN on MobileNet and DenseNet121 features separately fused with handcrafted features. Based on the fused MobileNet and handcrafted features, the ANN attained an AUC of 99.23%, an accuracy of 98.5%, a precision of 98.45%, a specificity of 99.4%, and a sensitivity of 98.75%.
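The third strategy fuses deep features with handcrafted ones. A sketch with a per-channel colour histogram standing in for the handcrafted descriptors; both helper names are hypothetical, and the actual handcrafted features used by the authors are not specified in the abstract:

```python
import numpy as np

def colour_histogram(img, bins=8):
    """Handcrafted descriptor: per-channel intensity histogram of an
    RGB fundus image (H, W, 3), concatenated to one vector."""
    return np.concatenate([np.histogram(img[..., c], bins=bins,
                                        range=(0, 256), density=True)[0]
                           for c in range(3)])

def fuse(deep_feat, img):
    """Concatenate a deep feature vector with handcrafted colour features."""
    return np.concatenate([deep_feat, colour_histogram(img)])
```

The fused vector is what the ANN classifier would consume in this strategy.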
Affiliation(s)
- Ahlam Shamsan: Computer Department, Applied College, Najran University, Najran 66462, Saudi Arabia
- Ebrahim Mohammed Senan: Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana'a, Yemen
11. Sajid MZ, Qureshi I, Abbas Q, Albathan M, Shaheed K, Youssef A, Ferdous S, Hussain A. Mobile-HR: An Ophthalmologic-Based Classification System for Diagnosis of Hypertensive Retinopathy Using Optimized MobileNet Architecture. Diagnostics (Basel) 2023; 13:1439. PMID: 37189539; DOI: 10.3390/diagnostics13081439.
Abstract
Hypertensive retinopathy (HR) is a serious eye disease in which the retinal arteries change, mainly because of high blood pressure. Cotton-wool patches, bleeding in the retina, and retinal artery constriction are characteristic HR lesions. Ophthalmologists often diagnose eye-related diseases by analyzing fundus images to identify the stages and symptoms of HR, and early detection can significantly decrease the likelihood of vision loss. In the past, a few computer-aided diagnosis (CADx) systems were developed to automatically detect HR using machine learning (ML) and deep learning (DL) techniques. Compared with ML methods, DL-based CADx systems require hyperparameter tuning, domain expert knowledge, a huge training dataset, and a high learning rate; they automate the extraction of complex features well but suffer from class imbalance and overfitting. State-of-the-art efforts have focused on performance enhancement while overlooking the small size of available HR datasets, high computational complexity, and the lack of lightweight feature descriptors. In this study, a pretrained transfer learning (TL)-based MobileNet architecture is optimized for diagnosing HR by integrating dense blocks, yielding a lightweight diagnosis system known as Mobile-HR. A data augmentation technique was applied to increase the size of the training and test datasets. The experiments show that the suggested approach outperformed alternatives in many cases: Mobile-HR achieved an accuracy of 99% and an F1 score of 0.99 on different datasets, and the results were verified by an expert ophthalmologist. These results indicate that the Mobile-HR CADx model produces positive outcomes and outperforms state-of-the-art HR systems in terms of accuracy.
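Image-level data augmentation of the kind mentioned can be as simple as geometric transforms. An illustrative numpy version (the abstract does not specify the exact transforms the authors used):

```python
import numpy as np

def augment(img):
    """Return the original image plus horizontal and vertical flips
    and the three 90-degree rotations: six samples from one."""
    return [img,
            np.fliplr(img), np.flipud(img),
            np.rot90(img, 1), np.rot90(img, 2), np.rot90(img, 3)]
```

Applied to every training fundus image, this multiplies the effective dataset size sixfold without altering labels.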
Affiliation(s)
- Muhammad Zaheer Sajid
- Department of Computer Software Engineering, MCS, National University of Science and Technology, Islamabad 44000, Pakistan
- Imran Qureshi
- College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
- Qaisar Abbas
- College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
- Mubarak Albathan
- College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
- Kashif Shaheed
- Department of Multimedia Systems, Faculty of Electronics, Telecommunication and Informatics, Gdansk University of Technology, 80-233 Gdansk, Poland
- Ayman Youssef
- Department of Computers and Systems, Electronics Research Institute, Cairo 12622, Egypt
- Sehrish Ferdous
- Department of Software Engineering, National University of Modern Languages, Rawalpindi 44000, Pakistan
- Ayyaz Hussain
- Department of Computer Science, Quaid-i-Azam University, Islamabad 44000, Pakistan
12
Mengash HA, Alamgeer M, Maashi M, Othman M, Hamza MA, Ibrahim SS, Zamani AS, Yaseen I. Leveraging Marine Predators Algorithm with Deep Learning for Lung and Colon Cancer Diagnosis. Cancers (Basel) 2023; 15. [PMID: 36900381 DOI: 10.3390/cancers15051591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2023] [Revised: 02/04/2023] [Accepted: 02/08/2023] [Indexed: 03/08/2023] Open
Abstract
Cancer is a deadly disease caused by various biochemical abnormalities and genetic diseases. Colon and lung cancer have developed into two major causes of disability and death in human beings. The histopathological detection of these malignancies is a vital element in determining the optimal treatment. Timely and early diagnosis of the sickness on either front diminishes the possibility of death. Deep learning (DL) and machine learning (ML) methods are used to hasten such cancer recognition, allowing the research community to examine more patients in a much shorter period and at lower cost. This study introduces a marine predators algorithm with deep learning as a lung and colon cancer classification (MPADL-LC3) technique. The presented MPADL-LC3 technique aims to properly discriminate different types of lung and colon cancer on histopathological images. To accomplish this, the MPADL-LC3 technique employs CLAHE-based contrast enhancement as a pre-processing step. In addition, the MPADL-LC3 technique applies MobileNet for feature vector generation. Meanwhile, the MPADL-LC3 technique employs the marine predators algorithm (MPA) as a hyperparameter optimizer. Furthermore, deep belief networks (DBN) are applied for lung and colon cancer classification. The simulation values of the MPADL-LC3 technique were examined on benchmark datasets. The comparison study highlighted the enhanced outcomes of the MPADL-LC3 system in terms of different measures.
13
Fatima A, Shafi I, Afzal H, Mahmood K, Díez IDLT, Lipari V, Ballester JB, Ashraf I. Deep Learning-Based Multiclass Instance Segmentation for Dental Lesion Detection. Healthcare (Basel) 2023; 11:healthcare11030347. [PMID: 36766922 PMCID: PMC9914729 DOI: 10.3390/healthcare11030347] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Revised: 01/16/2023] [Accepted: 01/19/2023] [Indexed: 01/27/2023] Open
Abstract
Automated dental imaging interpretation is one of the most prolific areas of research using artificial intelligence. X-ray imaging systems have enabled dental clinicians to identify dental diseases. However, the manual process of dental disease assessment is tedious and error-prone when diagnosed by inexperienced dentists. Thus, researchers have employed different advanced computer vision techniques, as well as machine and deep learning models, for dental disease diagnosis using X-ray imagery. In this regard, a lightweight Mask-RCNN model is proposed for periapical disease detection. The proposed model is constructed in two parts: a lightweight modified MobileNet-v2 backbone and a region proposal network (RPN) are proposed for periapical disease localization on a small dataset. To measure the effectiveness of the proposed model, the lightweight Mask-RCNN is evaluated on a custom annotated dataset comprising images of five different types of periapical lesions. The results reveal that the model can detect and localize periapical lesions with an overall accuracy of 94%, a mean average precision of 85%, and a mean intersection over union (IoU) of 71.0%. The proposed model improves the detection, classification, and localization accuracy significantly using a smaller number of images compared to existing methods, and outperforms state-of-the-art approaches.
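The mean intersection over union reported for this model is a standard localization metric. As a generic illustration (not code from the cited paper), IoU for two axis-aligned boxes given as `(x1, y1, x2, y2)` can be computed like this:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A "mean IoU of 71%" then means this ratio, averaged over all predicted/ground-truth lesion pairs in the test set.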
Affiliation(s)
- Anum Fatima
- National Centre for Robotics, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
- Imran Shafi
- College of Electrical and Mechanical Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
- Hammad Afzal
- Military College of Signals (MCS), National University of Sciences and Technology (NUST), Rawalpindi 44000, Pakistan
- Khawar Mahmood
- Military College of Signals (MCS), National University of Sciences and Technology (NUST), Rawalpindi 44000, Pakistan
- Isabel de la Torre Díez
- Department of Signal Theory and Communications and Telematic Engineering, University of Valladolid, Paseo de Belén 15, 47011 Valladolid, Spain
- Correspondence: (I.d.l.T.D.); (I.A.)
- Vivian Lipari
- Research Group on Foods, Nutritional Biochemistry and Health, Universidad Europea del Atlántico, Isabel Torres 21, 39011 Santander, Spain
- Department of Project Management, Universidad Internacional Iberoamericana, Campeche 24560, Mexico
- Fundación Universitaria Internacional de Colombia Bogotá, Bogotá 11001, Colombia
- Julien Brito Ballester
- Research Group on Foods, Nutritional Biochemistry and Health, Universidad Europea del Atlántico, Isabel Torres 21, 39011 Santander, Spain
- Department of Project Management, Universidad Internacional Iberoamericana Arecibo, Arecibo, PR 00613, USA
- Project Management, Universidade Internacional do Cuanza, Cuito EN250, Bié, Angola
- Imran Ashraf
- Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
- Correspondence: (I.d.l.T.D.); (I.A.)
14
Kaya Y, Gürsoy E. A MobileNet-based CNN model with a novel fine-tuning mechanism for COVID-19 infection detection. Soft comput 2023; 27:5521-5535. [PMID: 36618761 PMCID: PMC9812349 DOI: 10.1007/s00500-022-07798-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 12/24/2022] [Indexed: 01/05/2023]
Abstract
COVID-19 is a virus that causes upper respiratory tract and lung infections. The number of cases and deaths increased daily during the pandemic. Since it is vital to diagnose such a disease in a timely manner, researchers have focused on computer-aided diagnosis systems. Chest X-rays have helped monitor various lung diseases, including COVID-19. In this study, we proposed a deep transfer learning approach with novel fine-tuning mechanisms to classify COVID-19 from chest X-ray images. We presented one classical and two new fine-tuning mechanisms to increase the model's performance. Two publicly available databases were combined and used for the study, comprising 3616 COVID-19, 1576 normal (healthy), and 4265 pneumonia X-ray images. The models achieved average accuracy rates of 95.62%, 96.10%, and 97.61%, respectively, for the 3-class case with fivefold cross-validation. Numerical results show that the third model eliminated 81.92% of the total fine-tuning operations and achieved better results. The proposed approach is quite efficient compared with other state-of-the-art methods of detecting COVID-19.
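Several entries in this list evaluate with k-fold cross-validation (fivefold here). As a minimal, framework-free sketch of the index partitioning behind that protocol (a hypothetical helper, not the authors' code):

```python
def kfold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k (train, val) partitions."""
    # Distribute any remainder across the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    splits = []
    for i in range(k):
        val = folds[i]                      # fold i held out for validation
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        splits.append((train, val))
    return splits
```

Reported "average accuracy" is then the mean of the per-fold validation accuracies over the k splits.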
Affiliation(s)
- Yasin Kaya
- Department of Computer Engineering, Adana Alparslan Turkes Science and Technology University, Adana, Turkey
- Ercan Gürsoy
- Department of Computer Engineering, Adana Alparslan Turkes Science and Technology University, Adana, Turkey
15
Tseng FH, Yeh KH, Kao FY, Chen CY. MiniNet: Dense squeeze with depthwise separable convolutions for image classification in resource-constrained autonomous systems. ISA Trans 2023; 132:120-130. [PMID: 36038366 DOI: 10.1016/j.isatra.2022.07.030] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Revised: 07/26/2022] [Accepted: 07/26/2022] [Indexed: 06/15/2023]
Abstract
In recent years, artificial intelligence (AI) has developed vigorously, and a great number of autonomous AI applications have been proposed. However, how to decrease computation and shorten training time while maintaining high accuracy under limited hardware resources is a vital issue. In this paper, on the basis of the MobileNet architecture, a dense squeeze with depthwise separable convolutions model, viz. MiniNet, is proposed. MiniNet utilizes depthwise and pointwise convolutions, and is composed of the dense connection technique and Squeeze-and-Excitation operations. The proposed MiniNet model is implemented and experimented with in Keras. In the experiments, MiniNet is compared with three existing models, i.e., DenseNet, MobileNet, and SE-Inception-Resnet-v1. To validate that the proposed MiniNet model requires less computation and shorter training time, both large and small datasets of two types are used. The experimental results showed that the proposed MiniNet model significantly reduces the number of parameters and shortens training time efficiently. MiniNet is superior to the other models in terms of fewest parameters, shortest training time, and highest accuracy, especially when the dataset is small.
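The parameter savings that MiniNet and the MobileNet family rely on come from factoring a standard convolution into a depthwise convolution followed by a 1 x 1 pointwise convolution. A back-of-the-envelope comparison using the standard weight-count formulas (generic arithmetic, biases omitted; not from the cited paper):

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: k * k * c_in * c_out."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    return k * k * c_in + c_in * c_out

# Example layer: 3 x 3 kernel, 128 input channels, 256 output channels.
standard = conv_params(3, 128, 256)       # 294,912 weights
separable = separable_params(3, 128, 256)  # 33,920 weights, ~8.7x fewer
```

For typical channel counts the separable form is roughly k^2 times cheaper, which is why these blocks dominate lightweight architectures.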
Affiliation(s)
- Fan-Hsun Tseng
- Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan.
- Kuo-Hui Yeh
- Department of Information Management, National Dong Hwa University, Hualien, Taiwan; Computer Science and Engineering Department of National Sun Yat-sen University, Kaohsiung 804, Taiwan.
- Fan-Yi Kao
- Department of Technology Application and Human Resource Development, National Taiwan Normal University, Taipei, Taiwan.
- Chi-Yuan Chen
- Department of Computer Science and Information Engineering, National Ilan University, Yilan, Taiwan.
16
Turk O, Ozhan D, Acar E, Akinci TC, Yilmaz M. Automatic detection of brain tumors with the aid of ensemble deep learning architectures and class activation map indicators by employing magnetic resonance images. Z Med Phys 2022:S0939-3889(22)00131-3. [PMID: 36593139 DOI: 10.1016/j.zemedi.2022.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Accepted: 11/25/2022] [Indexed: 01/01/2023]
Abstract
Today, as with every life-threatening disease, early diagnosis of brain tumors plays a life-saving role. A brain tumor is formed by the transformation of brain cells from their normal structures into abnormal cell structures. These abnormal cells then form masses in regions of the brain. Nowadays, many different techniques are employed to detect these tumor masses, and the most common of these techniques is Magnetic Resonance Imaging (MRI). This study aims to automatically detect brain tumors with the help of ensemble deep learning architectures (ResNet50, VGG19, InceptionV3 and MobileNet) and Class Activation Map (CAM) indicators by employing MRI images. The proposed system was implemented in three stages. In the first stage, it was determined whether there was a tumor in the MR images (binary approach). In the second stage, different tumor types (Normal, Glioma Tumor, Meningioma Tumor, Pituitary Tumor) were detected from MR images (multi-class approach). In the last stage, CAMs of each tumor group were created as an alternative tool to facilitate the work of specialists in tumor detection. The results showed that the overall accuracy of the binary approach was 100% on the ResNet50, InceptionV3 and MobileNet architectures, and 99.71% on the VGG19 architecture. Moreover, accuracy values of 96.45% with ResNet50, 93.40% with VGG19, 85.03% with InceptionV3 and 89.34% with MobileNet were obtained in the multi-class approach.
Affiliation(s)
- Omer Turk
- Department of Computer Programming, Vocational School, Mardin Artuklu University, 47500 Mardin, Turkey.
- Davut Ozhan
- Department of Electronics, Vocational School, Mardin Artuklu University, 47500 Mardin, Turkey.
- Emrullah Acar
- Department of Electrical-Electronics Engineering, Architecture and Engineering Faculty, Batman University, Batman, Turkey.
- Tahir Cetin Akinci
- WCGEC, University of California Riverside, Riverside, CA, USA; Department of Electrical Engineering, Istanbul Technical University, Istanbul, Turkey.
- Musa Yilmaz
- Department of Electrical-Electronics Engineering, Architecture and Engineering Faculty, Batman University, Batman, Turkey.
17
Shinde RK, Alam MS, Hossain MB, Md Imtiaz S, Kim J, Padwal AA, Kim N. Squeeze-MNet: Precise Skin Cancer Detection Model for Low Computing IoT Devices Using Transfer Learning. Cancers (Basel) 2022; 15:cancers15010012. [PMID: 36612010 PMCID: PMC9817940 DOI: 10.3390/cancers15010012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2022] [Revised: 12/15/2022] [Accepted: 12/16/2022] [Indexed: 12/24/2022] Open
Abstract
Cancer remains a deadly disease. We developed a lightweight, accurate, general-purpose deep learning algorithm for skin cancer classification. Squeeze-MNet combines a Squeeze algorithm for digital hair removal during preprocessing and a MobileNet deep learning model with predefined weights. The Squeeze algorithm extracts important image features from the image, and the black-hat filter operation removes noise. The MobileNet model (with a dense neural network) was developed using the International Skin Imaging Collaboration (ISIC) dataset to fine-tune the model. The proposed model is lightweight; the prototype was tested on a Raspberry Pi 4 Internet of Things device with a Neo pixel 8-bit LED ring; a medical doctor validated the device. The average precision (AP) for benign and malignant diagnoses was 99.76% and 98.02%, respectively. Using our approach, the required dataset size decreased by 66%. The hair removal algorithm increased the accuracy of skin cancer detection to 99.36% with the ISIC dataset. The area under the receiver operating curve was 98.9%.
Affiliation(s)
- Rupali Kiran Shinde
- Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea
- Md. Biddut Hossain
- Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea
- Shariar Md Imtiaz
- Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea
- JoonHyun Kim
- Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea
- Nam Kim
- Department of Information and Communication Engineering, Chungbuk National University, Cheongju 28644, Republic of Korea
- Correspondence:
18
Xiong Q, Zhang X, Wang X, Qiao N, Shen J. Robust Iris-Localization Algorithm in Non-Cooperative Environments Based on the Improved YOLO v4 Model. Sensors (Basel) 2022; 22:9913. [PMID: 36560280 PMCID: PMC9785435 DOI: 10.3390/s22249913] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Revised: 12/13/2022] [Accepted: 12/13/2022] [Indexed: 06/17/2023]
Abstract
Iris localization in non-cooperative environments is challenging and essential for accurate iris recognition. Motivated by the traditional iris-localization algorithm and the robustness of the YOLO model, we propose a novel iris-localization algorithm. First, we design a novel iris detector with a modified you only look once v4 (YOLO v4) model, which approximates the position of the pupil center. Then, we use a modified integro-differential operator to precisely locate the inner and outer boundaries of the iris. Experimental results show that iris-detection accuracy can reach 99.83% with this modified YOLO v4 model, which is higher than that of a traditional YOLO v4 model. The accuracy in locating the inner and outer boundaries of the iris without glasses reaches 97.72% at a short distance and 98.32% at a long distance; with glasses, the corresponding accuracies are 93.91% and 84%, respectively. This is much higher than the traditional Daugman's algorithm. Extensive experiments conducted on multiple datasets demonstrate the effectiveness and robustness of our method for iris localization in non-cooperative environments.
Affiliation(s)
- Qi Xiong
- MOE Key Lab for Intelligent Networks and Network Security, School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- International College, Hunan University of Arts and Sciences, Changde 415000, China
- Xinman Zhang
- MOE Key Lab for Intelligent Networks and Network Security, School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China
- Xingzhu Wang
- International College, Hunan University of Arts and Sciences, Changde 415000, China
- Naosheng Qiao
- International College, Hunan University of Arts and Sciences, Changde 415000, China
- Jun Shen
- School of Computing and Information Technology, University of Wollongong, Wollongong, NSW 2522, Australia
19
Asghar MZ, Albogamy FR, Al-Rakhami MS, Asghar J, Rahmat MK, Alam MM, Lajis A, Nasir HM. Facial Mask Detection Using Depthwise Separable Convolutional Neural Network Model During COVID-19 Pandemic. Front Public Health 2022; 10:855254. [PMID: 35321193 PMCID: PMC8936807 DOI: 10.3389/fpubh.2022.855254] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Accepted: 01/31/2022] [Indexed: 11/13/2022] Open
Abstract
Deep neural networks have made tremendous strides in the categorization of facial photos in the last several years. Due to the complexity of features, the enormous size of the picture/frame, and the severe inhomogeneity of image data, efficient face image classification using deep convolutional neural networks remains a challenge. Therefore, as data volumes continue to grow, the effective categorization of face photos in a mobile context utilizing advanced deep learning techniques is becoming increasingly important. In the recent past, some Deep Learning (DL) approaches for learning to identify face images have been designed; many of them use convolutional neural networks (CNNs). To address the problem of face mask recognition in facial images, we propose to use a Depthwise Separable Convolution Neural Network based on MobileNet (DWS-based MobileNet). The proposed network utilizes depth-wise separable convolution layers instead of 2D convolution layers. With limited datasets, the DWS-based MobileNet performs exceptionally well. DWS-based MobileNet decreases the number of trainable parameters while enhancing learning performance by adopting a lightweight network. Our technique outperformed the existing state of the art when tested on benchmark datasets. When compared to Full Convolution MobileNet and baseline methods, the results of this study reveal that adopting Depthwise Separable Convolution-based MobileNet significantly improves performance (Acc. = 93.14, Pre. = 92, recall = 92, F-score = 92).
Affiliation(s)
- Muhammad Zubair Asghar
- Center for Research & Innovation, CoRI, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
- Institute of Computing and Information Technology, Gomal University, Dera Ismail Khan, Pakistan
- Fahad R. Albogamy
- Computer Sciences Program, Turabah University College, Taif University, Taif, Saudi Arabia
- Mabrook S. Al-Rakhami
- Research Chair of Pervasive and Mobile Computing, Information Systems Department, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Junaid Asghar
- Faculty of Pharmacy, Gomal University, Dera Ismail Khan, Pakistan
- Mohd Khairil Rahmat
- Center for Research & Innovation, CoRI, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
- Muhammad Mansoor Alam
- Center for Research & Innovation, CoRI, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
- Faculty of Computing, Riphah International University, Islamabad, Pakistan
- Malaysian Institute of Information Technology, University of Kuala Lumpur, Kuala Lumpur, Malaysia
- Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia
- Faculty of Engineering and Information Technology, School of Computer Science, University of Technology Sydney, Ultimo, NSW, Australia
- Adidah Lajis
- Center for Research & Innovation, CoRI, Universiti Kuala Lumpur, Kuala Lumpur, Malaysia
20
Tiwari S, Jain A. A lightweight capsule network architecture for detection of COVID-19 from lung CT scans. Int J Imaging Syst Technol 2022; 32:419-434. [PMID: 35465213 PMCID: PMC9015631 DOI: 10.1002/ima.22706] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/03/2021] [Revised: 11/22/2021] [Accepted: 01/04/2022] [Indexed: 05/28/2023]
Abstract
COVID-19, a novel coronavirus, has spread quickly and produced a worldwide respiratory ailment outbreak. There is a need for large-scale screening to prevent the spreading of the disease. Compared with the reverse transcription polymerase chain reaction (RT-PCR) test, computed tomography (CT) is far more consistent, concrete, and precise in detecting COVID-19 patients through clinical diagnosis. An architecture based on deep learning has been proposed by integrating a capsule network with different variants of convolutional neural networks. DenseNet, ResNet, VGGNet, and MobileNet are utilized with CapsNet to detect COVID-19 cases using lung computed tomography scans. It was found that all four models provide adequate accuracy, among which the VGGCapsNet, DenseCapsNet, and MobileCapsNet models gained the highest accuracy of 99%. An Android-based app can be deployed using the MobileCapsNet model to detect COVID-19, as it is a lightweight model and best suited for handheld devices such as mobile phones.
Affiliation(s)
- Shamik Tiwari
- School of Computer Science, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
- Anurag Jain
- School of Computer Science, University of Petroleum and Energy Studies, Dehradun, Uttarakhand, India
21
Trivedi M, Gupta A. A lightweight deep learning architecture for the automatic detection of pneumonia using chest X-ray images. Multimed Tools Appl 2022; 81:5515-5536. [PMID: 34975283 PMCID: PMC8711865 DOI: 10.1007/s11042-021-11807-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/08/2021] [Revised: 08/26/2021] [Accepted: 12/14/2021] [Indexed: 05/07/2023]
Abstract
Pneumonia is a life-threatening respiratory lung disease. Children are more prone to be affected by the disease, and accurate manual detection is not easy. Generally, chest radiographs are used for the manual detection of pneumonia, and expert radiologists are required for the assessment of the X-ray images. An automatic system would be beneficial for the diagnosis of pneumonia based on chest radiographs, as manual detection is time-consuming and tedious. Therefore, a method is proposed in this paper for the fast and automatic detection of pneumonia. A deep learning-based architecture, 'MobileNet', is proposed for the automatic detection of pneumonia based on chest X-ray images. A benchmark dataset of 5856 chest X-ray images was taken for the training, testing, and evaluation of the proposed deep learning network. The proposed model was trained within 3 hours and achieved a training accuracy of 97.34%, a validation accuracy of 87.5%, and a testing accuracy of 94.23% for automatic detection of pneumonia. The combined accuracy was 97.09%, with 0.96 specificity, 0.97 precision, 0.98 recall, and a 0.97 F-score. The proposed method was found to be faster and less computationally expensive than other methods in the literature while achieving a promising accuracy.
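The specificity, precision, recall, and F-score quoted above all derive from the four confusion-matrix counts. A minimal sketch of those standard definitions (generic formulas, not the authors' evaluation code):

```python
def metrics(tp, fp, fn, tn):
    """Classification metrics from confusion-matrix counts (binary case)."""
    precision = tp / (tp + fp)            # of predicted positives, how many are right
    recall = tp / (tp + fn)               # of actual positives, how many are found
    specificity = tn / (tn + fp)          # of actual negatives, how many are found
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "accuracy": accuracy, "f_score": f_score}
```

Note that with imbalanced classes (far more healthy than pneumonia images, or vice versa) accuracy alone can be misleading, which is why the entry reports all five values.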
Affiliation(s)
- Megha Trivedi
- School of Electronics and Communication Engineering, Shri Mata Vaishno Devi University, Kakryal, Katra, Jammu and Kashmir 182 320, India
- Abhishek Gupta
- School of Computer Science & Engineering, Shri Mata Vaishno Devi University, Kakryal, Katra, Jammu and Kashmir 182 320, India
22
Zha M, Qian W, Yi W, Hua J. A Lightweight YOLOv4-Based Forestry Pest Detection Method Using Coordinate Attention and Feature Fusion. Entropy (Basel) 2021; 23:1587. [PMID: 34945892 PMCID: PMC8700145 DOI: 10.3390/e23121587] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/15/2021] [Revised: 11/12/2021] [Accepted: 11/22/2021] [Indexed: 11/16/2022]
Abstract
Traditional pest detection methods are challenging to use in complex forestry environments due to their low accuracy and speed. To address this issue, this paper proposes the YOLOv4_MF model. The YOLOv4_MF model utilizes MobileNetv2 as the feature extraction block and replaces the traditional convolution with depthwise separable convolution to reduce the model parameters. In addition, a coordinate attention mechanism was embedded in MobileNetv2 to enhance feature information. A symmetric structure consisting of a three-layer spatial pyramid pool is presented, and an improved feature fusion structure was designed to fuse the target information. For the loss function, focal loss was used instead of cross-entropy loss to enhance the network's learning of small targets. The experimental results showed that the YOLOv4_MF model has 4.24% higher mAP, 4.37% higher precision, and 6.68% higher recall than the YOLOv4 model. The size of the proposed model was reduced to 1/6 of that of YOLOv4. Moreover, the proposed algorithm achieved 38.62% mAP on the COCO dataset, compared with some state-of-the-art algorithms.
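Focal loss, used above in place of cross-entropy, down-weights well-classified examples by the factor (1 - p_t)^gamma so that training gradients concentrate on hard (often small) targets. A minimal binary form with the commonly used gamma/alpha defaults (a generic sketch, not the paper's implementation):

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction p = P(class 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1 - p          # probability assigned to the true class
    a_t = alpha if y == 1 else 1 - alpha  # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of confident, easy examples toward zero.
    return -a_t * (1 - p_t) ** gamma * math.log(p_t)
```

With gamma = 0 and alpha = 0.5 this reduces (up to a constant factor) to ordinary cross-entropy, so gamma directly controls how strongly easy examples are suppressed.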
Affiliation(s)
- Wenbin Qian
- School of Software, Jiangxi Agricultural University, Nanchang 330045, China; (M.Z.); (W.Y.); (J.H.)
23
Gang L, Haixuan Z, Linning E, Ling Z, Yu L, Juming Z. Recognition of honeycomb lung in CT images based on improved MobileNet model. Med Phys 2021; 48:4304-4315. [PMID: 33826769 DOI: 10.1002/mp.14873] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 03/10/2021] [Accepted: 03/26/2021] [Indexed: 11/07/2022] Open
Abstract
PURPOSE The research aims to improve the efficiency and accuracy of recognition of honeycomb lung in CT images. METHODS Deep learning methods can achieve automatic recognition of honeycomb lung in CT images but are time consuming and less accurate due to their large number of structural parameters. In this paper, a novel recognition method based on the MobileNetV1 network, a multiscale feature fusion (MSFF) method, and dilated convolution is explored for honeycomb lung CT image classification. Firstly, dilated convolutions with different dilation rates are used to extract features with receptive fields of different sizes, and the multiscale feature fusion block then fuses features of different scales to address feature loss and incomplete feature extraction. After that, Sigmoid activation functions are used instead of ReLU in the improved depthwise separable convolution blocks to retain the feature information of each channel. Finally, the number of improved depthwise separable blocks is reduced to cut the computation and resource consumption of the model. RESULTS The experimental results show that the improved MobileNet model has the best performance on the honeycomb lung image dataset, which includes 6318 images. Compared with 4 traditional models (SVM, RF, decision tree, and KNN) and 11 deep learning models (LeNet-5, AlexNet, VGG-16, GoogleNet, ResNet18, DenseNet121, SENet18, InceptionV3, InceptionV4, Xception, and MobileNetV1), our model achieved an accuracy of 99.52%, a sensitivity of 99.35%, and a specificity of 99.89%. CONCLUSION The improved MobileNet model is designed for the automatic recognition and classification of honeycomb lung in CT images. Experimental comparison with other machine learning and deep learning models shows that the proposed improved MobileNet method achieves the best recognition accuracy with fewer model parameters and less calculation time.
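Dilated convolution, as used in this method, enlarges the receptive field without adding weights: a k x k kernel with dilation rate d covers an effective extent of k + (k - 1)(d - 1). A small sketch of that arithmetic for stride-1 stacks (generic formulas, not the authors' code):

```python
def effective_kernel(k, dilation):
    """Effective extent of a k x k convolution with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)

def stacked_receptive_field(layers):
    """Receptive field of a stack of (kernel, dilation) stride-1 conv layers."""
    rf = 1
    for k, d in layers:
        rf += effective_kernel(k, d) - 1  # each layer widens the field additively
    return rf
```

For example, three 3 x 3 layers with dilation rates 1, 2, and 4 see a 15-pixel-wide region while using only as many weights as three ordinary 3 x 3 convolutions, which is why dilation is a cheap way to capture context at multiple scales.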
Affiliation(s)
- Li Gang
- College of Software, Taiyuan University of Technology, Taiyuan, China
- Zhang Haixuan
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
- E Linning
- Shanxi Bethune Hospital, Taiyuan, China
- Zhang Ling
- College of Software, Taiyuan University of Technology, Taiyuan, China
- Li Yu
- College of Data Science, Taiyuan University of Technology, Taiyuan, China
- Zhao Juming
- College of Information and Computer, Taiyuan University of Technology, Taiyuan, China
24
Loveymi S, Dezfoulian MH, Mansoorizadeh M. Automatic Generation of Structured Radiology Reports for Volumetric Computed Tomography Images Using Question-Specific Deep Feature Extraction and Learning. J Med Signals Sens 2021; 11:194-207. [PMID: 34466399 PMCID: PMC8382036 DOI: 10.4103/jmss.jmss_21_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2020] [Revised: 06/20/2020] [Accepted: 09/23/2020] [Indexed: 11/04/2022]
Abstract
BACKGROUND In today's modern medicine, the use of radiological imaging devices has spread to medical centers. Therefore, the need for accurate, reliable, and portable medical image analysis and understanding systems has been increasing constantly. Accompanying images with the required clinical information, in the form of structured reports, is very important, because images play a pivotal role in the detection, planning, and diagnosis of different diseases. Report-writing can be error-prone, tedious, and labor-intensive for physicians and radiologists; to address these issues, there is a need for systems that generate medical image reports automatically and efficiently. Thus, automatic report generation systems are among the most desired applications. METHODS This research proposes an automatic structured-radiology report generation system that is based on deep learning methods. Extracting useful and descriptive image features to model the conceptual contents of the images is one of the main challenges in this regard. Considering the ability of deep neural networks (DNNs) to extract informative and effective features, as well as their lower resource requirements, tailored convolutional neural networks and MobileNets are employed as the main building blocks of the proposed system. To cope with challenges such as multi-slice medical images and the diversity of questions asked in a radiology report, our system develops volume-level and question-specific deep features using DNNs. RESULTS We demonstrate the effectiveness of the proposed system on the ImageCLEF2015 Liver computed tomography (CT) annotation task, for filling in a structured radiology report about liver CT. The results confirm the efficiency of the proposed approach, as compared to classic annotation methods. CONCLUSION We have proposed a question-specific DNN-based system for filling in structured radiology reports about medical images.
Affiliation(s)
- Samira Loveymi
- Department of Computer Engineering, Bu-Ali Sina University, Hamedan, Iran
25
Olivas LG, Alférez GH, Castillo J. Glaucoma detection in Latino population through OCT's RNFL thickness map using transfer learning. Int Ophthalmol 2021; 41:3727-3741. [PMID: 34212255 DOI: 10.1007/s10792-021-01931-w]
Abstract
PURPOSE Glaucoma is the leading cause of irreversible blindness worldwide. It is estimated that over 60 million people around the world have this disease, and only part of them know they have it. Timely and early diagnosis is vital to delay or prevent patient blindness. Deep learning (DL) could be a tool for ophthalmologists to give a more informed and objective diagnosis. However, there is a lack of studies applying DL for glaucoma detection to the Latino population. Our contribution is to use transfer learning to retrain MobileNet and Inception V3 models with images of the retinal nerve fiber layer (RNFL) thickness map of Mexican patients, obtained with optical coherence tomography (OCT) from the Instituto de la Visión, a clinic in northern Mexico. METHODS The IBM Foundational Methodology for Data Science was used in this study. The MobileNet and Inception V3 topologies were chosen as the analytical approaches to classify OCT images into two classes, namely glaucomatous and non-glaucomatous. The OCT files were collected from a Zeiss OCT machine at the Instituto de la Visión and classified by an expert into the two classes under study. These images form a dataset of 333 files in total. Since this research focuses on RNFL thickness map images, the OCT files were cropped to obtain only the RNFL thickness map of the corresponding eye, for images in both classes. Images that were damaged (with black spots of missing data) were cropped or excluded. After the preparation process, 50 images per class were used for training. Fifteen images per class, different from the ones used in the training stage, were used for running predictions. In total, 260 images were used in the experiments, 130 per eye. Four models were generated: two trained with MobileNet, one for the left eye and one for the right eye, and another two trained with Inception V3. TensorFlow was used for running transfer learning. RESULTS The evaluation results of the MobileNet model for the left eye are accuracy: 86%, precision: 87%, recall: 87%, and F1 score: 87%. The evaluation results of the MobileNet model for the right eye are accuracy: 90%, precision: 90%, recall: 90%, and F1 score: 90%. The Inception V3 model achieved the same results, 90% on all four metrics, for both the left and the right eye. CONCLUSION On average, the evaluation results for right-eye images were the same for both models. The Inception V3 model showed slightly better average results than the MobileNet model when classifying left-eye images.
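The transfer-learning recipe in this entry can be sketched in miniature. The pure-Python toy below stands in for the Keras/TensorFlow pipeline: `frozen_backbone` plays the role of a pretrained MobileNet or Inception V3 feature extractor whose weights are never updated, and only a small logistic-regression head is retrained on the new two-class task. Every function, value, and data point here is invented for illustration and is not the study's actual pipeline.

```python
import math

def frozen_backbone(x):
    """Stand-in for a pretrained network: maps raw input to features.
    Its 'weights' are fixed and never updated during head training."""
    return [x[0] + x[1], x[0] - x[1]]

def train_head(samples, labels, lr=0.5, epochs=200):
    """Logistic-regression head trained on frozen backbone features."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = frozen_backbone(x)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss w.r.t. z
            w[0] -= lr * g * f[0]
            w[1] -= lr * g * f[1]
            b -= lr * g
    return w, b

def predict(w, b, x):
    f = frozen_backbone(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Toy two-class data ("glaucomatous" = 1 vs "non-glaucomatous" = 0)
xs = [(0.9, 0.8), (0.8, 0.9), (0.1, 0.2), (0.2, 0.1)]
ys = [1, 1, 0, 0]
w, b = train_head(xs, ys)
print([predict(w, b, x) for x in xs])
```

The design point is that only the tiny head is optimized, which is why transfer learning works with as few as 50 training images per class, as in the study above.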
Affiliation(s)
- Liza G Olivas
- School of Engineering and Technology, Universidad de Montemorelos, Montemorelos, NL, Mexico
- Germán H Alférez
- School of Engineering and Technology, Universidad de Montemorelos, Montemorelos, NL, Mexico.
- Javier Castillo
- School of Medicine, Universidad de Montemorelos, Montemorelos, NL, Mexico
26
Arora V, Ng EYK, Leekha RS, Darshan M, Singh A. Transfer learning-based approach for detecting COVID-19 ailment in lung CT scan. Comput Biol Med 2021; 135:104575. [PMID: 34153789 PMCID: PMC8196483 DOI: 10.1016/j.compbiomed.2021.104575]
Abstract
This research work aims to identify COVID-19 through deep learning models using lung CT-scan images. To enhance the CT scans, a super-resolution residual dense neural network was applied. The experimentation was carried out using benchmark datasets, namely SARS-COV-2 CT-Scan and Covid-CT Scan. To classify the enhanced CT scans as COVID-19 positive or negative, existing pre-trained models such as XceptionNet, MobileNet, InceptionV3, DenseNet, ResNet50, and VGG16 (Visual Geometry Group) were used. Super-resolving the CT scans with the residual dense neural network in the pre-processing step improved the accuracy, F1 score, precision, and recall of the proposed model. On the Covid-CT Scan and SARS-COV-2 CT-Scan datasets, the MobileNet model provided a precision of 94.12% and 100%, respectively.
Affiliation(s)
- Vinay Arora
- Computer Science & Engineering Department, Thapar Institute of Engineering and Technology, Patiala, Punjab, India.
- Eddie Yin-Kwee Ng
- School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore.
- Medhavi Darshan
- Department of Mathematics, Kamala Nehru College, University of Delhi, Delhi, India.
27
Qin N, Liu L, Huang D, Wu B, Zhang Z. LeanNet: An Efficient Convolutional Neural Network for Digital Number Recognition in Industrial Products. Sensors (Basel) 2021; 21:3620. [PMID: 34067467 DOI: 10.3390/s21113620]
Abstract
The remarkable success of convolutional neural networks (CNNs) in computer vision tasks has been demonstrated on large-scale datasets and high-performance computing platforms. However, it is infeasible to deploy large CNNs on resource-constrained platforms, such as embedded devices, on account of the huge overhead. To recognize the label numbers of industrial black-material products and deploy deep CNNs in real-world applications, this research uses an efficient method to simultaneously (a) reduce the network model size and (b) lower the amount of calculation, without compromising accuracy. More specifically, the method prunes channels and corresponding filters that are identified as having a trivial effect on the output accuracy. In this paper, we prune VGG-16 to obtain a compact network called LeanNet, which gives a 25× reduction in model size and a 4.5× reduction in floating point operations (FLOPs), while the accuracy on our dataset stays close to the original accuracy after retraining the network. We also find that LeanNet achieves larger reductions in model size and computation than some lightweight networks like MobileNet and SqueezeNet, which are widely used in engineering applications. This research has good application value in the field of industrial production.
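The pruning idea described above (drop filters that contribute little to the output) is commonly approximated by ranking filters by the L1 norm of their weights. A minimal, self-contained sketch of that criterion follows; the filter values and the 50% pruning ratio are made up for illustration, and a real pruner must also remove the matching input channels of the next layer and retrain, as the paper does.

```python
def l1_norm(filt):
    """Sum of absolute weights of one convolutional filter."""
    return sum(abs(w) for row in filt for w in row)

def prune_filters(filters, keep_ratio=0.5):
    """Keep the keep_ratio fraction of filters with the largest L1 norm."""
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]), reverse=True)
    kept = sorted(ranked[:max(1, int(len(filters) * keep_ratio))])
    return [filters[i] for i in kept], kept

# Four 2x2 filters; the two near-zero ones should be pruned away.
filters = [
    [[0.9, -0.8], [0.7, 0.6]],     # strong
    [[0.01, 0.02], [-0.01, 0.0]],  # weak
    [[-0.5, 0.4], [0.6, -0.7]],    # strong
    [[0.0, 0.03], [0.02, -0.01]],  # weak
]
pruned, kept_idx = prune_filters(filters, keep_ratio=0.5)
print(kept_idx)  # indices of the surviving filters
```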
28
Abstract
Efficient methods developed with deep learning in the last ten years have provided objectivity and high accuracy in the diagnosis of skin diseases. They also support accurate, cost-effective, and timely treatment. In addition, they provide diagnoses without the need to touch patients, which is very desirable when the disease is contagious or the patient has another contagious disease. On the other hand, it is not possible to run deep networks on resource-constrained devices (e.g., mobile phones); therefore, lightweight network architectures have been proposed in the literature. However, only a few mobile applications have been developed for diagnosing skin diseases from color photographs using lightweight networks, and only a few types of skin diseases have been addressed in those applications. Additionally, they do not perform as well as deep network models, particularly for pattern recognition. Therefore, in this study, a novel model has been constructed using MobileNet, together with a novel loss function. The main contributions of this study are: (i) proposing a novel hybrid loss function; (ii) proposing a modified-MobileNet architecture; (iii) designing and implementing a mobile phone application with the modified-MobileNet and a user-friendly interface. Results indicate that the proposed technique can diagnose skin diseases with 94.76% accuracy.
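Lightweight architectures such as the MobileNet used here get their efficiency mainly from depthwise separable convolutions, which split one dense convolution into a per-channel (depthwise) step plus a 1×1 (pointwise) step. A back-of-the-envelope sketch of the multiply-accumulate savings follows; the layer sizes are illustrative values, not the modified-MobileNet's actual configuration.

```python
def standard_conv_macs(k, c_in, c_out, h, w):
    """Multiply-accumulates for a standard k x k convolution."""
    return k * k * c_in * c_out * h * w

def depthwise_separable_macs(k, c_in, c_out, h, w):
    """Depthwise (k x k per channel) followed by pointwise (1x1) conv."""
    depthwise = k * k * c_in * h * w
    pointwise = c_in * c_out * h * w
    return depthwise + pointwise

# Illustrative layer: 3x3 kernel, 32 -> 64 channels, 56x56 feature map.
k, c_in, c_out, h, w = 3, 32, 64, 56, 56
dense = standard_conv_macs(k, c_in, c_out, h, w)
separable = depthwise_separable_macs(k, c_in, c_out, h, w)
# The speed-up is dense/separable; its inverse is 1/c_out + 1/k**2.
print(round(dense / separable, 2))
```

For this layer the separable form needs roughly 7.9× fewer multiply-accumulates, which is the kind of saving that makes on-device skin-disease diagnosis feasible.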
Affiliation(s)
- Evgin Goceri
- Department of Biomedical Engineering, Engineering Faculty, Akdeniz University, Turkey.
29
Srinivasu PN, SivaSai JG, Ijaz MF, Bhoi AK, Kim W, Kang JJ. Classification of Skin Disease Using Deep Learning Neural Networks with MobileNet V2 and LSTM. Sensors (Basel) 2021; 21:2852. [PMID: 33919583 PMCID: PMC8074091 DOI: 10.3390/s21082852]
Abstract
Deep learning models are efficient at learning the features that assist in understanding complex patterns precisely. This study proposes a computerized process for classifying skin disease through deep-learning-based MobileNet V2 and Long Short-Term Memory (LSTM). The MobileNet V2 model proved efficient, delivering good accuracy while remaining suitable for lightweight computational devices, and the proposed model is efficient at maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), a Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by the Visual Geometry Group (VGG), and a convolutional neural network architecture expanded with a few changes. On the HAM10000 dataset, the proposed method outperformed the other methods with more than 85% accuracy. It recognizes the affected region much faster, with almost 2× fewer computations than the conventional MobileNet model, resulting in minimal computational effort. Furthermore, a mobile application is designed for instant and proper action; it helps the patient and dermatologists identify the type of disease from an image of the affected region at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.
Affiliation(s)
- Parvathaneni Naga Srinivasu
- Department of Computer Science and Engineering, Gitam Institute of Technology, GITAM Deemed to be University, Rushikonda, Visakhapatnam 530045, India;
- Muhammad Fazal Ijaz
- Department of Intelligent Mechatronics Engineering, Sejong University, Seoul 05006, Korea;
- Akash Kumar Bhoi
- Department of Electrical and Electronics Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Majitar 737136, India;
- Wonjoon Kim
- Division of Future Convergence (HCI Science Major), Dongduk Women’s University, Seoul 02748, Korea
- James Jin Kang
- School of Science, Edith Cowan University, Joondalup 6027, Australia
30
Silva D, Sousa A, Costa V. A Comparative Analysis for 2D Object Recognition: A Case Study with Tactode Puzzle-Like Tiles. J Imaging 2021; 7:65. [PMID: 34460515 PMCID: PMC8321360 DOI: 10.3390/jimaging7040065]
Abstract
Object recognition represents the ability of a system to identify objects, humans or animals in images. Within this domain, this work presents a comparative analysis among different classification methods aiming at Tactode tile recognition. The covered methods include: (i) machine learning with HOG and SVM; (ii) deep learning with CNNs such as VGG16, VGG19, ResNet152, MobileNetV2, SSD and YOLOv4; (iii) matching of handcrafted features with SIFT, SURF, BRISK and ORB; and (iv) template matching. A dataset was created to train the learning-based methods (i and ii), while the other methods (iii and iv) used a template dataset. To evaluate the performance of the recognition methods, two test datasets were built: tactode_small and tactode_big, which consisted of 288 and 12,000 images, holding 2784 and 96,000 regions of interest for classification, respectively. SSD and YOLOv4 were the worst methods in their group, whereas ResNet152 and MobileNetV2 proved to be strong recognition methods. SURF, ORB and BRISK demonstrated great recognition performance, while SIFT was the worst of this group. The methods based on template matching attained reasonable recognition results, falling behind most other methods. The top three methods of this study were: VGG16, with an accuracy of 99.96% and 99.95% for tactode_small and tactode_big, respectively; VGG19, with an accuracy of 99.96% and 99.68% for the same datasets; and HOG with SVM, which reached an accuracy of 99.93% for tactode_small and 99.86% for tactode_big while presenting average execution times of 0.323 s and 0.232 s on the respective datasets, making it the fastest method overall. This work demonstrated that VGG16 was the best choice for this case study, since it minimised the misclassifications for both test datasets.
Affiliation(s)
- Daniel Silva
- Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal;
- Department of Engineering, University of Trás-os-Montes e Alto Douro (UTAD), 5000-801 Vila Real, Portugal
- Armando Sousa
- Institute for Systems and Computer Engineering, Technology and Science (INESC TEC), 4200-465 Porto, Portugal;
- Faculty of Engineering, University of Porto (FEUP), 4200-465 Porto, Portugal
- Valter Costa
- Institute of Science and Innovation in Mechanical and Industrial Engineering (INEGI), 4200-465 Porto, Portugal;
31
Kulkarni U, Meena SM, Gurlahosur SV, Bhogar G. Quantization Friendly MobileNet (QF-MobileNet) Architecture for Vision Based Applications on Embedded Platforms. Neural Netw 2021; 136:28-39. [PMID: 33429131 DOI: 10.1016/j.neunet.2020.12.022]
Abstract
Deep Neural Networks (DNNs) have become popular for various applications in the domain of image and computer vision due to their well-established performance. DNN algorithms involve powerful multilevel feature extraction, resulting in an extensive range of parameters and a large memory footprint. However, memory bandwidth requirements, memory footprint, and the associated power consumption must be addressed before DNN models can be deployed on embedded platforms for real-time vision-based applications. We present a DNN model optimized for memory and accuracy for vision-based applications on embedded platforms. In this paper we propose the Quantization Friendly MobileNet (QF-MobileNet) architecture, optimized for inference accuracy and reduced resource utilization. The optimization is obtained by addressing the redundancy and quantization loss of the existing baseline MobileNet architectures. We verify and validate the performance of the QF-MobileNet architecture on the image classification task of the ImageNet dataset, comparing inference accuracy and resource utilization against the baseline MobileNet architectures. The proposed QF-MobileNetV2 float model attained an inference accuracy of 73.36% and its quantized model 69.51%; the QF-MobileNetV3 float model attained 68.75% and its quantized model 67.5%. The proposed QF-MobileNetV2 and QF-MobileNetV3 models save 33% of the inference time against the baseline models. QF-MobileNet also showed optimized resource utilization, with 32% fewer tunable parameters, 30% fewer MAC operations per image, and approximately 5% lower quantization loss compared to the baseline models. The model is ported to an Android application using the TensorFlow API; the application performs inference on native devices, viz. smartphones, tablets and handheld devices. Future work is focused on introducing channel-wise and layer-wise quantization schemes to the proposed model. We intend to explore quantization-aware training of DNN algorithms to achieve optimized resource utilization and inference accuracy.
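The quantization loss this entry measures comes from mapping float weights to low-bit integers. Below is a minimal post-training affine (scale/zero-point) uint8 quantization sketch; the weight values are arbitrary examples, and production frameworks such as TensorFlow Lite additionally support per-channel scales, which the paper's future work points toward.

```python
def quantize(weights, num_bits=8):
    """Map float weights to integers via an affine scale/zero-point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero for constant tensors
    zero_point = round(qmin - lo / scale)
    q = [min(qmax, max(qmin, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from the integer codes."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, s, z = quantize(weights)
recovered = dequantize(q, s, z)
# Quantization loss: worst-case round-trip error over the tensor.
loss = max(abs(a - b) for a, b in zip(weights, recovered))
print(q, round(loss, 4))
```

The round-trip error printed here is the per-tensor analogue of the quantization loss the QF-MobileNet design tries to shrink architecturally.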
Affiliation(s)
- Uday Kulkarni
- School of Computer Science & Engineering, KLE Technological University, Hubballi, India.
- Meena S M
- School of Computer Science & Engineering, KLE Technological University, Hubballi, India.
- Sunil V Gurlahosur
- School of Computer Science & Engineering, KLE Technological University, Hubballi, India.
- Gopal Bhogar
- School of Computer Science & Engineering, KLE Technological University, Hubballi, India.
32
Yu D, Xu Q, Guo H, Zhao C, Lin Y, Li D. An Efficient and Lightweight Convolutional Neural Network for Remote Sensing Image Scene Classification. Sensors (Basel) 2020; 20:E1999. [PMID: 32252483 DOI: 10.3390/s20071999]
Abstract
Classifying remote sensing images is vital for interpreting image content. Present remote sensing image scene classification methods using convolutional neural networks have drawbacks, including excessive parameters and heavy calculation costs; more efficient, lightweight CNNs have fewer parameters and calculations, but their classification performance is generally weaker. We propose a more efficient and lightweight convolutional neural network method to improve classification accuracy with a small training dataset. Inspired by fine-grained visual recognition, this study introduces a bilinear convolutional neural network model for scene classification. First, the lightweight convolutional neural network MobileNetV2 is used to extract deep, abstract image features. Each feature is then transformed into two features with two different convolutional layers. The transformed features are combined by a Hadamard product to obtain an enhanced bilinear feature. Finally, the bilinear feature, after pooling and normalization, is used for classification. Experiments are performed on three widely used datasets: UC Merced, AID, and NWPU-RESISC45. Compared with other state-of-the-art methods, the proposed method has fewer parameters and calculations while achieving higher accuracy. Including feature fusion with bilinear pooling greatly improves performance and accuracy for remote scene classification, and the approach could be applied to any remote sensing image classification task.
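The bilinear-feature step described above (two transforms of the same backbone feature combined elementwise) can be illustrated on plain vectors. This sketch uses toy matrices in place of the learned convolutional layers, followed by the signed-square-root and L2 normalization commonly used in bilinear pooling; those normalization details and all numeric values are illustrative assumptions, not quoted from the paper.

```python
import math

def linear(mat, vec):
    """Toy stand-in for a learned convolutional transform."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

def hadamard(a, b):
    """Elementwise (Hadamard) product of two feature vectors."""
    return [x * y for x, y in zip(a, b)]

def signed_sqrt_l2(vec):
    """Common bilinear-pooling normalization: signed sqrt, then L2 norm."""
    v = [math.copysign(math.sqrt(abs(x)), x) for x in vec]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

feature = [1.0, 2.0, -1.0]          # stand-in for a MobileNetV2 feature
A = [[1, 0, 0], [0, 1, 0]]          # first transform (toy weights)
B = [[0, 0, 1], [1, 1, 0]]          # second transform (toy weights)

bilinear = hadamard(linear(A, feature), linear(B, feature))
normalized = signed_sqrt_l2(bilinear)
print([round(x, 3) for x in normalized])
```

The Hadamard form keeps the bilinear feature the same size as each transformed feature, which is what lets the method stay lightweight compared to a full outer-product bilinear model.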