1. Cao QD, Choe Y. Posthurricane damage assessment using satellite imagery and geolocation features. Risk Anal 2024; 44:1103-1113. [PMID: 37897045 DOI: 10.1111/risa.14244]
Abstract
Gaining timely and reliable situation awareness after hazard events such as a hurricane is crucial to emergency managers and first responders. One effective way to achieve that goal is through damage assessment. Recently, disaster researchers have been utilizing imagery captured through satellites or drones to quantify the number of flooded/damaged buildings. In this paper, we propose a mixed-data approach, which leverages publicly available satellite imagery and geolocation features of the affected area to identify damaged buildings after a hurricane. The method demonstrated significant improvement over performing a similar task using only imagery features, based on a case study of Hurricane Harvey affecting the Greater Houston area in 2017. This result opens the door to a wide range of possibilities to unify the advancement in computer vision algorithms such as convolutional neural networks and traditional methods in damage assessment, for example, using flood depth or bare-earth topology. In this work, a creative choice of the geolocation features was made to provide extra information to the imagery features, but it is up to the users to decide which other features can be included to model the physical behavior of the events, depending on their domain knowledge and the type of disaster. The data set curated in this work is made openly available (DOI: 10.17603/ds2-3cca-f398).
Affiliation(s)
- Quoc Dung Cao: Department of Industrial and Systems Engineering, University of Washington, Seattle, Washington, USA
- Youngjun Choe: Department of Industrial and Systems Engineering, University of Washington, Seattle, Washington, USA
2. Özcan ŞN, Uyar T, Karayeğen G. Comprehensive data analysis of white blood cells with classification and segmentation by using deep learning approaches. Cytometry A 2024. [PMID: 38563259 DOI: 10.1002/cyto.a.24839]
Abstract
Deep learning approaches have frequently been used in the classification and segmentation of human peripheral blood cells. Previous studies typically used more than one dataset, but used them separately; no prior study was found that combines more than two datasets for joint use. In classification, five types of white blood cells were identified by using a mixture of four different datasets. In segmentation, four types of white blood cells were determined, and three different neural networks, including CNN (Convolutional Neural Network), UNet and SegNet, were applied. The classification results of the presented study were compared with those of related studies. The balanced accuracy was 98.03%, and the test accuracy on the train-independent dataset was determined to be 97.27%. For segmentation, accuracy rates of 98.9% for the train-dependent dataset and 92.82% for the train-independent dataset were obtained with the proposed CNN in both nucleus and cytoplasm detection. In the presented study, the proposed method showed that it could detect white blood cells from a train-independent dataset with high accuracy. Additionally, it is promising as a diagnostic tool that can be used in the clinical field, with successful results in classification and segmentation.
Affiliation(s)
- Şeyma Nur Özcan: Biomedical Engineering Department, Başkent University, Ankara, Turkey
- Tansel Uyar: Biomedical Engineering Department, Başkent University, Ankara, Turkey
- Gökay Karayeğen: Biomedical Equipment Technology, Vocational School of Technical Sciences, Başkent University, Ankara, Turkey
3. Ketawala G, Reiter CM, Fromme P, Botha S. The Pixel Anomaly Detection Tool: a user-friendly GUI for classifying detector frames using machine-learning approaches. J Appl Crystallogr 2024; 57:529-538. [PMID: 38596720 PMCID: PMC11001403 DOI: 10.1107/s1600576724000116]
Abstract
Data collection at X-ray free electron lasers has particular experimental challenges, such as continuous sample delivery or the use of novel ultrafast high-dynamic-range gain-switching X-ray detectors. This can result in a multitude of data artefacts, which can be detrimental to accurately determining structure-factor amplitudes for serial crystallography or single-particle imaging experiments. Here, a new data-classification tool is reported that offers a variety of machine-learning algorithms to sort data trained either on manual data sorting by the user or by profile fitting the intensity distribution on the detector based on the experiment. This is integrated into an easy-to-use graphical user interface, specifically designed to support the detectors, file formats and software available at most X-ray free electron laser facilities. The highly modular design makes the tool easily expandable to comply with other X-ray sources and detectors, and the supervised learning approach enables even the novice user to sort data containing unwanted artefacts or perform routine data-analysis tasks such as hit finding during an experiment, without needing to write code.
Affiliation(s)
- Gihan Ketawala: Biodesign Center for Applied Structural Discovery, Arizona State University, Tempe, AZ 85287-5001, USA; School of Molecular Sciences, Arizona State University, Tempe, AZ 85287-1604, USA
- Caitlin M. Reiter: NSF BioXFEL Science and Technology Center Summer Internship Program, NY 14203, USA
- Petra Fromme: Biodesign Center for Applied Structural Discovery, Arizona State University, Tempe, AZ 85287-5001, USA; School of Molecular Sciences, Arizona State University, Tempe, AZ 85287-1604, USA
- Sabine Botha: Biodesign Center for Applied Structural Discovery, Arizona State University, Tempe, AZ 85287-5001, USA; Department of Physics, Arizona State University, Tempe, AZ 85287-1504, USA
4. Lee JS, Wu WK. Breast Tumor Tissue Image Classification Using Single-Task Meta Learning with Auxiliary Network. Cancers (Basel) 2024; 16:1362. [PMID: 38611040 PMCID: PMC11010930 DOI: 10.3390/cancers16071362]
Abstract
Breast cancer has a high mortality rate among cancers. If the type of breast tumor can be correctly diagnosed at an early stage, the survival rate of patients will be greatly improved. Considering actual clinical needs, a breast pathology image classification model must be able to make correct classifications even when facing image data with different characteristics. The existing convolutional neural network (CNN)-based models for the classification of breast tumor pathology images lack the requisite generalization capability to maintain high accuracy when confronted with pathology images of varied characteristics. Consequently, this study introduces a new classification model, STMLAN (Single-Task Meta Learning with Auxiliary Network), which integrates Meta Learning and an auxiliary network. Single-Task Meta Learning was proposed to endow the model with generalization ability, and the auxiliary network was used to enhance the feature characteristics of breast pathology images. The experimental results demonstrate that the proposed STMLAN model improves accuracy by at least 1.85% in challenging multi-classification tasks compared to existing methods. Furthermore, the Silhouette Score corresponding to the features learned by the model increased by 31.85%, reflecting that the proposed model can learn more discriminative features and that the generalization ability of the overall model is also improved.
Affiliation(s)
- Jiann-Shu Lee: Department of Computer Science and Information Engineering, National University of Tainan, Tainan 700, Taiwan
5. Lyu J, Zou R, Wan Q, Xi W, Yang Q, Kodagoda S, Wang S. Cross-and-Diagonal Networks: An Indirect Self-Attention Mechanism for Image Classification. Sensors (Basel) 2024; 24:2055. [PMID: 38610267 PMCID: PMC11014102 DOI: 10.3390/s24072055]
Abstract
In recent years, computer vision has witnessed remarkable advancements in image classification, specifically in the domains of fully convolutional neural networks (FCNs) and self-attention mechanisms. Nevertheless, both approaches exhibit certain limitations. FCNs tend to prioritize local information, potentially overlooking crucial global contexts, whereas self-attention mechanisms are computationally intensive despite their adaptability. To surmount these challenges, this paper proposes cross-and-diagonal networks (CDNet), an innovative network architecture that adeptly captures global information in images while preserving local details in a more computationally efficient manner. CDNet achieves this by establishing long-range relationships between pixels within an image, enabling the indirect acquisition of contextual information. This indirect self-attention mechanism significantly enhances the network's capacity. In CDNet, a new attention mechanism named "cross and diagonal attention" is proposed. This mechanism adopts an indirect approach by integrating two distinct components, cross attention and diagonal attention. By computing attention in different directions, specifically vertical and diagonal, CDNet effectively establishes remote dependencies among pixels, resulting in improved performance in image classification tasks. Experimental results highlight several advantages of CDNet. Firstly, it introduces an indirect self-attention mechanism that can be effortlessly integrated as a module into any convolutional neural network (CNN). Additionally, the computational cost of the self-attention mechanism has been effectively reduced, resulting in improved overall computational efficiency. Lastly, CDNet attains state-of-the-art performance among similar image classification networks on three benchmark datasets. In essence, CDNet addresses the constraints of conventional approaches and provides an efficient and effective solution for capturing global context in image classification tasks.
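The directional attention idea can be sketched compactly. Below is a minimal PyTorch illustration of attention restricted to each pixel's column (vertical) and to a wrapped diagonal obtained by shifting rows; the module names, the shift-based diagonal trick, and the residual fusion are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class AxisAttention(nn.Module):
    """Self-attention along one spatial axis (here: the height axis)."""
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels // 8, 1)
        self.k = nn.Conv2d(channels, channels // 8, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):                        # x: (B, C, H, W)
        b, c, h, w = x.shape
        # Treat every column as an independent sequence of length H.
        q = self.q(x).permute(0, 3, 2, 1).reshape(b * w, h, -1)
        k = self.k(x).permute(0, 3, 2, 1).reshape(b * w, h, -1)
        v = self.v(x).permute(0, 3, 2, 1).reshape(b * w, h, -1)
        attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
        return (attn @ v).reshape(b, w, h, c).permute(0, 3, 2, 1)

class CrossDiagonalBlock(nn.Module):
    """Vertical attention plus diagonal attention (diagonals become columns after row shifts)."""
    def __init__(self, channels: int):
        super().__init__()
        self.vertical = AxisAttention(channels)
        self.diagonal = AxisAttention(channels)

    @staticmethod
    def _shift_rows(x, sign):
        # Roll row i by sign*i so wrapped diagonals line up as columns.
        rows = [torch.roll(x[:, :, i, :], shifts=sign * i, dims=-1)
                for i in range(x.shape[2])]
        return torch.stack(rows, dim=2)

    def forward(self, x):
        v = self.vertical(x)
        d = self._shift_rows(self.diagonal(self._shift_rows(x, -1)), +1)
        return x + v + d                          # residual fusion of both directions

feats = torch.randn(2, 32, 16, 16)
print(CrossDiagonalBlock(32)(feats).shape)        # torch.Size([2, 32, 16, 16])
```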
Affiliation(s)
- Jiahang Lyu: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
- Rongxin Zou: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
- Qin Wan: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
- Wang Xi: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
- Qinglin Yang: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China
- Sarath Kodagoda: Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia
- Shifeng Wang: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; Zhongshan Institute of Changchun University of Science and Technology, Zhongshan 528400, China
6. Zeng Z, Giap BD, Kahana E, Lustre J, Mahmoud O, Mian SI, Tannen B, Nallasamy N. Evaluation of Methods for Detection and Semantic Segmentation of the Anterior Capsulotomy in Cataract Surgery Video. Clin Ophthalmol 2024; 18:647-657. [PMID: 38476358 PMCID: PMC10929120 DOI: 10.2147/opth.s453073]
Abstract
Background The capsulorhexis is one of the most important and challenging maneuvers in cataract surgery. Automated analysis of the anterior capsulotomy could aid surgical training through the provision of objective feedback and guidance to trainees. Purpose To develop and evaluate a deep learning-based system for the automated identification and semantic segmentation of the anterior capsulotomy in cataract surgery video. Methods In this study, we established a BigCat-Capsulotomy dataset comprising 1556 video frames extracted from 190 recorded cataract surgery videos for developing and validating the capsulotomy recognition system. The proposed system involves three primary stages: video preprocessing, capsulotomy video frame classification, and capsulotomy segmentation. To thoroughly evaluate its efficacy, we examined the performance of a total of eight deep learning-based classification models and eleven segmentation models, assessing both accuracy and time consumption. Furthermore, we delved into the factors influencing system performance by deploying it across various surgical phases. Results The ResNet-152 model employed in the classification step of the proposed capsulotomy recognition system attained strong performance with an overall Dice coefficient of 92.21%. Similarly, the UNet model with the DenseNet-169 backbone emerged as the most effective segmentation model among those investigated, achieving an overall Dice coefficient of 92.12%. Moreover, the time consumption of the system was low at 103.37 milliseconds per frame, facilitating its application in real-time scenarios. Phase-wise analysis indicated that the Phacoemulsification phase (nuclear disassembly) was the most challenging to segment (Dice coefficient of 86.02%). Conclusion The experimental results showed that the proposed system is highly effective in intraoperative capsulotomy recognition during cataract surgery and demonstrates both high accuracy and real-time capabilities. This system holds significant potential for applications in surgical performance analysis, education, and intraoperative guidance systems.
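The Dice coefficient used throughout these results measures overlap between a predicted mask and a reference mask. A minimal sketch, assuming binary numpy masks (the toy masks below are illustrative):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2*|A & B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy example: two partially overlapping masks.
a = np.zeros((8, 8)); a[2:6, 2:6] = 1
b = np.zeros((8, 8)); b[4:8, 4:8] = 1
print(dice_coefficient(a, b))   # ~0.25 (4 overlapping pixels, 16 + 16 total)
```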
Affiliation(s)
- Zixue Zeng: School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Binh Duong Giap: Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
- Ethan Kahana: Department of Computer Science, University of Michigan, Ann Arbor, MI, USA
- Ossama Mahmoud: School of Medicine, Wayne State University, Detroit, MI, USA
- Shahzad I Mian: Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
- Bradford Tannen: Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA
- Nambi Nallasamy: Kellogg Eye Center, Department of Ophthalmology and Visual Sciences, University of Michigan, Ann Arbor, MI, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
7. Rusinovich Y, Rusinovich V, Buhayenka A, Liashko V, Sabanov A, Holstein DJF, Aldmour S, Doss M, Branzan D. Classification of anatomic patterns of peripheral artery disease with automated machine learning (AutoML). Vascular 2024:17085381241236571. [PMID: 38404043 DOI: 10.1177/17085381241236571]
Abstract
AIM The aim of this study was to investigate the potential of novel automated machine learning (AutoML) in vascular medicine by developing a discriminative artificial intelligence (AI) model for the classification of anatomical patterns of peripheral artery disease (PAD). MATERIAL AND METHODS Random open-source angiograms of lower limbs were collected using a web-indexed search. An experienced researcher in vascular medicine labelled the angiograms according to the most applicable grade of femoropopliteal disease in the Global Limb Anatomic Staging System (GLASS). An AutoML model was trained using the Vertex AI (Google Cloud) platform to classify the angiograms according to the GLASS grade with a multi-label algorithm. Following deployment, we conducted a test using 25 random angiograms (five from each GLASS grade). Model tuning through incremental training by introducing new angiograms was executed to the limit of the allocated quota following the initial evaluation to determine its effect on the software's performance. RESULTS We collected 323 angiograms to create the AutoML model. Among these, 80 angiograms were labelled as grade 0 of femoropopliteal disease in GLASS, 114 as grade 1, 34 as grade 2, 25 as grade 3 and 70 as grade 4. After 4.5 h of training, the AI model was deployed. The AI self-assessed average precision was 0.77 (0 is minimal and 1 is maximal). During the testing phase, the AI model successfully determined the GLASS grade in 100% of the cases. The agreement with the researcher was almost perfect with the number of observed agreements being 22 (88%), Kappa = 0.85 (95% CI 0.69-1.0). The best results were achieved in predicting GLASS grade 0 and grade 4 (initial precision: 0.76 and 0.84). However, the AI model exhibited poorer results in classifying GLASS grade 3 (initial precision: 0.2) compared to other grades. Disagreements between the AI and the researcher were associated with the low resolution of the test images. Incremental training expanded the initial dataset by 23% to a total of 417 images, which improved the model's average precision by 11% to 0.86. CONCLUSION After a brief training period with a limited dataset, AutoML has demonstrated its potential in identifying and classifying the anatomical patterns of PAD, operating unhindered by the factors that can affect human analysts, such as fatigue or lack of experience. This technology bears the potential to revolutionize outcome prediction and standardize evidence-based revascularization strategies for patients with PAD, leveraging its adaptability and ability to continuously improve with additional data. The pursuit of further research in AutoML within the field of vascular medicine is both promising and warranted. However, it necessitates additional financial support to realize its full potential.
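The agreement statistics quoted above (22/25 observed agreements, Kappa = 0.85) can be reproduced with standard tooling. The grade vectors below are hypothetical, constructed only to match those reported figures; scikit-learn's cohen_kappa_score does the computation.

```python
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Hypothetical GLASS grades (0-4) for the 25 test angiograms:
model_grades      = [0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3,
                     3, 3, 4, 3, 3, 4, 4, 4, 4, 4]
researcher_grades = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2,
                     3, 3, 3, 3, 3, 4, 4, 4, 4, 4]

agree = sum(m == r for m, r in zip(model_grades, researcher_grades))
print("observed agreement:", agree, "/ 25")               # 22 / 25
print("Cohen's kappa:",
      round(cohen_kappa_score(model_grades, researcher_grades), 2))  # 0.85
print(confusion_matrix(researcher_grades, model_grades))
```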
Affiliation(s)
- Yury Rusinovich: Department of Vascular Surgery, University Hospital Leipzig, Leipzig, Germany
- Volha Rusinovich: Institute of Hygiene and Environmental Medicine, University Hospital Leipzig, Germany
- Vitalii Liashko: Department of Vascular Surgery, Charité University Hospital, Berlin, Germany
- Arsen Sabanov: Department of Vascular Surgery, University Hospital Leipzig, Leipzig, Germany
- David J F Holstein: Department of Vascular Surgery, University Hospital Leipzig, Leipzig, Germany
- Samer Aldmour: Department of Vascular Surgery, University Hospital Leipzig, Leipzig, Germany
- Markus Doss: Department of Vascular Surgery, University Hospital Leipzig, Leipzig, Germany
- Daniela Branzan: Department of Vascular Surgery, University Hospital Leipzig, Leipzig, Germany
8. Ndu H, Sheikh-Akbari A, Deng J, Mporas I. HyperVein: A Hyperspectral Image Dataset for Human Vein Detection. Sensors (Basel) 2024; 24:1118. [PMID: 38400276 PMCID: PMC10891899 DOI: 10.3390/s24041118]
Abstract
HyperSpectral Imaging (HSI) plays a pivotal role in various fields, including medical diagnostics, where precise human vein detection is crucial. HyperSpectral (HS) image data are very large and can cause computational complexities. Dimensionality reduction techniques are often employed to streamline HS image data processing. This paper presents a HS image dataset encompassing left- and right-hand images captured from 100 subjects with varying skin tones. The dataset was annotated using anatomical data to represent vein and non-vein areas within the images. This dataset is utilised to explore the effectiveness of dimensionality reduction techniques, namely Principal Component Analysis (PCA), Folded PCA (FPCA), and Ward's Linkage Strategy using Mutual Information (WaLuMI), for vein detection. To generate experimental results, the HS image dataset was divided into training and test datasets. The optimal parameters for each of the dimensionality reduction techniques in conjunction with Support Vector Machine (SVM) binary classification were determined using the training dataset. The performance of the three dimensionality reduction-based vein detection methods was then assessed and compared using the test image dataset. Results show that the FPCA-based method outperforms the other two methods in terms of accuracy. For visualization purposes, the classification prediction image for each technique is post-processed using morphological operators, and the results show the significant potential of HS imaging in vein detection.
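A minimal sketch of the pipeline described above: a dimensionality reduction step followed by SVM binary classification of individual pixel spectra. The data here are synthetic, PCA stands in for whichever of the three techniques is selected, and the band count and number of components are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_pixels, n_bands = 2000, 128                  # e.g. 128 spectral bands per pixel
X = rng.normal(size=(n_pixels, n_bands))
y = (X[:, :10].mean(axis=1) > 0).astype(int)   # synthetic vein / non-vein labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = make_pipeline(PCA(n_components=10), SVC(kernel="rbf"))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```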
Affiliation(s)
- Henry Ndu: School of Built Environment, Engineering and Computing, Leeds Beckett University, Leeds LS1 3HE, UK
- Akbar Sheikh-Akbari: School of Built Environment, Engineering and Computing, Leeds Beckett University, Leeds LS1 3HE, UK
- Jiamei Deng: School of Built Environment, Engineering and Computing, Leeds Beckett University, Leeds LS1 3HE, UK
- Iosif Mporas: Department of Engineering and Technology, School of Physics, Engineering & Computer Science, University of Hertfordshire, Hatfield AL10 9AB, UK
9. Atcı ŞY, Güneş A, Zontul M, Arslan Z. Identifying Diabetic Retinopathy in the Human Eye: A Hybrid Approach Based on a Computer-Aided Diagnosis System Combined with Deep Learning. Tomography 2024; 10:215-230. [PMID: 38393285 PMCID: PMC10892594 DOI: 10.3390/tomography10020017]
Abstract
Diagnosing and screening for diabetic retinopathy is a well-known issue in the biomedical field. A component of computer-aided diagnosis that has advanced significantly over the past few years, as a result of the development and effectiveness of deep learning, is the use of medical imagery from a patient's eye to identify the damage caused to blood vessels. Issues with unbalanced datasets, incorrect annotations, a lack of sample images, and improper performance evaluation measures have negatively impacted the performance of deep learning models. Using three benchmark datasets of diabetic retinopathy, we conducted a detailed comparative study of various state-of-the-art approaches to address the effect caused by class imbalance, achieving precision scores of 93%, 89%, 81%, 76%, and 96% for the normal, mild, moderate, severe, and DR phases, respectively. The analyses of the hybrid modeling, including CNN analysis and SHAP model derivation results, are compared at the end of the paper, and ideal hybrid modeling strategies for deep learning classification models for automated DR detection are identified.
Affiliation(s)
- Şükran Yaman Atcı: Department of Computer Engineering, İstanbul Aydın University, Istanbul 34295, Turkey
- Ali Güneş: Department of Computer Engineering, İstanbul Aydın University, Istanbul 34295, Turkey
- Metin Zontul: Department of Computer Engineering, Sivas University of Science and Technology, Sivas 58140, Turkey
- Zafer Arslan: Department of Computer Engineering, İstanbul Aydın University, Istanbul 34295, Turkey
10. Wang R, Qiu Y, Wang T, Wang M, Jin S, Cong F, Zhang Y, Xu H. MIHIC: a multiplex IHC histopathological image classification dataset for lung cancer immune microenvironment quantification. Front Immunol 2024; 15:1334348. [PMID: 38370413 PMCID: PMC10869447 DOI: 10.3389/fimmu.2024.1334348]
Abstract
Background Immunohistochemistry (IHC) is a widely used laboratory technique for cancer diagnosis, which selectively binds specific antibodies to target proteins in tissue samples and then makes the bound proteins visible through chemical staining. Deep learning approaches have the potential to be employed in quantifying the tumor immune micro-environment (TIME) in digitized IHC histological slides. However, there is a lack of publicly available IHC datasets explicitly collected for in-depth TIME analysis. Method In this paper, a Multiplex IHC Histopathological Image Classification (MIHIC) dataset is created based on manual annotations by pathologists, which is publicly available for exploring deep learning models to quantify variables associated with the TIME in lung cancer. The MIHIC dataset comprises a total of 309,698 multiplex IHC stained histological image patches, encompassing seven distinct tissue types: Alveoli, Immune cells, Necrosis, Stroma, Tumor, Other and Background. Using the MIHIC dataset, we conduct a series of experiments that utilize both convolutional neural networks (CNNs) and transformer models to benchmark IHC stained histological image classification. We finally quantify lung cancer immune microenvironment variables by using the top-performing model on tissue microarray (TMA) cores, which are subsequently used to predict patients' survival outcomes. Result Experiments show that transformer models tend to provide slightly better performance than CNN models in histological image classification, although both types of models reach the same highest accuracy of 0.811 on the MIHIC testing dataset. The automatically quantified TIME variables, which reflect proportions of immune cells over stroma and tumor over tissue core, show prognostic value for the overall survival of lung cancer patients. Conclusion To the best of our knowledge, MIHIC is the first publicly available lung cancer IHC histopathological dataset that includes images with 12 different IHC stains, meticulously annotated by multiple pathologists across 7 distinct categories. This dataset holds significant potential for researchers to explore novel techniques for quantifying the TIME and advancing our understanding of the interactions between the immune system and tumors.
Affiliation(s)
- Ranran Wang: Affiliated Cancer Hospital, Dalian University of Technology, Dalian, China; School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China
- Yusong Qiu: Department of Pathology, Liaoning Cancer Hospital and Institute, Shenyang, China
- Tong Wang: School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China
- Mingkang Wang: School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China
- Shan Jin: School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China
- Fengyu Cong: Affiliated Cancer Hospital, Dalian University of Technology, Dalian, China; School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China; Key Laboratory of Integrated Circuit and Biomedical Electronic System, Dalian University of Technology, Dalian, Liaoning, China; Faculty of Information Technology, University of Jyvaskyla, Jyvaskyla, Finland
- Yong Zhang: Department of Pathology, Liaoning Cancer Hospital and Institute, Shenyang, China
- Hongming Xu: Affiliated Cancer Hospital, Dalian University of Technology, Dalian, China; School of Biomedical Engineering, Faculty of Medicine, Dalian University of Technology, Dalian, China; Key Laboratory of Integrated Circuit and Biomedical Electronic System, Dalian University of Technology, Dalian, Liaoning, China
11. Amin M, Nakamura K, Ontaneda D. Differentiating multiple sclerosis from non-specific white matter changes using a convolutional neural network image classification model. Mult Scler Relat Disord 2024; 82:105420. [PMID: 38183693 DOI: 10.1016/j.msard.2023.105420]
Abstract
BACKGROUND The diagnosis of multiple sclerosis (MS) relies heavily on neuroimaging with magnetic resonance imaging (MRI) and exclusion of mimics. This can be a challenging task due to radiological overlap in several disorders and may require ancillary testing or longitudinal follow up. One of the most common radiological MS mimickers is non-specific white matter disease (NSWMD). We aimed to develop and evaluate models leveraging machine learning algorithms to help distinguish MS and NSWMD. METHODS All adult patients who underwent brain MRI using a demyelinating protocol with available electronic medical records between 2015 and 2019 at Cleveland Clinic affiliated facilities were included. Diagnoses of MS and NSWMD were assessed from clinical documentation. Those with a diagnosis of MS and NSWMD were matched using total T2 lesion volume (T2LV) and used to train models with logistic regression and convolutional neural networks (CNN). Performance metrics are reported for each model. RESULTS A total of 250 NSWMD MRI scans were identified, and 250 unique MS MRI scans were matched on T2LV. A cross-validated logistic regression model used 20 variables (including spinal cord area, regional volumes, and fractions) to predict MS versus NSWMD with 68.0% accuracy, while the CNN model classified MS versus NSWMD in two independent validation and testing cohorts with 77% and 78% accuracy on average. CONCLUSION Automated methods can be used to differentiate MS from NSWMD. These methods can supplement currently available diagnostic tools for patients being evaluated for MS.
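A minimal sketch of the classical arm of this study: cross-validated logistic regression over roughly 20 MRI-derived variables. The feature matrix and labels are synthetic placeholders; the variable count and cross-validation follow the abstract, everything else is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 20))     # 500 scans x 20 variables (volumes, fractions, cord area)
y = rng.integers(0, 2, size=500)   # 1 = MS, 0 = NSWMD (synthetic labels)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"5-fold CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```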
Affiliation(s)
- Moein Amin: Mellen Center for Multiple Sclerosis Treatment and Research, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Kunio Nakamura: Department of Biomedical Engineering, Cleveland Clinic, Cleveland, Ohio, USA
- Daniel Ontaneda: Mellen Center for Multiple Sclerosis Treatment and Research, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
12. Blair JD, Gaynor KM, Palmer MS, Marshall KE. A gentle introduction to computer vision-based specimen classification in ecological datasets. J Anim Ecol 2024; 93:147-158. [PMID: 38230868 DOI: 10.1111/1365-2656.14042]
Abstract
Classifying specimens is a critical component of ecological research, biodiversity monitoring and conservation. However, manual classification can be prohibitively time-consuming and expensive, limiting how much data a project can afford to process. Computer vision, a form of machine learning, can help overcome these problems by rapidly, automatically and accurately classifying images of specimens. Given the diversity of animal species and the contexts in which images are captured, there is no universal classifier for all species and use cases. As such, ecologists often need to train their own models. While numerous software programs exist to support this process, ecologists need a fundamental understanding of how computer vision works to select appropriate model workflows based on their specific use case, data types, computing resources and desired performance capabilities. Ecologists may also face characteristic quirks of ecological datasets, such as long-tail distributions, 'unknown' species, similarity between species and polymorphism within species, which impact the efficacy of computer vision. Despite growing interest in computer vision for ecology, there are few resources available to help ecologists face the challenges they are likely to encounter. Here, we present a gentle introduction to species classification using computer vision. In this manuscript and the associated GitHub repository, we demonstrate how to prepare training data, run basic model training procedures, and evaluate and select models. Throughout, we explore specific considerations ecologists should make when training classification models, such as data domains, feature extractors and class imbalances. With these basics, ecologists can adjust their workflows to achieve research goals and/or account for uncertainty in downstream analysis. Our goal is to provide guidance for ecologists getting started in or improving their use of machine learning for visual classification tasks.
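One of the dataset quirks mentioned above, the long-tail class distribution, is commonly handled with inverse-frequency class weights. A minimal sketch with hypothetical camera-trap labels:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical labels with one abundant species and a long tail of rare ones.
labels = np.array(["deer"] * 900 + ["coyote"] * 80 + ["badger"] * 20)
classes = np.unique(labels)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=labels)
for c, w in zip(classes, weights):
    print(f"{c:>7}: weight {w:.2f}")   # rare classes receive proportionally larger weights
# These weights can then be passed to a weighted loss (e.g. weighted
# cross-entropy) so rare species still contribute to training.
```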
Affiliation(s)
- Jarrett D Blair: Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
- Kaitlyn M Gaynor: Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada; Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada
- Meredith S Palmer: Department of Ecology & Evolutionary Biology, Princeton University, Princeton, New Jersey, USA
- Katie E Marshall: Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
13. Yang J, Chen Y, Yu J. Convolutional neural network based on the fusion of image classification and segmentation module for weed detection in alfalfa. Pest Manag Sci 2024. [PMID: 38299763 DOI: 10.1002/ps.7979]
Abstract
BACKGROUND Accurate and reliable weed detection in real time is essential for realizing autonomous precision herbicide application. The objective of this research was to propose a novel neural network architecture to improve the detection accuracy for broadleaf weeds growing in alfalfa. RESULTS A novel neural network, ResNet-101-segmentation, was developed by fusing an image classification and segmentation module with a backbone selected from ResNet-101. Compared with existing neural networks (AlexNet, GoogLeNet, VGG16, and ResNet-101), ResNet-101-segmentation improved the detection of Carolina geranium, catchweed bedstraw, mugwort and speedwell from 78.27% to 98.17%, from 79.49% to 98.28%, from 67.03% to 96.23%, and from 75.95% to 98.06%, respectively. The novel network exhibited high per-class values in the confusion matrix (>90%) when trained with sufficient data sets. CONCLUSION ResNet-101-segmentation demonstrated excellent performance compared with existing models (AlexNet, GoogLeNet, VGG16, and ResNet-101) for detecting broadleaf weeds growing in alfalfa. This approach offers a promising solution to increase the accuracy of weed detection, especially in cases where weeds and crops have similar plant morphology. © 2024 Society of Chemical Industry.
Affiliation(s)
- Jie Yang: College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing, China; Peking University Institute of Advanced Agricultural Sciences/Shandong Laboratory of Advanced Agricultural Sciences at Weifang, Weifang, China
- Yong Chen: College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing, China
- Jialin Yu: Peking University Institute of Advanced Agricultural Sciences/Shandong Laboratory of Advanced Agricultural Sciences at Weifang, Weifang, China
14. Lu Y, Zhang L, Wang J, Bian L, Ding Z, Yang C. Hyperspectral upgrade solution for biomicroscope combined with Transformer network to classify infectious bacteria. J Biophotonics 2024:e202300484. [PMID: 38297446 DOI: 10.1002/jbio.202300484]
Abstract
Infectious diseases caused by bacterial pathogens pose a significant public health threat, emphasizing the need for swift and accurate bacterial species detection methods. Hyperspectral microscopic imaging (HMI) offers nondestructive, rapid, and data-rich advantages, making it a promising tool for microbial detection. In this research, we present a highly compatible and cost-effective approach to extend a standard biomicroscope system into a hyperspectral biomicroscope using a prism-grating-prism configuration. Using this prototype, we generate 600 hyperspectral data cubes for Listeria, Bacillus typhi, Bacillus pestis, and Bacillus anthracis. Additionally, we propose a Transformer-based classification network that achieves 99.44% accuracy in classifying these infectious pathogens, outperforming traditional methods. Our results suggest that the successful combination of HMI and the optimized Transformer-based classification network highlights the potential for rapid and precise detection of infectious disease pathogens.
Affiliation(s)
- You Lu: Engineering Research Center of Semiconductor Power Device Reliability, Ministry of Education, Guizhou University, Guiyang, China
- Lan Zhang: Engineering Research Center of Semiconductor Power Device Reliability, Ministry of Education, Guizhou University, Guiyang, China
- Jihong Wang: Engineering Research Center of Semiconductor Power Device Reliability, Ministry of Education, Guizhou University, Guiyang, China
- Lifeng Bian: Frontier Institute of Chip and System, Fudan University, Shanghai, China
- Zhao Ding: Engineering Research Center of Semiconductor Power Device Reliability, Ministry of Education, Guizhou University, Guiyang, China
- Chen Yang: Engineering Research Center of Semiconductor Power Device Reliability, Ministry of Education, Guizhou University, Guiyang, China
15. Ahmad M, Zhang L, Chowdhury MEH. FPGA Implementation of Complex-Valued Neural Network for Polar-Represented Image Classification. Sensors (Basel) 2024; 24:897. [PMID: 38339614 PMCID: PMC10857050 DOI: 10.3390/s24030897]
Abstract
This research explores a novel approach to image classification by deploying a complex-valued neural network (CVNN) on a Field-Programmable Gate Array (FPGA), specifically for classifying 2D images transformed into polar form. The aim is to address the limitations of existing neural network models in terms of energy and resource efficiency by exploring the potential of FPGA-based hardware acceleration in conjunction with advanced neural network architectures like CVNNs. The methodological innovation of this research lies in the Cartesian-to-polar transformation of 2D images, effectively reducing the input data volume required for neural network processing. Subsequent efforts focused on constructing a CVNN model optimized for FPGA implementation, emphasizing the enhancement of computational efficiency and overall performance. The experimental findings provide empirical evidence supporting the efficacy of the image classification system developed in this study. One of the developed models, CVNN_128, achieves an accuracy of 88.3% with an inference time of just 1.6 ms and a power consumption of 4.66 mW for the classification of the MNIST test dataset, which consists of 10,000 frames. While there is a slight concession in accuracy compared to recent FPGA implementations that achieve 94.43%, our model significantly excels in classification speed and power efficiency, surpassing existing models by more than a factor of 100. In conclusion, this paper demonstrates the substantial advantages of the FPGA implementation of CVNNs for image classification tasks, particularly in scenarios where speed, resource use, and power consumption are critical.
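The Cartesian-to-polar preprocessing can be illustrated directly: the sketch below resamples an image onto a (radius, angle) grid, roughly halving the number of input values fed to the network. The grid size and interpolation order are assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def to_polar(img: np.ndarray, n_r: int = 14, n_theta: int = 28) -> np.ndarray:
    """Resample a 2D image onto an (n_r, n_theta) polar grid around its centre."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radii = np.linspace(0, min(cy, cx), n_r)
    thetas = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    r, t = np.meshgrid(radii, thetas, indexing="ij")
    rows, cols = cy + r * np.sin(t), cx + r * np.cos(t)
    return map_coordinates(img, [rows, cols], order=1)

img = np.random.rand(28, 28)          # stand-in for one MNIST-sized frame
polar = to_polar(img)
print(img.size, "->", polar.size)     # 784 -> 392 input values
```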
Affiliation(s)
- Maruf Ahmad: Faculty of Engineering and Applied Science, University of Regina, Regina, SK S4S 0A2, Canada
- Lei Zhang: Faculty of Engineering and Applied Science, University of Regina, Regina, SK S4S 0A2, Canada
16. Zeng Q, Sun J, Wang S. DIC-Transformer: interpretation of plant disease classification results using image caption generation technology. Front Plant Sci 2024; 14:1273029. [PMID: 38333041 PMCID: PMC10850568 DOI: 10.3389/fpls.2023.1273029]
Abstract
Disease image classification systems play a crucial role in identifying disease categories in the field of agricultural diseases. However, current plant disease image classification methods can only predict the disease category and do not offer explanations for the characteristics of the predicted disease images. To address this limitation, this paper employs image description generation technology to produce distinct descriptions for different plant disease categories. A two-stage model called DIC-Transformer, which encompasses three tasks (detection, interpretation, and classification), is proposed. In the first stage, Faster R-CNN, with the Swin Transformer as the backbone, is utilized to detect the diseased area and generate the feature vector of the diseased image. In the second stage, the model utilizes a Transformer to generate image captions. It then generates the image feature vector, which is weighted by text features, to improve the performance of image classification in the subsequent classification decoder. Additionally, a dataset containing text and visualizations for agricultural diseases (ADCG-18) was compiled. The dataset contains images of 18 diseases and descriptive information about their characteristics. Using ADCG-18, the DIC-Transformer was compared to 11 existing classical caption generation methods and 10 image classification models. The caption evaluation indicators include BLEU-1 to BLEU-4, CIDEr-D, and ROUGE; the DIC-Transformer scored 0.756 (BLEU-1), 450.51 (CIDEr-D), and 0.721 (ROUGE), which is 0.01, 29.55, and 0.014 higher than the best-performing comparison model, Fc. The classification evaluation metrics include accuracy, recall, and F1 score, with accuracy at 0.854, recall at 0.854, and F1 score at 0.853, which is 0.024, 0.078, and 0.075 higher than the best-performing comparison model, MobileNetV2. The results indicate that the DIC-Transformer outperforms the other comparison models in both classification and caption generation.
Affiliation(s)
- Shansong Wang: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
17. Aguerchi K, Jabrane Y, Habba M, El Hassani AH. A CNN Hyperparameters Optimization Based on Particle Swarm Optimization for Mammography Breast Cancer Classification. J Imaging 2024; 10:30. [PMID: 38392079 PMCID: PMC10889268 DOI: 10.3390/jimaging10020030]
Abstract
Breast cancer is considered one of the most common types of cancer among females in the world, with a high mortality rate. Medical imaging is still one of the most reliable tools to detect breast cancer. Unfortunately, manual image detection takes much time. This paper proposes a new deep learning method based on Convolutional Neural Networks (CNNs). Convolutional Neural Networks are widely used for image classification; however, determining accurate hyperparameters and architectures is still a challenging task. In this work, a highly accurate CNN model to detect breast cancer by mammography was developed. The proposed method is based on the Particle Swarm Optimization (PSO) algorithm, used to search for suitable hyperparameters and the architecture for the CNN model. The CNN model using PSO achieved success rates of 98.23% and 97.98% on the DDSM and MIAS datasets, respectively. The experimental results proved that the proposed CNN model gave the best accuracy values in comparison with other studies in the field. As a result, CNN models for mammography classification can now be created automatically. The proposed method can be considered a powerful technique for breast cancer prediction.
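A compact sketch of PSO-driven hyperparameter search. For brevity, the fitness function below is an analytic stand-in; in the paper's setting it would be the validation accuracy of a CNN trained with the candidate hyperparameters, and the bounds and PSO coefficients here are illustrative.

```python
import numpy as np

def fitness(params):
    log_lr, n_filters = params
    # Stand-in for: train a CNN with lr=10**log_lr and int(n_filters) filters,
    # then return its validation accuracy. Peak is at log_lr=-3, n_filters=48.
    return -(log_lr + 3.0) ** 2 - 0.001 * (n_filters - 48) ** 2

rng = np.random.default_rng(0)
lo, hi = np.array([-5.0, 8.0]), np.array([-1.0, 128.0])   # search bounds
n_particles, n_iters, w, c1, c2 = 12, 30, 0.7, 1.5, 1.5

pos = rng.uniform(lo, hi, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(n_iters):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

print(f"best lr ~ 10^{gbest[0]:.2f}, filters ~ {int(round(gbest[1]))}")
```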
Affiliation(s)
- Younes Jabrane: MSC Laboratory, Cadi Ayyad University, Marrakech 40000, Morocco
- Maryam Habba: National School of Applied Sciences of Safi, Cadi Ayyad University, Safi 46000, Morocco
- Amir Hajjam El Hassani: Nanomedicine Imagery & Therapeutics Laboratory, EA4662-Bourgogne-Franche-Comté University, 90010 Belfort, France
18. Walsh R, Osman I, Abdelaziz O, Shehata MS. Fully Self-Supervised Out-of-Domain Few-Shot Learning with Masked Autoencoders. J Imaging 2024; 10:23. [PMID: 38249008 DOI: 10.3390/jimaging10010023]
Abstract
Few-shot learning aims to identify unseen classes with limited labelled data. Recent few-shot learning techniques have shown success in generalizing to unseen classes; however, the performance of these techniques has also been shown to degrade when tested in an out-of-domain setting. Previous work has also demonstrated an increasing reliance on supervised finetuning, whether offline or online. This paper proposes a novel, fully self-supervised few-shot learning technique (FSS) that utilizes a vision transformer and masked autoencoder. The proposed technique can generalize to out-of-domain classes by finetuning the model in a fully self-supervised manner for each episode. We evaluate the proposed technique on three datasets (all out-of-domain). Our results show that FSS achieves accuracy gains of 1.05%, 0.12%, and 1.28% on the ISIC, EuroSat, and BCCD datasets, respectively, without the use of supervised training.
Affiliation(s)
- Reece Walsh: Irving K. Barber Faculty of Science, University of British Columbia, Kelowna, BC V1V 1V7, Canada
- Islam Osman: Irving K. Barber Faculty of Science, University of British Columbia, Kelowna, BC V1V 1V7, Canada
- Omar Abdelaziz: Irving K. Barber Faculty of Science, University of British Columbia, Kelowna, BC V1V 1V7, Canada
- Mohamed S Shehata: Irving K. Barber Faculty of Science, University of British Columbia, Kelowna, BC V1V 1V7, Canada
19. Safran M, Alrajhi W, Alfarhood S. DPXception: a lightweight CNN for image-based date palm species classification. Front Plant Sci 2024; 14:1281724. [PMID: 38264016 PMCID: PMC10803563 DOI: 10.3389/fpls.2023.1281724]
Abstract
Introduction Date palm species classification is important for various agricultural and economic purposes, but it is challenging to perform based on images of date palms alone. Existing methods rely on fruit characteristics, which may not always be visible or present. In this study, we introduce a new dataset and a new model for image-based date palm species classification. Methods Our dataset consists of 2358 images of four common and valuable date palm species (Barhi, Sukkari, Ikhlas, and Saqi), which we collected ourselves. We also applied data augmentation techniques to increase the size and diversity of our dataset. Our model, called DPXception (Date Palm Xception), is a lightweight and efficient CNN architecture that we trained and fine-tuned on our dataset. Unlike the original Xception model, our DPXception model utilizes only the first 100 layers of the Xception model for feature extraction (Adapted Xception), making it more lightweight and efficient. We also applied normalization prior to the adapted Xception and reduced the model dimensionality by adding an extra global average pooling layer after feature extraction. Results and discussion We compared the performance of our model with seven well-known models: Xception, ResNet50, ResNet50V2, InceptionV3, DenseNet201, EfficientNetB4, and EfficientNetV2-S. Our model achieved the highest accuracy (92.9%) and F1-score (93%) among the models, as well as the lowest inference time (0.0513 seconds). We also developed an Android smartphone application that uses our model to classify date palm species from images captured by the smartphone's camera in real time. To the best of our knowledge, this is the first work to provide a public dataset of date palm images and to demonstrate a robust and practical image-based date palm species classification method. This work will open new research directions for more advanced date palm analysis tasks such as gender classification and age estimation.
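The described architecture, an Xception front end cut off early, with input normalization, global average pooling, and a four-way softmax head, can be sketched in Keras. The head size and the idea of a 100-layer cut follow the text; the input size, optimizer, and exact cut layer are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.Xception(include_top=False, weights="imagenet",
                                      input_shape=(299, 299, 3))
# Keep only an early portion of Xception as the feature extractor
# (the text says the first 100 layers; the exact cut point is assumed here).
truncated = models.Model(inputs=base.input, outputs=base.layers[100].output)

inputs = layers.Input(shape=(299, 299, 3))
x = tf.keras.applications.xception.preprocess_input(inputs)   # normalization step
x = truncated(x)
x = layers.GlobalAveragePooling2D()(x)                        # dimensionality reduction
outputs = layers.Dense(4, activation="softmax")(x)            # 4 date palm species

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```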
Affiliation(s)
- Mejdl Safran: Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
20. Yang Y, Wang J. Research on breast cancer pathological image classification method based on wavelet transform and YOLOv8. J Xray Sci Technol 2024:XST230296. [PMID: 38189740 DOI: 10.3233/xst-230296]
Abstract
Breast cancer is one of the cancers with high morbidity and mortality in the world and a serious threat to the health of women. With the development of deep learning, computer-aided diagnosis technology has gained increasing recognition, and traditional data feature extraction has gradually been replaced by feature extraction based on convolutional neural networks, which helps to realize the automatic recognition and classification of pathological images. In this paper, a novel method based on deep learning and the wavelet transform is proposed to classify pathological images of breast cancer. Firstly, image flipping is used to expand the data set; then two-level wavelet decomposition and reconstruction are used to sharpen and enhance the pathological images. Secondly, the processed data set is divided into training and test sets according to 8:2 and 7:3 ratios, and the YOLOv8 network model is selected to perform the eight-class classification task on breast cancer pathological images. Finally, the classification accuracy of the proposed method is compared with the accuracy obtained by YOLOv8 on the original BreaKHis dataset; the algorithm is found to improve the classification accuracy of images at different magnifications, which proves the effectiveness of combining two-level wavelet decomposition and reconstruction with the YOLOv8 network model.
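The wavelet preprocessing step can be sketched with PyWavelets: a two-level 2D decomposition, amplification of the detail coefficients, and reconstruction to sharpen the image. The wavelet family and gain factor are illustrative assumptions, not the paper's settings.

```python
import numpy as np
import pywt

def wavelet_sharpen(img: np.ndarray, wavelet: str = "db2", gain: float = 1.5) -> np.ndarray:
    # Two-level decomposition: [cA2, (cH2, cV2, cD2), (cH1, cV1, cD1)].
    coeffs = pywt.wavedec2(img, wavelet, level=2)
    # Boost the detail (edge) coefficients at both levels, keep the approximation.
    coeffs = [coeffs[0]] + [tuple(gain * d for d in details) for details in coeffs[1:]]
    out = pywt.waverec2(coeffs, wavelet)
    return np.clip(out, 0.0, 1.0)

img = np.random.rand(64, 64)          # stand-in for a normalized pathology patch
print(wavelet_sharpen(img).shape)     # (64, 64)
```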
Affiliation(s)
- Yunfeng Yang: Department of Mathematics and Statistics, Northeast Petroleum University, Daqing, China
- Jiaqi Wang: Department of Mathematics and Statistics, Northeast Petroleum University, Daqing, China
21. Zhang L, Xu R, Zhao J. Learning technology for detection and grading of cancer tissue using tumour ultrasound images. J Xray Sci Technol 2024; 32:157-171. [PMID: 37424493 DOI: 10.3233/xst-230085]
Abstract
BACKGROUND Early diagnosis of breast cancer is crucial to perform effective therapy. Many medical imaging modalities including MRI, CT, and ultrasound are used to diagnose cancer. OBJECTIVE This study aims to investigate the feasibility of applying transfer learning techniques to train convolutional neural networks (CNNs) to automatically diagnose breast cancer via ultrasound images. METHODS Transfer learning techniques were used to train CNNs to recognise breast cancer in ultrasound images. The models were trained and validated on an ultrasound image dataset, and each model's training and validation accuracies were assessed. RESULTS MobileNet had the greatest accuracy during training and DenseNet121 during validation. Transfer learning algorithms can detect breast cancer in ultrasound images. CONCLUSIONS Based on the results, transfer learning models may be useful for automated breast cancer diagnosis in ultrasound images. However, only a trained medical professional should diagnose cancer, and computational approaches should only be used to help make quick decisions.
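A minimal Keras sketch of the transfer-learning setup the abstract describes: an ImageNet-pretrained MobileNet or DenseNet121 backbone, frozen, with a new binary head for ultrasound images. Input size, pooling, and training settings are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_classifier(backbone_name: str = "MobileNet") -> tf.keras.Model:
    backbone_cls = {"MobileNet": tf.keras.applications.MobileNet,
                    "DenseNet121": tf.keras.applications.DenseNet121}[backbone_name]
    backbone = backbone_cls(include_top=False, weights="imagenet",
                            input_shape=(224, 224, 3), pooling="avg")
    backbone.trainable = False                    # freeze the pretrained features
    model = models.Sequential([
        backbone,
        layers.Dense(1, activation="sigmoid"),    # benign vs. malignant
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_classifier("DenseNet121")
# model.fit(train_ds, validation_data=val_ds, epochs=10)   # ultrasound image datasets
```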
Collapse
Affiliation(s)
- Liyan Zhang
- Department of Ultrasound, Sunshine Union Hospital, Weifang, China
| | - Ruiyan Xu
- College of Health, Binzhou Polytechnical College, Binzhou, China
| | - Jingde Zhao
- Department of Imaging, Qingdao Hospital of Traditional Chinese Medicine (Qingdao HaiCi Hospital), Qingdao, China
| |
Collapse
|
22
|
Tian C, Su W, Huang S, Shao B, Li X, Zhang Y, Wang B, Yu X, Li W. Identification of gastric cancer types based on hyperspectral imaging technology. J Biophotonics 2024; 17:e202300276. [PMID: 37669431 DOI: 10.1002/jbio.202300276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/16/2023] [Revised: 08/17/2023] [Accepted: 08/30/2023] [Indexed: 09/07/2023]
Abstract
Gastric cancer is becoming the second leading cause of cancer death. Treatment and prognosis vary greatly among the different types of gastric cancer, yet routine pathological examination is limited to the tissue level and is easily affected by subjective factors. In this study, we examined gastric mucosal samples comprising 50 normal tissues and 90 cancer tissues, using hyperspectral imaging technology to obtain spectral information. Based on an improved deep residual network (IDRN), a two-class model for distinguishing normal from cancer tissue and a four-class model for identifying the cancer type were constructed, achieving accuracies of 0.947 and 0.965, respectively. Hyperspectral imaging thus extracts molecular-level information that enables real-time diagnosis and accurate typing. The results show that the hyperspectral imaging technique performs well in the diagnosis and type differentiation of gastric cancer and is a promising aid to diagnosis and treatment.
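The improved deep residual network itself is not specified in the abstract; below is a generic residual unit over per-pixel spectral vectors, the kind of building block such an IDRN could stack. Channel width, kernel size, and the toy input are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock1D(nn.Module):
    """Basic residual unit applied along the spectral dimension."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.body(x))  # identity shortcut

spectra = torch.randn(8, 16, 128)   # batch, channels, spectral bands
print(ResidualBlock1D(16)(spectra).shape)
```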
Collapse
Affiliation(s)
- Chongxuan Tian
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Wenjing Su
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Sirui Huang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Bowen Shao
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Xueyi Li
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Yuanbo Zhang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Bingjie Wang
- School of Control Science and Engineering, Shandong University, Jinan, China
| | - Xiaojing Yu
- Department of Dermatology, Qilu Hospital of Shandong University, Jinan, China
| | - Wei Li
- School of Control Science and Engineering, Shandong University, Jinan, China
| |
Collapse
|
23
|
Bolocan VO, Secareanu M, Sava E, Medar C, Manolescu LSC, Cătălin Rașcu AȘ, Costache MG, Radavoi GD, Dobran RA, Jinga V. Convolutional Neural Network Model for Segmentation and Classification of Clear Cell Renal Cell Carcinoma Based on Multiphase CT Images. J Imaging 2023; 9:280. [PMID: 38132698 PMCID: PMC10743786 DOI: 10.3390/jimaging9120280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2023] [Revised: 12/08/2023] [Accepted: 12/12/2023] [Indexed: 12/23/2023] Open
Abstract
(1) Background: Computed tomography (CT) imaging challenges in diagnosing renal cell carcinoma (RCC) include distinguishing malignant from benign tissues and determining the likely subtype. The goal is to show the algorithm's ability to improve renal cell carcinoma identification and treatment, improving patient outcomes. (2) Methods: This study uses the European DeepHealth toolkit's convolutional neural network with ECVL (European Computer Vision Library) and EDDL (European Distributed Deep Learning Library). Image segmentation utilized a U-Net architecture, and classification used ResNet101. The model's clinical efficiency was assessed using kidney and tumor Dice scores and the quality of renal cell carcinoma categorization. (3) Results: The raw dataset contains 457 healthy right kidneys, 456 healthy left kidneys, 76 pathological right kidneys, and 84 pathological left kidneys. Preparing the raw data for analysis was crucial to algorithm implementation. The proposed model achieved a kidney segmentation Dice score of 0.84 and a mean tumor segmentation Dice score of 0.675. Renal cell carcinoma classification accuracy was 0.885. (4) Conclusion and key findings: The present study focused on analyzing data from both healthy patients and diseased renal patients, with a particular emphasis on data processing. The method achieved a kidney segmentation Dice score of 0.84 and a mean tumor segmentation Dice score of 0.675, and it classified renal cell carcinoma with an accuracy of 0.885, results which indicate that the technique has the potential to improve the diagnosis of kidney pathology.
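The Dice scores quoted for kidney and tumor segmentation are twice the overlap between predicted and reference masks divided by the sum of their sizes; a minimal sketch for binary masks:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

a = np.zeros((64, 64), dtype=bool); a[10:30, 10:30] = True
b = np.zeros((64, 64), dtype=bool); b[15:35, 15:35] = True
print(dice_score(a, b))
```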
Collapse
Affiliation(s)
- Vlad-Octavian Bolocan
- Department of Fundamental Sciences, Faculty of Midwifery and Nursing, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (V.-O.B.); (C.M.); (M.G.C.)
- Department of Clinical Laboratory of Radiology and Medical Imaging, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania; (M.S.); (E.S.)
| | - Mihaela Secareanu
- Department of Clinical Laboratory of Radiology and Medical Imaging, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania; (M.S.); (E.S.)
| | - Elena Sava
- Department of Clinical Laboratory of Radiology and Medical Imaging, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania; (M.S.); (E.S.)
| | - Cosmin Medar
- Department of Fundamental Sciences, Faculty of Midwifery and Nursing, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (V.-O.B.); (C.M.); (M.G.C.)
- Department of Clinical Laboratory of Radiology and Medical Imaging, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania; (M.S.); (E.S.)
| | - Loredana Sabina Cornelia Manolescu
- Department of Fundamental Sciences, Faculty of Midwifery and Nursing, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (V.-O.B.); (C.M.); (M.G.C.)
| | - Alexandru-Ștefan Cătălin Rașcu
- Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, Faculty of Medicine, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (A.-Ș.C.R.); (G.D.R.); (V.J.)
- Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania
| | - Maria Glencora Costache
- Department of Fundamental Sciences, Faculty of Midwifery and Nursing, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (V.-O.B.); (C.M.); (M.G.C.)
| | - George Daniel Radavoi
- Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, Faculty of Medicine, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (A.-Ș.C.R.); (G.D.R.); (V.J.)
- Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania
| | | | - Viorel Jinga
- Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, Faculty of Medicine, University of Medicine and Pharmacy “Carol Davila”, 050474 Bucharest, Romania; (A.-Ș.C.R.); (G.D.R.); (V.J.)
- Department of Urology, Clinical Hospital “Prof. Dr. Theodor Burghele”, 050664 Bucharest, Romania
- Medical Sciences Section, Academy of Romanian Scientists, 050085 Bucharest, Romania
| |
Collapse
|
24
|
Xiao P, Zhang Z, Luo X, Sun J, Zhou X, Yang X, Huang L. Highway Visibility Estimation in Foggy Weather via Multi-Scale Fusion Network. Sensors (Basel) 2023; 23:9739. [PMID: 38139585 PMCID: PMC10747611 DOI: 10.3390/s23249739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/26/2023] [Revised: 12/03/2023] [Accepted: 12/06/2023] [Indexed: 12/24/2023]
Abstract
Poor visibility has a significant impact on road safety and can even lead to traffic accidents. Traditional means of visibility monitoring no longer meet current needs for temporal and spatial accuracy. In this work, we propose a novel deep network architecture for estimating visibility directly from highway surveillance images. Specifically, we employ several image feature extraction methods to extract detailed structural, spectral, and scene-depth features from the images. Next, we design a multi-scale fusion network that adaptively extracts and fuses the vital features for visibility estimation. Furthermore, we create a real-scene dataset for model learning and performance evaluation. Our experiments demonstrate the superiority of the proposed method over existing methods.
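As a rough sketch of the adaptive fusion idea (not the authors' architecture), the module below resizes feature maps from several scales to a common resolution, concatenates them, and lets a learned 1x1 convolution weight their contributions; all channel counts are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleFusion(nn.Module):
    """Resize per-branch feature maps to a common size, concatenate,
    and let a 1x1 convolution learn adaptive fusion weights."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.fuse = nn.Conv2d(sum(in_channels), out_channels, kernel_size=1)

    def forward(self, features):
        size = features[0].shape[-2:]
        resized = [F.interpolate(f, size=size, mode="bilinear",
                                 align_corners=False) for f in features]
        return self.fuse(torch.cat(resized, dim=1))

f1, f2 = torch.randn(2, 16, 64, 64), torch.randn(2, 32, 32, 32)
print(MultiScaleFusion([16, 32], 64)([f1, f2]).shape)  # (2, 64, 64, 64)
```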
Collapse
Affiliation(s)
- Pengfei Xiao
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| | - Zhendong Zhang
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| | - Xiaochun Luo
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| | - Jiaqing Sun
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| | - Xuecheng Zhou
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| | - Xixi Yang
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| | - Liang Huang
- Key Laboratory of Transportation Meteorology, China Meteorological Administration, Nanjing 210019, China; (P.X.); (Z.Z.); (J.S.); (X.Z.); (X.Y.)
- Jiangsu Provincial Meteorological Service Center, Nanjing 210019, China
| |
Collapse
|
25
|
Maulana A, Noviandy TR, Suhendra R, Earlia N, Bulqiah M, Idroes GM, Niode NJ, Sofyan H, Subianto M, Idroes R. Evaluation of atopic dermatitis severity using artificial intelligence. Narra J 2023; 3:e511. [PMID: 38450339 PMCID: PMC10914065 DOI: 10.52225/narra.v3i3.511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/06/2023] [Accepted: 12/18/2023] [Indexed: 03/08/2024]
Abstract
Atopic dermatitis is a prevalent, persistent chronic inflammatory skin disorder whose severity is challenging to assess accurately. The aim of this study was to evaluate deep learning models for automated atopic dermatitis severity scoring using a dataset of individuals of Aceh ethnicity in Indonesia. Clinical images were collected from 250 patients at Dr. Zainoel Abidin Hospital, Banda Aceh, Indonesia, and labeled by dermatologists as mild, moderate, severe, or none. Five pretrained convolutional neural network (CNN) architectures were evaluated: ResNet50, VGGNet19, MobileNetV3, MnasNet, and EfficientNetB0. Accuracy, precision, sensitivity, specificity, and F1-score were employed to assess the models. Among the models, ResNet50 emerged as the most proficient, demonstrating an accuracy of 89.8%, precision of 90.00%, sensitivity of 89.80%, specificity of 96.60%, and an F1-score of 89.85%. These results highlight the potential of incorporating advanced, data-driven models into the field of dermatology, where they can serve as invaluable tools to assist dermatologists in making early and precise assessments of atopic dermatitis severity and thereby improve patient care and outcomes.
Collapse
Affiliation(s)
- Aga Maulana
- Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh, Indonesia
| | - Teuku R Noviandy
- Department of Informatics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh, Indonesia
| | - Rivansyah Suhendra
- Department of Information Technology, Faculty of Engineering, Universitas Teuku Umar, Meulaboh, Indonesia
| | - Nanda Earlia
- Dermatology Division, Dr. Zainoel Abidin Hospital, Banda Aceh, Indonesia
- Department of Dermatology and Venereology, Faculty of Medicine, Universitas Syiah Kuala, Banda Aceh, Indonesia
| | - Mikyal Bulqiah
- Dermatology Division, Dr. Zainoel Abidin Hospital, Banda Aceh, Indonesia
| | - Ghazi M Idroes
- Department of Occupational Health and Safety, Faculty of Health Sciences, Universitas Abulyatama, Aceh Besar, Indonesia
| | - Nurdjannah J Niode
- Department of Dermatology and Venereology, Faculty of Medicine, Sam Ratulangi University, Manado, Indonesia
| | - Hizir Sofyan
- Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh, Indonesia
| | - Muhammad Subianto
- Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh, Indonesia
| | - Rinaldi Idroes
- Department of Pharmacy, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh, Indonesia
- Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Syiah Kuala, Banda Aceh, Indonesia
| |
Collapse
|
26
|
Iglesias PA, Revilla M, Heppt B, Volodina A, Lechner C. Protocol for a web survey experiment studying the feasibility of asking respondents to capture and submit photos of the books they have at home and the resulting data quality. Open Res Eur 2023; 3:202. [PMID: 38629059 PMCID: PMC11019288 DOI: 10.12688/openreseurope.16507.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Accepted: 11/06/2023] [Indexed: 04/19/2024]
Abstract
This document presents the protocol of a study conducted as part of the WEB DATA OPP project, which is funded by the H2020 program. The study aimed to investigate different aspects of the collection of images through web surveys. To do this, we implemented a mobile web survey in an opt-in online panel in Spain. The survey had various questions, some of which were about the books that the participants have at their main residence. The questions related to books were asked in three different ways: regular survey questions showing visual examples of how different numbers of books fit in a 74-centimetre-wide shelf depending on their thickness, regular survey questions without the visual examples, and questions where participants were asked to send photos of the books at their home. This report explains how the study was designed and conducted, covering important aspects such as the experimental design, the questionnaire used, the characteristics of the participants, ethical considerations, and plans for disseminating the results.
Collapse
Affiliation(s)
- Patricia A. Iglesias
- Research and Expertise Centre for Survey Methodology, Department of Political and Social Sciences, Universitat Pompeu Fabra, Barcelona, Catalonia, 08005, Spain
| | - Melanie Revilla
- Institut Barcelona d'Estudis Internacionals, Barcelona, Catalonia, 08005, Spain
| | - Birgit Heppt
- Humboldt-Universitat zu Berlin, Berlin, Berlin, Germany
| | - Anna Volodina
- Institute for Educational Quality Improvement at the Humboldt-Universitat zu Berlin, Berlin, Berlin, Germany
| | - Clemens Lechner
- GESIS – Leibniz Institute for the Social Sciences, Mannheim, Germany
| |
Collapse
|
27
|
Hernandez-Torres SI, Bedolla C, Berard D, Snider EJ. An extended focused assessment with sonography in trauma ultrasound tissue-mimicking phantom for developing automated diagnostic technologies. Front Bioeng Biotechnol 2023; 11:1244616. [PMID: 38033814 PMCID: PMC10682760 DOI: 10.3389/fbioe.2023.1244616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2023] [Accepted: 10/30/2023] [Indexed: 12/02/2023] Open
Abstract
Introduction: Medical imaging-based triage is critical for ensuring that medical treatment is timely and prioritized. However, without proper image collection and interpretation, triage decisions can be hard to make. While automation approaches can enhance these triage applications, tissue phantoms must be developed to train and mature these novel technologies. Here, we have developed a tissue phantom modeling the ultrasound views imaged during the extended focused assessment with sonography in trauma (eFAST) exam. Methods: The tissue phantom utilized synthetic clear ballistic gel with carveouts in the abdomen and rib cage corresponding to the various eFAST scan points. Various approaches were taken to simulate proper physiology without injuries present or to mimic pneumothorax, hemothorax, or abdominal hemorrhage at multiple locations in the torso. Multiple ultrasound imaging systems were used to acquire scans with or without injury present, and these were used to train deep learning image classification predictive models. Results: The artificial intelligence (AI) models trained in this study achieved over 97% accuracy for each eFAST scan site. A previously trained AI model for pneumothorax achieved 74% accuracy in blind predictions on images collected with the novel eFAST tissue phantom. Grad-CAM heat-map overlays for the predictions showed that the AI models were tracking the area of interest at each scan point in the tissue phantom. Discussion: Overall, the eFAST tissue phantom ultrasound scans resembled human images and were successful in training AI models. Tissue phantoms are a critical first step in troubleshooting and developing medical imaging automation technologies for this application, which can accelerate the widespread use of ultrasound imaging for emergency triage.
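Grad-CAM overlays such as those described can be produced from any convolutional classifier; a compact, hook-based PyTorch sketch, assuming a trained `model` and a chosen convolutional `layer` (both hypothetical here):

```python
import torch
import torch.nn.functional as F

def grad_cam(model, layer, image, class_idx):
    """Class-activation heat map from gradients at a chosen conv layer."""
    acts, grads = {}, {}
    h1 = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
    score = model(image)[0, class_idx]      # image: (1, C, H, W)
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # GAP over gradients
    cam = F.relu((weights * acts["a"]).sum(dim=1))       # weighted activation sum
    return cam / (cam.max() + 1e-8)                      # normalized heat map
```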
Collapse
Affiliation(s)
| | | | | | - Eric J. Snider
- Organ Support and Automation Technologies Group, U.S. Army Institute of Surgical Research, JBSA Fort Sam Houston, San Antonio, TX, United States
| |
Collapse
|
28
|
Yang E, Shankar K, Kumar S, Seo C. Bioinspired Garra Rufa Optimization-Assisted Deep Learning Model for Object Classification on Pedestrian Walkways. Biomimetics (Basel) 2023; 8:541. [PMID: 37999182 PMCID: PMC10669902 DOI: 10.3390/biomimetics8070541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 10/14/2023] [Accepted: 11/02/2023] [Indexed: 11/25/2023] Open
Abstract
Object detection in pedestrian walkways is a crucial area of research that is widely used to improve the safety of pedestrians. Manually examining and labeling abnormal actions is both challenging and tedious, given the broad application of video surveillance systems and the large number of videos captured. Thus, an automatic surveillance system that identifies anomalies has become indispensable for computer vision (CV) researchers. Recent advances in deep learning (DL) algorithms have attracted wide attention for CV tasks such as object detection and object classification based on supervised learning, which requires labels. The current study designs the bioinspired Garra rufa optimization-assisted deep learning model for object classification (BGRODL-OC) technique on pedestrian walkways. The objective of the BGRODL-OC technique is to recognize the presence of pedestrians and objects in surveillance video. To achieve this goal, the technique first applies a GhostNet feature extractor to produce a set of feature vectors, and then uses the GRO algorithm for hyperparameter tuning. Finally, object classification is performed via an attention-based long short-term memory (ALSTM) network. A wide range of experimental analyses was conducted to validate the performance of the BGRODL-OC technique, and the experimental values established its superiority over existing approaches.
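The ALSTM classification stage can be pictured as an LSTM with additive attention pooling over its hidden states; the sketch below is a generic rendering under assumed dimensions, not the paper's exact network.

```python
import torch
import torch.nn as nn

class AttentionLSTM(nn.Module):
    """LSTM over a feature sequence with additive attention pooling."""
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (batch, time, features)
        h, _ = self.lstm(x)
        w = torch.softmax(self.attn(h), dim=1)  # attention weight per step
        context = (w * h).sum(dim=1)            # weighted temporal pooling
        return self.head(context)

feats = torch.randn(4, 10, 960)  # e.g., GhostNet vectors over 10 frames
print(AttentionLSTM(960, 128, 5)(feats).shape)
```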
Collapse
Affiliation(s)
- Eunmok Yang
- Department of Financial Information Security, Kookmin University, Seoul 02707, Republic of Korea;
| | - K. Shankar
- Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai 602105, India;
- Big Data and Machine Learning Lab, South Ural State University, Chelyabinsk 454080, Russia
| | - Sachin Kumar
- College of IBS, National University of Science and Technology, MISiS, Moscow 119049, Russia;
| | - Changho Seo
- Department of Convergence Science, Kongju National University, Gongju-si 32588, Chungcheongnam-do, Republic of Korea
- Basic Science Research Institution, Kongju National University, Gongju-si 32588, Chungcheongnam-do, Republic of Korea
| |
Collapse
|
29
|
Thunold HH, Riegler MA, Yazidi A, Hammer HL. A Deep Diagnostic Framework Using Explainable Artificial Intelligence and Clustering. Diagnostics (Basel) 2023; 13:3413. [PMID: 37998548 PMCID: PMC10670034 DOI: 10.3390/diagnostics13223413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 11/03/2023] [Accepted: 11/06/2023] [Indexed: 11/25/2023] Open
Abstract
An important part of diagnostics is to gain insight into the properties that characterize a disease. Machine learning has been used for this purpose, for instance, to identify biomarkers in genomics. However, when patient data are presented as images, identifying properties that characterize a disease becomes far more challenging. A common strategy involves extracting features from the images and analyzing their occurrence in healthy versus pathological images. A limitation of this approach is that the ability to gain new insights into the disease is constrained by the information in the extracted features, which are typically handcrafted by humans, further limiting the potential for new insights. To overcome these limitations, in this paper, we propose a novel framework that provides insights into diseases without relying on handcrafted features or human intervention. Our framework is based on deep learning (DL), explainable artificial intelligence (XAI), and clustering. DL is employed to learn deep patterns, enabling efficient differentiation between healthy and pathological images; XAI visualizes these patterns; and a novel "explanation-weighted" clustering technique is introduced to gain an overview of these patterns across multiple patients. We applied the method to images from the gastrointestinal tract. In addition to real healthy images and real images of polyps, some of the images had synthetic shapes added to represent pathologies other than polyps. The results show that our proposed method was capable of organizing the images based on the reasons they were diagnosed as pathological, achieving high cluster quality and a Rand index close to or equal to one.
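The "explanation-weighted" clustering technique is novel to the paper, so the sketch below is only a loose illustration of the general mechanism: per-dimension XAI relevance scores re-weight image embeddings before ordinary k-means. All names, shapes, and the random data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def explanation_weighted_clusters(features, saliency, k=3, seed=0):
    """Down-weight feature dimensions the explainer deems irrelevant,
    then cluster patients by the weighted representation."""
    w = saliency / (saliency.sum(axis=1, keepdims=True) + 1e-12)
    return KMeans(n_clusters=k, n_init=10,
                  random_state=seed).fit_predict(features * w)

X = np.random.rand(20, 8)   # image embeddings (placeholder)
S = np.random.rand(20, 8)   # per-dimension XAI relevance scores (placeholder)
print(explanation_weighted_clusters(X, S))
```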
Collapse
Affiliation(s)
- Håvard Horgen Thunold
- Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway; (H.H.T.); (M.A.R.); (A.Y.)
| | - Michael A. Riegler
- Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway; (H.H.T.); (M.A.R.); (A.Y.)
- Department of Holistic Systems, SimulaMet, 0176 Oslo, Norway
| | - Anis Yazidi
- Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway; (H.H.T.); (M.A.R.); (A.Y.)
| | - Hugo L. Hammer
- Department of Computer Science, Faculty of Technology, Art and Design, Oslo Metropolitan University, 0176 Oslo, Norway; (H.H.T.); (M.A.R.); (A.Y.)
- Department of Holistic Systems, SimulaMet, 0176 Oslo, Norway
| |
Collapse
|
30
|
Ang KM, Lim WH, Tiang SS, Sharma A, Eid MM, Tawfeek SM, Khafaga DS, Alharbi AH, Abdelhamid AA. Optimizing Image Classification: Automated Deep Learning Architecture Crafting with Network and Learning Hyperparameter Tuning. Biomimetics (Basel) 2023; 8:525. [PMID: 37999166 PMCID: PMC10669013 DOI: 10.3390/biomimetics8070525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2023] [Revised: 11/01/2023] [Accepted: 11/02/2023] [Indexed: 11/25/2023] Open
Abstract
This study introduces ETLBOCBL-CNN, an automated approach for optimizing convolutional neural network (CNN) architectures to address classification tasks of varying complexities. ETLBOCBL-CNN employs an effective encoding scheme to optimize network and learning hyperparameters, enabling the discovery of innovative CNN structures. To enhance the search process, it incorporates a competency-based learning concept inspired by mixed-ability classrooms during the teacher phase. This categorizes learners into competency-based groups, guiding each learner's search process by utilizing the knowledge of the predominant peers, the teacher solution, and the population mean. This approach fosters diversity within the population and promotes the discovery of innovative network architectures. During the learner phase, ETLBOCBL-CNN integrates a stochastic peer interaction scheme that encourages collaborative learning among learners, enhancing the optimization of CNN architectures. To preserve valuable network information and promote long-term population quality improvement, ETLBOCBL-CNN introduces a tri-criterion selection scheme that considers fitness, diversity, and learners' improvement rates. The performance of ETLBOCBL-CNN is evaluated on nine different image datasets and compared to state-of-the-art methods. Notably, ETLBOCBL-CNN achieves outstanding accuracies on various datasets, including MNIST (99.72%), MNIST-RD (96.67%), MNIST-RB (98.28%), MNIST-BI (97.22%), MNIST-RD + BI (83.45%), Rectangles (99.99%), Rectangles-I (97.41%), Convex (98.35%), and MNIST-Fashion (93.70%). These results highlight the remarkable classification accuracy of ETLBOCBL-CNN, underscoring its potential for advancing smart device infrastructure development.
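ETLBOCBL-CNN builds on teaching-learning-based optimization (TLBO); for orientation, the classic TLBO teacher-phase update, which moves each learner toward the best solution relative to the population mean, is sketched below on a toy objective. This is the standard update, not the paper's competency-based variant.

```python
import numpy as np

def tlbo_teacher_phase(population, fitness, rng):
    """Classic TLBO teacher phase: shift learners toward the teacher
    (best solution), scaled against the population mean."""
    teacher = population[np.argmin(fitness)]
    mean = population.mean(axis=0)
    tf = rng.integers(1, 3)                  # teaching factor in {1, 2}
    r = rng.random(population.shape)
    return population + r * (teacher - tf * mean)

rng = np.random.default_rng(0)
pop = rng.random((10, 5))                    # 10 learners, 5 parameters
fit = (pop ** 2).sum(axis=1)                 # toy sphere objective
new_pop = tlbo_teacher_phase(pop, fit, rng)
```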
Collapse
Affiliation(s)
- Koon Meng Ang
- Faculty of Engineering, Technology and Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia; (K.M.A.); (S.S.T.)
| | - Wei Hong Lim
- Faculty of Engineering, Technology and Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia; (K.M.A.); (S.S.T.)
| | - Sew Sun Tiang
- Faculty of Engineering, Technology and Built Environment, UCSI University, Kuala Lumpur 56000, Malaysia; (K.M.A.); (S.S.T.)
| | - Abhishek Sharma
- Department of Computer Science and Engineering, Graphic Era Deemed to be University, Dehradun 248002, India;
| | - Marwa M. Eid
- Delta Higher Institute for Engineering and Technology, Mansoura 35511, Egypt;
- Faculty of Artificial Intelligence, Delta University for Science and Technology, Mansoura 35111, Egypt
| | - Sayed M. Tawfeek
- Delta Higher Institute for Engineering and Technology, Mansoura 35511, Egypt;
| | - Doaa Sami Khafaga
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; (D.S.K.); (A.H.A.)
| | - Amal H. Alharbi
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia; (D.S.K.); (A.H.A.)
| | - Abdelaziz A. Abdelhamid
- Department of Computer Science, Faculty of Computer and Information Sciences, Ain Shams University, Cairo 11566, Egypt;
- Department of Computer Science, College of Computing and Information Technology, Shaqra University, Shaqra 11961, Saudi Arabia
| |
Collapse
|
31
|
Mohanty S, Shivanna DB, Rao RS, Astekar M, Chandrashekar C, Radhakrishnan R, Sanjeevareddygari S, Kotrashetti V, Kumar P. Building Automation Pipeline for Diagnostic Classification of Sporadic Odontogenic Keratocysts and Non-Keratocysts Using Whole-Slide Images. Diagnostics (Basel) 2023; 13:3384. [PMID: 37958281 PMCID: PMC10648794 DOI: 10.3390/diagnostics13213384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Revised: 10/13/2023] [Accepted: 10/27/2023] [Indexed: 11/15/2023] Open
Abstract
The microscopic diagnostic differentiation of odontogenic cysts from other cysts is intricate and can perplex both clinicians and pathologists. Of particular interest is the odontogenic keratocyst (OKC), a developmental cyst with unique histopathological and clinical characteristics; what distinguishes this cyst is its aggressive nature and high tendency for recurrence. Clinicians encounter challenges in dealing with this frequently encountered jaw lesion, as there is no consensus on surgical treatment. Therefore, the accurate and early diagnosis of such cysts will benefit clinicians in terms of treatment management and spare subjects the mental agony of suffering from aggressive OKCs, which impact their quality of life. The objective of this research is to develop an automated OKC diagnostic system that can function as a decision support tool for pathologists, whether they are working locally or remotely, providing them with additional data and insights to enhance their decision-making. This research aims to provide an automation pipeline to classify whole-slide images of OKCs and non-keratocysts (non-KCs: dentigerous and radicular cysts). OKC diagnosis and prognosis through the histopathological analysis of whole-slide images (WSIs) with a deep-learning approach is an emerging research area, and WSIs have the unique advantage of magnifying tissues at high resolution without losing information. The contribution of this research is a novel, deep-learning-based, and efficient algorithm that reduces the trainable parameters and, in turn, the memory footprint. This is achieved using principal component analysis (PCA) and the ReliefF feature selection algorithm in a convolutional neural network (CNN) named P-C-ReliefF. The proposed model reduces the trainable parameters compared to a standard CNN, achieving 97% classification accuracy.
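The abstract combines PCA with ReliefF-style feature selection to shrink the trainable feature set. The sketch below is a simplified nearest-hit/nearest-miss Relief scorer applied after a PCA projection, meant only to convey the mechanism; the paper's P-C-ReliefF integration inside a CNN is more involved, and all dimensions here are placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA

def relief_scores(X, y):
    """Simplified Relief: reward features that separate a sample from its
    nearest miss (other class) more than from its nearest hit (same class)."""
    scores = np.zeros(X.shape[1])
    for i, x in enumerate(X):
        d = np.abs(X - x).sum(axis=1)
        d[i] = np.inf                      # exclude the sample itself
        hit = X[np.argmin(np.where(y == y[i], d, np.inf))]
        miss = X[np.argmin(np.where(y != y[i], d, np.inf))]
        scores += np.abs(x - miss) - np.abs(x - hit)
    return scores / len(X)

X = np.random.rand(100, 64); y = np.random.randint(0, 2, 100)
X_pca = PCA(n_components=16).fit_transform(X)     # decorrelate and compress
top = np.argsort(relief_scores(X_pca, y))[-8:]    # keep strongest features
X_small = X_pca[:, top]
```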
Collapse
Affiliation(s)
- Samahit Mohanty
- Department of Computer Science and Engineering, M S Ramaiah University of Applied Sciences, Bengaluru 560054, India;
| | - Divya B. Shivanna
- Department of Computer Science and Engineering, M S Ramaiah University of Applied Sciences, Bengaluru 560054, India;
| | - Roopa S. Rao
- Department of Oral Pathology and Microbiology, Faculty of Dental Sciences, M S Ramaiah University of Applied Sciences, Bengaluru 560054, India;
| | - Madhusudan Astekar
- Department of Oral Pathology, Institute of Dental Sciences, Bareilly 243006, India;
| | - Chetana Chandrashekar
- Department of Oral & Maxillofacial Pathology & Microbiology, Manipal College of Dental Sciences, Manipal 576104, India; (C.C.); (R.R.)
| | - Raghu Radhakrishnan
- Department of Oral & Maxillofacial Pathology & Microbiology, Manipal College of Dental Sciences, Manipal 576104, India; (C.C.); (R.R.)
| | | | - Vijayalakshmi Kotrashetti
- Department of Oral & Maxillofacial Pathology & Microbiology, Maratha Mandal’s Nathajirao G Halgekar, Institute of Dental Science & Research Centre, Belgaum 590010, India;
| | - Prashant Kumar
- Department of Oral & Maxillofacial Pathology, Nijalingappa Institute of Dental Science & Research, Gulbarga 585105, India;
| |
Collapse
|
32
|
Zhao S, Tu K, Ye S, Tang H, Hu Y, Xie C. Land Use and Land Cover Classification Meets Deep Learning: A Review. Sensors (Basel) 2023; 23:8966. [PMID: 37960665 PMCID: PMC10649958 DOI: 10.3390/s23218966] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/27/2023] [Revised: 10/24/2023] [Accepted: 11/02/2023] [Indexed: 11/15/2023]
Abstract
As one of the important components of Earth observation technology, land use and land cover (LULC) image classification plays an essential role. It uses remote sensing techniques to classify specific categories of ground cover as a means of analyzing and understanding the natural attributes of the Earth's surface and the state of land use. It provides important information for applications in environmental protection, urban planning, and land resource management. However, remote sensing images are usually high-dimensional data and have limited available labeled samples, so performing the LULC classification task faces great challenges. In recent years, due to the emergence of deep learning technology, remote sensing data processing methods based on deep learning have achieved remarkable results, bringing new possibilities for the research and development of LULC classification. In this paper, we present a systematic review of deep-learning-based LULC classification, mainly covering the following five aspects: (1) introduction of the main components of five typical deep learning networks, how they work, and their unique benefits; (2) summary of two baseline datasets for LULC classification (pixel-level, patch-level) and performance metrics for evaluating different models (OA, AA, F1, and MIOU); (3) review of deep learning strategies in LULC classification studies, including convolutional neural networks (CNNs), autoencoders (AEs), generative adversarial networks (GANs), and recurrent neural networks (RNNs); (4) challenges faced by LULC classification and processing schemes under limited training samples; (5) outlooks on the future development of deep-learning-based LULC classification.
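Of the performance metrics listed, mean intersection-over-union (MIOU) is the least self-explanatory; a minimal per-class computation over predicted and reference label maps:

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """Mean intersection-over-union across LULC classes."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.random.randint(0, 4, (128, 128))
ref = np.random.randint(0, 4, (128, 128))
print(mean_iou(pred, ref, n_classes=4))
```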
Collapse
Affiliation(s)
- Shengyu Zhao
- College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
| | - Kaiwen Tu
- College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
| | - Shutong Ye
- College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
| | - Hao Tang
- College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
| | - Yaocong Hu
- School of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, China
| | - Chao Xie
- College of Mechanical and Electronic Engineering, Nanjing Forestry University, Nanjing 210037, China
- College of Landscape Architecture, Nanjing Forestry University, Nanjing 210037, China
| |
Collapse
|
33
|
Misra S, Yoon C, Kim K, Managuli R, Barr RG, Baek J, Kim C. Deep learning-based multimodal fusion network for segmentation and classification of breast cancers using B-mode and elastography ultrasound images. Bioeng Transl Med 2023; 8:e10480. [PMID: 38023698 PMCID: PMC10658476 DOI: 10.1002/btm2.10480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2022] [Revised: 12/02/2022] [Accepted: 12/13/2022] [Indexed: 12/01/2023] Open
Abstract
Ultrasonography is one of the key medical imaging modalities for evaluating breast lesions. For differentiating benign from malignant lesions, computer-aided diagnosis (CAD) systems have greatly assisted radiologists by automatically segmenting lesions and identifying their features. Here, we present deep learning (DL)-based methods to segment lesions and then classify them as benign or malignant, utilizing both B-mode and strain elastography (SE-mode) images. We propose a weighted multimodal U-Net (W-MM-U-Net) model for segmenting lesions, in which optimum weights are assigned to the different imaging modalities using a weighted-skip connection method to emphasize their importance. We design a multimodal fusion framework (MFF) on cropped B-mode and SE-mode ultrasound (US) lesion images to classify benign and malignant lesions. The MFF consists of an integrated feature network (IFN) and a decision network (DN). Unlike other recent fusion methods, the proposed MFF method can simultaneously learn complementary information from convolutional neural networks (CNNs) trained using B-mode and SE-mode US images. The features from the CNNs are ensembled using the multimodal EmbraceNet model, and the DN classifies the images using those features. The experimental results (sensitivity of 100 ± 0.00% and specificity of 94.28 ± 7.00%) on real-world clinical data showed that the proposed method outperforms existing single- and multimodal methods. The proposed method predicted seven benign patients as benign in three out of five trials and six malignant patients as malignant in five out of five trials. The proposed method would potentially enhance the classification accuracy of radiologists for breast cancer detection in US images.
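The weighted-skip connection can be read as a learnable combination of the two modalities' encoder features at each skip level; the following is a minimal sketch of that reading (one scalar weight per modality is an assumption, as the paper may weight per channel or per level):

```python
import torch
import torch.nn as nn

class WeightedSkip(nn.Module):
    """Fuse B-mode and elastography encoder features on a skip path
    with learnable, softmax-normalized modality weights."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))  # one weight per modality

    def forward(self, feat_bmode, feat_se):
        w = torch.softmax(self.logits, dim=0)
        return w[0] * feat_bmode + w[1] * feat_se

skip = WeightedSkip()
fused = skip(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 56, 56))
print(fused.shape)
```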
Collapse
Affiliation(s)
- Sampa Misra
- Department of Electrical Engineering, Convergence IT Engineering, Mechanical Engineering, Medical Device Innovation Center, and Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang, South Korea
| | - Chiho Yoon
- Department of Electrical Engineering, Convergence IT Engineering, Mechanical Engineering, Medical Device Innovation Center, and Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang, South Korea
| | - Kwang‐Ju Kim
- Daegu‐Gyeongbuk Research Center, Electronics and Telecommunications Research Institute (ETRI), Daegu, South Korea
| | - Ravi Managuli
- Department of Bioengineering, University of Washington, Seattle, Washington, USA
| | - Richard G. Barr
- Department of Radiology, Northeastern Ohio Medical University, Youngstown, Ohio, USA
| | - Jongduk Baek
- School of Integrated Technology, Yonsei University, Seoul, South Korea
| | - Chulhong Kim
- Department of Electrical Engineering, Convergence IT Engineering, Mechanical Engineering, Medical Device Innovation Center, and Graduate School of Artificial Intelligence, Pohang University of Science and Technology, Pohang, South Korea
| |
Collapse
|
34
|
He C, Fan X, Zhou K, Ye Z. Unsupervised Domain Adaptation with Asymmetrical Margin Disparity loss and Outlier Sample Extraction. Neural Netw 2023; 168:602-614. [PMID: 37839331 DOI: 10.1016/j.neunet.2023.09.045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2023] [Revised: 09/12/2023] [Accepted: 09/25/2023] [Indexed: 10/17/2023]
Abstract
Unsupervised domain adaptation (UDA) trains models using labeled data from a specific source domain and then transfers the knowledge to target domains that have few or no labels. Many prior measurement-based works have made progress, but their feature-distinguishing ability is insufficient to classify target samples with similar features; they do not adequately consider the confusing samples in the target domain that are similar to the source domain; and they do not consider the negative transfer caused by outlier samples in the source domain. We address these issues in our work and propose a UDA method with an asymmetrical margin disparity loss and outlier sample extraction, called AMD-Net with OSE. We propose an Asymmetrical Margin Disparity Discrepancy (AMD) method and a training strategy based on a sample selection mechanism, which together give the network better feature extraction ability and help it escape local optima. First, in the AMD method, we design a multi-label entropy metric to evaluate the margin disparity loss of the confusing samples in the target domain. This asymmetric margin disparity loss uses different entropy measurements for the two domains to expose their differences as fully as possible and thereby find the features the domains share. Second, a sample selection mechanism is designed to assess which target-domain samples are confusable. We define a certainty measure for target-domain samples and adopt a progressive learning scheme: a one-hot margin disparity loss is applied to the majority of target samples, which have low uncertainty and are easy to distinguish, while the multi-label margin calculation is applied only to the uncertain target samples whose certainty falls below a threshold, so that the network can escape local optima as much as possible. Finally, we propose an outlier sample extraction (OSE) algorithm based on a weighted cosine similarity distance for the source domain, to reduce the negative transfer caused by outlier source samples. Extensive experiments on four datasets (Office-31, Office-Home, VisDA-2017, and DomainNet) demonstrate that our method works well in various UDA settings and outperforms state-of-the-art methods.
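A plain rendering of the OSE idea, under stated assumptions: score each source sample by cosine similarity to its class centroid and drop the least similar fraction as outliers. The exact weighting in the paper's "weighted cosine similarity distance" is not reproduced here, and the keep ratio is a placeholder.

```python
import numpy as np

def extract_outliers(features, labels, keep_ratio=0.9):
    """Rank each source sample by cosine similarity to its class centroid;
    the least similar samples are treated as outliers and dropped."""
    norm = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = np.empty(len(features))
    for c in np.unique(labels):
        idx = labels == c
        centroid = norm[idx].mean(axis=0)
        centroid /= np.linalg.norm(centroid)
        sims[idx] = norm[idx] @ centroid
    return sims >= np.quantile(sims, 1 - keep_ratio)  # boolean keep-mask

X = np.random.rand(200, 32); y = np.random.randint(0, 5, 200)
keep = extract_outliers(X, y)
X_clean, y_clean = X[keep], y[keep]
```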
Collapse
Affiliation(s)
- Chunmei He
- School of Computer Science, School of Cyberspace Science, Xiangtan University, Xiangtan, Hunan 411105, China.
| | - Xianjun Fan
- School of Computer Science, School of Cyberspace Science, Xiangtan University, Xiangtan, Hunan 411105, China.
| | - Kang Zhou
- School of Computer Science, School of Cyberspace Science, Xiangtan University, Xiangtan, Hunan 411105, China.
| | - Zhengchun Ye
- School of Mechanical Engineering, Xiangtan University, Xiangtan, Hunan 411105, China.
| |
Collapse
|
35
|
Wen Z, Curran JM, Harbison S, Wevers GE. Classification of firing pin impressions using HOG-SVM. J Forensic Sci 2023; 68:1946-1957. [PMID: 37691406 DOI: 10.1111/1556-4029.15377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2023] [Revised: 08/14/2023] [Accepted: 08/24/2023] [Indexed: 09/12/2023]
Abstract
Crimes such as robbery and murder often involve firearms. To assist with the investigation of a crime, firearm examiners are asked to determine whether cartridge cases found at a crime scene were fired from a suspect's firearm. This examination is based on a comparison of the marks left on the surfaces of cartridge cases, of which firing pin impressions are among the most commonly used. In this study, a total of nine Ruger model 10/22 semiautomatic rifles were used, and fifty cartridges were fired from each rifle. The cartridge cases were collected, and each firing pin impression was then cast and photographed using a comparison microscope. In this paper, we describe how a computer vision algorithm, the Histogram of Oriented Gradients (HOG), and a machine learning method, Support Vector Machines (SVMs), can be used to classify images of firing pin impressions. Our method achieved a reasonably high accuracy of 93%, which can be used to associate a firearm with a cartridge case recovered from a scene. We also compared our method with other feature extraction algorithms; the comparison showed that the HOG-SVM method had the highest performance on this classification task.
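The HOG-SVM pipeline is standard enough to sketch end to end with scikit-image and scikit-learn; the HOG parameters and the random stand-in images below are placeholders, not the study's settings (the study used nine rifles with fifty casts each).

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def hog_features(images):
    """9-bin HOG over 8x8 cells with 2x2 block normalization."""
    return np.array([hog(im, orientations=9, pixels_per_cell=(8, 8),
                         cells_per_block=(2, 2)) for im in images])

images = np.random.rand(90, 128, 128)   # stand-in firing-pin images
labels = np.repeat(np.arange(9), 10)    # nine rifles, ten images each
X_tr, X_te, y_tr, y_te = train_test_split(
    hog_features(images), labels, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=10).fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```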
Collapse
Affiliation(s)
- Zhijian Wen
- Institute of Environmental Science and Research Limited, Auckland, New Zealand
| | - James M Curran
- Institute of Environmental Science and Research Limited, Auckland, New Zealand
| | - SallyAnn Harbison
- Institute of Environmental Science and Research Limited, Auckland, New Zealand
- Department of Statistics, University of Auckland, Auckland, New Zealand
| | - Gerhard E Wevers
- Department of Statistics, University of Auckland, Auckland, New Zealand
| |
Collapse
|
36
|
Hou X, Zhang F, Gulati D, Tan T, Zhang W. E2VIDX: improved bridge between conventional vision and bionic vision. Front Neurorobot 2023; 17:1277160. [PMID: 37954492 PMCID: PMC10639115 DOI: 10.3389/fnbot.2023.1277160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Accepted: 10/05/2023] [Indexed: 11/14/2023] Open
Abstract
Conventional RGBD, CMOS, and CCD-based cameras produce motion blur and incorrect exposure under high-speed motion and improper lighting conditions. Event cameras, developed according to bionic principles, offer low latency, high dynamic range, and freedom from motion blur. However, due to their unique data representation, they encounter significant obstacles in practical applications. Image reconstruction algorithms for event cameras solve this problem by converting a series of "events" into conventional frames so that existing vision algorithms can be applied; owing to the rapid development of neural networks, this field has made significant breakthroughs in the past few years. Building on the most popular Events-to-Video (E2VID) method, this study designs a new network called E2VIDX. The proposed network includes group convolution and sub-pixel convolution, which not only achieve better feature fusion but also reduce the network model size by 25%. Furthermore, we propose a new loss function divided into two parts: the first part compares high-level features and the second part low-level features of the reconstructed image. The experimental results clearly outperform the state-of-the-art method: compared with the original method, Structural Similarity (SSIM) increases by 1.3%, Learned Perceptual Image Patch Similarity (LPIPS) decreases by 1.7%, Mean Squared Error (MSE) decreases by 2.5%, and the network runs faster on both GPU and CPU. Additionally, we evaluate E2VIDX with application to image classification, object detection, and instance segmentation. The experiments show that conversions using our method allow event cameras to directly apply existing vision algorithms in most scenarios.
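The two ingredients named for E2VIDX, group convolution and sub-pixel convolution, compose naturally in PyTorch; below is a minimal upsampling block under assumed channel counts (`groups=4` and the 2x scale are illustrative, not the paper's configuration).

```python
import torch
import torch.nn as nn

class LightUpsampler(nn.Module):
    """Group convolution for cheap feature mixing, then sub-pixel
    convolution (PixelShuffle) for 2x upsampling without deconvolution."""
    def __init__(self, channels, scale=2):
        super().__init__()
        self.mix = nn.Conv2d(channels, channels * scale ** 2,
                             kernel_size=3, padding=1, groups=4)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):
        return self.shuffle(self.mix(x))

x = torch.randn(1, 32, 45, 60)
print(LightUpsampler(32)(x).shape)   # -> (1, 32, 90, 120)
```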
Collapse
Affiliation(s)
- Xujia Hou
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
| | - Feihu Zhang
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
| | | | - Tingfeng Tan
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
| | - Wei Zhang
- School of Marine Science and Technology, Northwestern Polytechnical University, Xi'An, China
| |
Collapse
|
37
|
Brancaccio R, Albertin F, Seracini M, Bettuzzi M, Morigi MP. A Geometric Feature-Based Algorithm for the Virtual Reading of Closed Historical Manuscripts. J Imaging 2023; 9:230. [PMID: 37888337 PMCID: PMC10607176 DOI: 10.3390/jimaging9100230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023] Open
Abstract
X-ray Computed Tomography (CT), a commonly used technique in a wide variety of research fields, nowadays represents a unique and powerful procedure to discover, reveal and preserve a fundamental part of our patrimony: ancient handwritten documents. For modern and well-preserved ones, traditional document scanning systems are suitable for their correct digitization, and, consequently, for their preservation; however, the digitization of ancient, fragile and damaged manuscripts is still a formidable challenge for conservators. The X-ray tomographic approach has already proven its effectiveness in data acquisition, but the algorithmic steps from tomographic images to real page-by-page extraction and reading are still a difficult undertaking. In this work, we propose a new procedure for the segmentation of single pages from the 3D tomographic data of closed historical manuscripts, based on geometric features and flood fill methods. The achieved results prove the capability of the methodology in segmenting the different pages recorded starting from the whole CT acquired volume.
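The flood-fill component of the page-extraction step can be pictured with scikit-image: starting from a seed voxel on a page, the fill grows through connected voxels of similar intensity, which in a CT volume tends to stay on the same sheet. The seed point, tolerance, and toy volume below are assumptions, not values from the paper.

```python
import numpy as np
from skimage.segmentation import flood

def grow_page(volume, seed, tolerance=0.1):
    """Flood-fill from a seed voxel: the boolean mask covers connected
    voxels whose intensity is within `tolerance` of the seed's."""
    return flood(volume, seed_point=seed, tolerance=tolerance)

vol = np.random.rand(64, 256, 256).astype(np.float32)  # toy CT volume
mask = grow_page(vol, seed=(32, 128, 128))
print(mask.shape, mask.sum())
```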
Collapse
Affiliation(s)
- Rosa Brancaccio
- Department of Physics and Astronomy “Augusto Righi”, University of Bologna, 6/2, Viale Carlo Berti Pichat, 40127 Bologna, Italy; (R.B.); (M.B.); (M.P.M.)
| | - Fauzia Albertin
- National Institute of Nuclear Physics & Istituto Nazionale di Fisica Nucleare, CHNet, Division of Bologna, Via Berti Pichat 6/2, 40127 Bologna, Italy
| | - Marco Seracini
- Department of Physics and Astronomy “Augusto Righi”, University of Bologna, 6/2, Viale Carlo Berti Pichat, 40127 Bologna, Italy; (R.B.); (M.B.); (M.P.M.)
| | - Matteo Bettuzzi
- Department of Physics and Astronomy “Augusto Righi”, University of Bologna, 6/2, Viale Carlo Berti Pichat, 40127 Bologna, Italy; (R.B.); (M.B.); (M.P.M.)
| | - Maria Pia Morigi
- Department of Physics and Astronomy “Augusto Righi”, University of Bologna, 6/2, Viale Carlo Berti Pichat, 40127 Bologna, Italy; (R.B.); (M.B.); (M.P.M.)
| |
Collapse
|
38
|
Guillen Bonilla JT, Franco Rodríguez NE, Guillen Bonilla H, Guillen Bonilla A, Rodríguez Betancourtt VM, Jiménez Rodríguez M, Sánchez Morales ME, Blanco Alonso O. A New Texture Spectrum Based on Parallel Encoded Texture Unit and Its Application on Image Classification: A Potential Prospect for Vision Sensing. Sensors (Basel) 2023; 23:8368. [PMID: 37896461 PMCID: PMC10610789 DOI: 10.3390/s23208368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 10/04/2023] [Accepted: 10/08/2023] [Indexed: 10/29/2023]
Abstract
In industrial applications based on texture classification, efficient and fast classifiers are extremely useful for the quality control of industrial processes; a texture-image classifier must therefore satisfy two requirements: efficiency and speed. In this work, a texture unit is encoded in parallel and, using observation windows larger than 3×3, a new texture spectrum called the Texture Spectrum based on the Parallel Encoded Texture Unit (TS_PETU) is proposed, computed, and used as a feature vector in a multi-class classifier, which is then applied to two image databases. The first database contains images from the company Interceramic® acquired under controlled conditions, and the second contains tree stems imaged in natural environments. Based on our experimental results, the TS_PETU, developed for binary images, satisfied both requirements: it achieved high classification efficiency, and its compute time can be reduced by applying parallel coding concepts. Classification efficiency increased with larger observation windows, so the window size can be selected on that basis. Given the high efficiency of the TS_PETU for Interceramic® tile classification, we consider that the proposed technique has significant industrial applications.
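For orientation, the classic texture spectrum that TS_PETU generalizes codes each 3×3 neighbourhood as a base-3 texture unit and histograms the units over the image; the sketch below is that gray-level version, and it does not reproduce the paper's parallel encoding, binary-image specialization, or larger windows.

```python
import numpy as np

def texture_spectrum(img):
    """Classic texture spectrum over 3x3 units: each of the 8 neighbours
    is coded 0/1/2 (darker/equal/brighter than the centre pixel),
    giving 3**8 = 6561 possible texture units."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    c = img[1:-1, 1:-1]
    unit = np.zeros_like(c, dtype=np.int32)
    for k, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        unit += np.where(n > c, 2, np.where(n == c, 1, 0)) * 3 ** k
    return np.bincount(unit.ravel(), minlength=3 ** 8)  # the spectrum

img = np.random.randint(0, 256, (64, 64))
spectrum = texture_spectrum(img)   # feature vector for a classifier
```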
Collapse
Affiliation(s)
- José Trinidad Guillen Bonilla
- Departamento de Electro-Fotónica, Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Blvd-M. García Barragán 1421, Guadalajara 44430, Jalisco, Mexico
| | - Nancy Elizabeth Franco Rodríguez
- Departamento de Farmacología, Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Blvd-M. García Barragán 1421, Guadalajara 44430, Jalisco, Mexico;
| | - Héctor Guillen Bonilla
- Departamento de Ingeniería de Proyectos, Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Blvd-M. García Barragán 1421, Guadalajara 44430, Jalisco, Mexico; (H.G.B.); (V.M.R.B.)
| | - Alex Guillen Bonilla
- Departamento de Ciencias Computacionales e Ingenierías, CUVALLES, Universidad de Guadalajara, Carretera Guadalajara-Ameca Km. 45.5, Ameca 46600, Jalisco, Mexico;
| | - Verónica María Rodríguez Betancourtt
- Departamento de Ingeniería de Proyectos, Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Blvd-M. García Barragán 1421, Guadalajara 44430, Jalisco, Mexico; (H.G.B.); (V.M.R.B.)
| | - Maricela Jiménez Rodríguez
- Departamento de Ciencias Básicas, Centro Universitario de la Ciénega (CUCienéga), Universidad de Guadalajara, Av. Universidad No. 1115, LindaVista, Ocotlán 47810, Jalisco, Mexico;
| | - María Eugenia Sánchez Morales
- Departamento de Ciencias Tecnológicas, Centro Universitario de la Ciénega (CUCienéga), Universidad de Guadalajara, Av. Universidad No. 1115, LindaVista, Ocotlán 47810, Jalisco, Mexico;
| | - Oscar Blanco Alonso
- Departamento de Física, Centro Universitario de Ciencias Exactas e Ingenierías, Universidad de Guadalajara, Blvd-M. García Barragán 1421, Guadalajara 44430, Jalisco, Mexico;
| |
Collapse
|
39
|
Abraham A, Jose R, Ahmad J, Joshi J, Jacob T, Khalid AUR, Ali H, Patel P, Singh J, Toma M. Comparative Analysis of Machine Learning Models for Image Detection of Colonic Polyps vs. Resected Polyps. J Imaging 2023; 9:215. [PMID: 37888322 PMCID: PMC10607441 DOI: 10.3390/jimaging9100215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Revised: 09/29/2023] [Accepted: 10/07/2023] [Indexed: 10/28/2023] Open
Abstract
(1) Background: Colon polyps are common protrusions in the colon's lumen that carry a risk of developing into colorectal cancer. Early detection and intervention are vital for reducing colorectal cancer incidence and mortality rates. This research aims to evaluate and compare the performance of three machine learning image classification models in detecting and classifying colon polyps. (2) Methods: The performance of three machine learning image classification models, Google Teachable Machine (GTM), Roboflow3 (RF3), and You Only Look Once version 8 (YOLOv8n), was evaluated using the testing split for each model. External validity was analyzed using 90 images that were not used to train, validate, or test the models. The study used a dataset of colonoscopy images of normal colon, polyps, and resected polyps and assessed the models' ability to correctly classify the images into their respective classes using precision, recall, and F1 scores generated from confusion matrix analysis and performance graphs. (3) Results: All three models successfully distinguished between normal colon, polyps, and resected polyps in colonoscopy images. GTM achieved the highest accuracy (0.99), with consistent precision, recall, and F1 scores of 1.00 for the 'normal' class, 0.97-1.00 for 'polyps', and 0.97-1.00 for 'resected polyps'. While GTM exclusively classified images into these three categories, both YOLOv8n and RF3 were also able to detect and localize normal colonic tissue, polyps, and resected polyps, achieving overall accuracies of 0.84 and 0.87, respectively. (4) Conclusions: Machine learning, particularly models like GTM, shows promising results in ensuring comprehensive detection of polyps during colonoscopies.
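The precision, recall, and F1 figures quoted are the standard confusion-matrix metrics; with scikit-learn they can be generated directly. The toy labels below are placeholders, not the study's data.

```python
from sklearn.metrics import confusion_matrix, classification_report

y_true = ["normal", "polyp", "resected", "polyp", "normal", "resected"]
y_pred = ["normal", "polyp", "polyp",    "polyp", "normal", "resected"]
labels = ["normal", "polyp", "resected"]

# Raw confusion matrix, rows = true class, columns = predicted class.
print(confusion_matrix(y_true, y_pred, labels=labels))
# Per-class precision, recall, and F1, as reported for GTM/RF3/YOLOv8n.
print(classification_report(y_true, y_pred, labels=labels))
```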
Collapse
Affiliation(s)
- Adriel Abraham
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| | - Rejath Jose
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| | - Jawad Ahmad
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| | - Jai Joshi
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| | - Thomas Jacob
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| | - Aziz-ur-rahman Khalid
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| | - Hassam Ali
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Internal Medicine, Brody School of Medicine, East Carolina University, Greenville, NC 27858, USA;
| | - Pratik Patel
- Department of Gastroenterology, Northwell Mather Hospital, Port Jefferson, NY 11777, USA (J.S.)
| | - Jaspreet Singh
- Department of Gastroenterology, Northwell Mather Hospital, Port Jefferson, NY 11777, USA (J.S.)
| | - Milan Toma
- New York Institute of Technology, College of Osteopathic Medicine, Old Westbury, NY 11568, USA; (A.A.); (R.J.); (J.A.); (J.J.); (T.J.); (A.-u.-r.K.)
| |
Collapse
|
40
|
Baek SC, Lee KH, Kim IH, Seo DM, Park K. Construction of Asbestos Slate Deep-Learning Training-Data Model Based on Drone Images. Sensors (Basel) 2023; 23:8021. [PMID: 37836851 PMCID: PMC10575463 DOI: 10.3390/s23198021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/11/2023] [Revised: 09/11/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023]
Abstract
The detection of asbestos roof slate by drone is necessary to avoid the safety risks and costs associated with visual inspection. Moreover, the use of deep-learning models increases the speed and reduces the cost of analyzing the images provided by the drone. In this study, we developed a comprehensive learning model using supervised and unsupervised classification techniques for the accurate classification of roof slate. We ensured the accuracy of our model by flying at a low altitude of 100 m, which yielded a ground sampling distance of 3 cm/pixel. Furthermore, we ensured that the model was comprehensive by including images captured under a variety of light and meteorological conditions and from a variety of angles. After applying the two classification methods to develop the training dataset and employing the resulting model for classification, 12 out of 475 images were misclassified. Visual inspection and an adjustment of the classification scheme were performed, and the model was updated to correctly classify all 475 images. These results show that supervised and unsupervised classification can be used together to improve the accuracy of a deep-learning model for the detection of asbestos roof slate.
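The 3 cm/pixel figure follows from the standard ground-sampling-distance relation GSD = (flight altitude x sensor width) / (focal length x image width in pixels). A small sketch follows, using hypothetical camera parameters, since the abstract does not specify the drone's sensor or lens:

```python
def ground_sampling_distance(altitude_m, sensor_width_mm, focal_length_mm, image_width_px):
    """GSD in cm/pixel: how much ground one image pixel covers."""
    return (altitude_m * 100 * sensor_width_mm) / (focal_length_mm * image_width_px)

# Hypothetical 1-inch-sensor camera parameters chosen so that a 100 m flight
# altitude yields roughly the 3 cm/pixel reported in the paper.
print(ground_sampling_distance(altitude_m=100, sensor_width_mm=13.2,
                               focal_length_mm=8.8, image_width_px=5472))
# ~2.74 cm/pixel
```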
Collapse
Affiliation(s)
- Seung-Chan Baek
- Department of Architecture, Kyungil University, Gyeongsan 38428, Republic of Korea; (S.-C.B.); (K.-H.L.)
| | - Kwang-Hyun Lee
- Department of Architecture, Kyungil University, Gyeongsan 38428, Republic of Korea; (S.-C.B.); (K.-H.L.)
| | - In-Ho Kim
- Department of Civil Engineering, Kunsan National University, Kunsan 54150, Republic of Korea;
| | - Dong-Min Seo
- School of Architecture, Civil, Environmental and Energy Engineering, Kyungpook National University, Daegu 41566, Republic of Korea;
| | - Kiyong Park
- Department of Big Data, Chungbuk National University, Cheongju 28644, Republic of Korea
| |
Collapse
|
41
|
Wang H, Wang K, Yan T, Zhou H, Cao E, Lu Y, Wang Y, Luo J, Pang Y. Endoscopic image classification algorithm based on Poolformer. Front Neurosci 2023; 17:1273686. [PMID: 37811325 PMCID: PMC10551176 DOI: 10.3389/fnins.2023.1273686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Accepted: 09/04/2023] [Indexed: 10/10/2023] Open
Abstract
Image desmoking is a significant aspect of endoscopic image processing, effectively mitigating visual field obstructions without the need for additional surgical interventions. However, current smoke removal techniques tend to apply comprehensive video enhancement to all frames, encompassing both smoke-free and smoke-affected images, which not only escalates computational costs but also introduces potential noise when enhancing smoke-free images. In response to this challenge, this paper introduces an approach for classifying images that contain surgical smoke within endoscopic scenes. This classification method provides crucial target-frame information for enhancing surgical smoke removal, improving the robustness and real-time processing capabilities of image-based smoke removal methods. The proposed endoscopic smoke image classification algorithm, based on an improved Poolformer model, augments the model's capacity for endoscopic image feature extraction. This enhancement is achieved by transforming the Token Mixer within the encoder into a multi-branch structure akin to ConvNeXt, a pure convolutional neural network. Moreover, conversion to a single-path topology during the prediction phase increases processing speed. Experiments used an endoscopic dataset sourced from the Hamlyn Centre Laparoscopic/Endoscopic Video Dataset, augmented by Blender software rendering. The dataset comprises 3,800 training images and 1,200 test images, distributed in a 4:1 ratio of smoke-free to smoke-containing images. The outcomes affirm the superior performance of this paper's approach across multiple metrics. Comparative assessments against existing models, such as mobilenet_v3, efficientnet_b7, and ViT-B/16, substantiate that the proposed method excels in accuracy, sensitivity, and inference speed. Notably, when contrasted with the Poolformer_s12 network, the proposed method achieves a 2.3% improvement in accuracy and an 8.2% boost in sensitivity, while incurring a mere 6.4 frames-per-second reduction in processing speed, maintaining 87 frames per second. The results authenticate the improved performance of the refined Poolformer model in endoscopic smoke image classification tasks. This advancement presents a lightweight yet effective solution for the automatic detection of smoke-containing images in endoscopy, striking a balance between the accuracy and real-time processing requirements of endoscopic image analysis and offering valuable insights for the targeted desmoking process.
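For context, the baseline Poolformer block that the paper builds on replaces self-attention with a simple pooling token mixer. A minimal PyTorch sketch of that baseline mixer follows; the paper's multi-branch, ConvNeXt-like replacement and its single-path re-parameterization for inference are not reproduced here.

```python
import torch
import torch.nn as nn

class PoolTokenMixer(nn.Module):
    """Baseline PoolFormer token mixer: average pooling minus identity."""
    def __init__(self, pool_size: int = 3):
        super().__init__()
        self.pool = nn.AvgPool2d(pool_size, stride=1,
                                 padding=pool_size // 2,
                                 count_include_pad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Subtracting the input keeps only the neighborhood context,
        # matching the residual formulation used by PoolFormer.
        return self.pool(x) - x

x = torch.randn(1, 64, 56, 56)      # (batch, channels, height, width)
print(PoolTokenMixer()(x).shape)    # torch.Size([1, 64, 56, 56])
```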
Collapse
Affiliation(s)
- Huiqian Wang
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- Chongqing Xishan Science & Technology Co., Ltd., Chongqing, China
| | - Kun Wang
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Tian Yan
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Hekai Zhou
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Enling Cao
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Yi Lu
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Yuanfa Wang
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- Chongqing Xishan Science & Technology Co., Ltd., Chongqing, China
| | - Jiasai Luo
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Yu Pang
- Postdoctoral Research Station, Chongqing Key Laboratory of Photoelectronic Information Sensing and Transmitting Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
| |
Collapse
|
42
|
Mustafa Z, Nsour H. Using Computer Vision Techniques to Automatically Detect Abnormalities in Chest X-rays. Diagnostics (Basel) 2023; 13:2979. [PMID: 37761345 PMCID: PMC10530162 DOI: 10.3390/diagnostics13182979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2023] [Revised: 07/23/2023] [Accepted: 08/07/2023] [Indexed: 09/29/2023] Open
Abstract
Our research focused on creating an advanced machine-learning algorithm that accurately detects anomalies in chest X-ray images, providing healthcare professionals with a reliable tool for diagnosing various lung conditions. To achieve this, we analysed a vast collection of X-ray images and utilised sophisticated visual analysis techniques, such as deep learning (DL) algorithms, object recognition, and categorisation models. To create our model, we used a large training dataset of chest X-rays, which provided valuable information for visualising and categorising abnormalities. We also utilised various data augmentation methods, such as scaling, rotation, and imitation, to increase the diversity of images used for training. We adopted the widely used You Only Look Once (YOLO) v8 algorithm, an object recognition paradigm that has demonstrated positive outcomes in computer vision applications, and modified it to classify X-ray images into distinct categories, such as respiratory infections, tuberculosis (TB), and lung nodules. It was particularly effective at identifying unique and crucial findings that may otherwise be difficult to detect using traditional diagnostic methods. Our findings demonstrate that healthcare practitioners can reliably use machine learning (ML) algorithms to diagnose various lung disorders with greater accuracy and efficiency.
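As a hedged sketch of this kind of workflow, the snippet below fine-tunes the YOLOv8 nano classification variant with the ultralytics API, assuming a recent ultralytics release. The dataset folder name and class layout are hypothetical, and the paper's exact training configuration is not given in the abstract.

```python
from ultralytics import YOLO

# A minimal sketch, assuming a dataset folder laid out as train/val
# subfolders per class (e.g. "tb", "nodule", "infection", "normal").
# "chest_xray_dataset" and "sample_xray.png" are hypothetical names.
model = YOLO("yolov8n-cls.pt")                      # pretrained classifier weights
model.train(data="chest_xray_dataset", epochs=20, imgsz=224)

results = model("sample_xray.png")                  # predict on one image
print(results[0].probs.top1, results[0].probs.top1conf)
```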
Collapse
Affiliation(s)
- Zaid Mustafa
- Department of Computer Information Systems, Prince Abdullah Bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Al-Salt 19117, Jordan
| | - Heba Nsour
- Department of Computer Science, Prince Abdullah Bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, Al-Salt 19117, Jordan
| |
Collapse
|
43
|
Cui Z, Li K, Kang C, Wu Y, Li T, Li M. Plant and Disease Recognition Based on PMF Pipeline Domain Adaptation Method: Using Bark Images as Meta-Dataset. Plants (Basel) 2023; 12:3280. [PMID: 37765444 PMCID: PMC10534746 DOI: 10.3390/plants12183280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/15/2023] [Revised: 09/11/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Efficient image recognition is important in crop and forest management. However, it faces many challenges, such as the large number of plant species and diseases, the variability of plant appearance, and the scarcity of labeled training data. To address these challenges, we modified a state-of-the-art (SOTA) Cross-Domain Few-shot Learning (CDFSL) method based on prototypical networks and attention mechanisms. We employed attention mechanisms to perform feature extraction and prototype generation by focusing on the most relevant parts of the images, then used prototypical networks to learn the prototype of each category and classify new instances. Finally, we demonstrated the effectiveness of the modified CDFSL method on several plant and disease recognition datasets. The results showed that the modified pipeline was able to recognize several cross-domain datasets using generic representations, achieving up to 96.95% and 94.07% classification accuracy on datasets from the same and different domains, respectively. In addition, we visualized the experimental results, demonstrating the model's stable transfer capability between datasets and its high visual correlation with plant and disease biological characteristics. Moreover, by extending the classes of different semantics within the training dataset, our model can be generalized to other domains, which implies broad applicability.
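The core of the prototypical-network component is simple: average the support embeddings of each class into a prototype, then classify queries by distance to the nearest prototype. A self-contained PyTorch sketch of that step follows; the backbone and the attention modules the paper adds are omitted.

```python
import torch
import torch.nn.functional as F

def prototype_classify(support_feats, support_labels, query_feats, n_classes):
    """Few-shot classification by nearest class prototype.

    support_feats: (n_support, d) embeddings from some feature extractor
    query_feats:   (n_query, d)
    Returns logits as negative squared Euclidean distances to prototypes.
    """
    prototypes = torch.stack([
        support_feats[support_labels == c].mean(dim=0)   # class mean embedding
        for c in range(n_classes)
    ])
    dists = torch.cdist(query_feats, prototypes) ** 2    # (n_query, n_classes)
    return -dists

# Toy episode: 2 classes, 5 support shots each, 3 queries, 16-d embeddings.
sup = torch.randn(10, 16)
lbl = torch.tensor([0] * 5 + [1] * 5)
qry = torch.randn(3, 16)
print(F.softmax(prototype_classify(sup, lbl, qry, n_classes=2), dim=1))
```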
Collapse
Affiliation(s)
| | | | | | | | | | - Mingyang Li
- Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China; (Z.C.); (K.L.); (C.K.); (Y.W.); (T.L.)
| |
Collapse
|
44
|
Figueroa-Flores C, San-Martin P. Deep learning for Chilean native flora classification: a comparative analysis. Front Plant Sci 2023; 14:1211490. [PMID: 37767291 PMCID: PMC10520280 DOI: 10.3389/fpls.2023.1211490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/24/2023] [Accepted: 08/15/2023] [Indexed: 09/29/2023]
Abstract
The limited availability of information on Chilean native flora has resulted in a lack of knowledge among the general public, and the classification of these plants poses challenges without extensive expertise. This study evaluates the performance of several Deep Learning (DL) models, namely InceptionV3, VGG19, ResNet152, and MobileNetV2, in classifying images of Chilean native flora. The models were pre-trained on ImageNet. A dataset containing 500 images for each of 10 classes of native Chilean flowers was curated, for a total of 5,000 images. The DL models were applied to this dataset, and their performance was compared based on accuracy and other relevant metrics. The findings highlight the potential of DL models to accurately classify images of Chilean native flora. The results contribute to enhancing the understanding of these plant species and fostering awareness among the general public. Further improvements and applications of DL in ecology and biodiversity research are discussed.
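A typical setup for this kind of comparison is to take each ImageNet-pretrained backbone and swap its classification head for a 10-class one. A minimal torchvision sketch with MobileNetV2 is shown below; whether the authors freeze the backbone features, as done here, is an assumption, and the other backbones follow the same pattern.

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained MobileNetV2 and adapt it to 10 flora classes.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)

for p in model.features.parameters():
    p.requires_grad = False            # freeze pretrained features (assumption)

# Replace the final linear layer with a fresh 10-class head.
model.classifier[1] = nn.Linear(model.last_channel, 10)
```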
Collapse
Affiliation(s)
- Carola Figueroa-Flores
- Department of Computer Science and Information Technology, Universidad del Bío-Bío, Chillán, Chile
| | - Pablo San-Martin
- School of Computer and Information Engineering, Universidad del Bío-Bío, Chillán, Chile
| |
Collapse
|
45
|
Li C, Chen Z, Jing W, Wu X, Zhao Y. A lightweight method for maize seed defects identification based on Convolutional Block Attention Module. Front Plant Sci 2023; 14:1153226. [PMID: 37731985 PMCID: PMC10508185 DOI: 10.3389/fpls.2023.1153226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/29/2023] [Accepted: 08/15/2023] [Indexed: 09/22/2023]
Abstract
Maize, one of the world's main food resources, is widely cultivated and planted all over the world. Accurately identifying defects in maize seeds is of great significance for both food safety and agricultural production. In recent years, methods based on deep learning have performed well in image processing, but their potential for identifying maize seed defects has not been fully realized. Therefore, in this paper, a lightweight and effective network for maize seed defect identification is proposed. In the proposed network, the Convolutional Block Attention Module (CBAM) is integrated into a pretrained MobileNetV3 network to extract important features in the channel and spatial domains. In this way, the network focuses on useful feature information, making it easier to converge. To verify the effectiveness of the proposed network, a total of 12,784 images were collected and 7 defect types were defined. Compared with other popular pretrained models, the proposed network converges in the fewest iterations and achieves a true positive rate of 93.14% and a false positive rate of 1.14%.
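CBAM itself is a standard, self-contained module: channel attention computed from spatially pooled descriptors passed through a shared MLP, followed by spatial attention computed from channel-pooled maps. A minimal PyTorch sketch follows; where exactly the authors insert it inside MobileNetV3 is not specified in the abstract.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel then spatial attention."""
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.mlp = nn.Sequential(      # shared MLP for channel attention
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size,
                                 padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # Channel attention: pool over space, weigh channels.
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # Spatial attention: pool over channels, weigh locations.
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

x = torch.randn(2, 64, 32, 32)
print(CBAM(64)(x).shape)   # torch.Size([2, 64, 32, 32])
```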
Collapse
Affiliation(s)
- Chao Li
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
| | - Zhenyu Chen
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
| | - Weipeng Jing
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
| | - Xiaoqiang Wu
- School of Mechanical Engineering, Inner Mongolia University for Nationalities, Tongliao, Inner Mongolia Autonomous Region, China
| | - Yonghui Zhao
- College of Computer and Control Engineering, Northeast Forestry University, Harbin, China
| |
Collapse
|
46
|
Sanaullah, Koravuna S, Rückert U, Jungeblut T. Evaluation of Spiking Neural Nets-Based Image Classification Using the Runtime Simulator RAVSim. Int J Neural Syst 2023; 33:2350044. [PMID: 37604777 DOI: 10.1142/s0129065723500442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/23/2023]
Abstract
Spiking Neural Networks (SNNs) help achieve brain-like efficiency and functionality by building neurons and synapses that mimic the human brain's transmission of electrical signals. However, optimal SNN implementation requires a precise balance of parametric values. To design such ubiquitous neural networks, a graphical tool for visualizing, analyzing, and explaining the internal behavior of spikes is crucial. Although some popular SNN simulators are available, these tools do not allow users to interact with the neural network during simulation. To this end, we have introduced the first runtime interactive simulator, called the Runtime Analyzing and Visualization Simulator (RAVSim), developed to analyze and dynamically visualize the behavior of SNNs, allowing end-users to interact, observe output concentration reactions, and make changes directly during the simulation. In this paper, we present RAVSim with the current implementation of runtime interaction using the LIF neural model with different connectivity schemes, an image classification model using SNNs, and a dataset creation feature. Our main objective is to investigate binary classification of RGB images using SNNs. We created a feed-forward network using the LIF neural model for an image classification algorithm and evaluated it using RAVSim. The algorithm classifies faces with and without masks, achieving an accuracy of 91.8% using 1,000 neurons in a hidden layer, an MSE of 0.0758, and an execution time of ~10 min on the CPU. The experimental results show that using RAVSim not only increases network design speed but also accelerates user learning.
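The LIF model underlying the simulator has a compact discrete-time form: the membrane potential leaks toward rest, integrates the input current, and fires and resets when it crosses a threshold. A minimal NumPy sketch follows, with illustrative parameter values rather than the ones used in RAVSim.

```python
import numpy as np

def lif_simulate(input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Discrete-time leaky integrate-and-fire neuron (illustrative parameters)."""
    v, spikes, trace = v_reset, [], []
    for i_t in input_current:
        v += (-(v - v_reset) + i_t) * (dt / tau)   # leak toward rest + integrate
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset                            # reset after a spike
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)

# Constant drive strong enough to make the neuron spike periodically.
spk, _ = lif_simulate(np.full(200, 1.5))
print("spike count:", spk.sum())
```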
Collapse
Affiliation(s)
- Sanaullah
- Department of Engineering and Mathematics, Bielefeld University of Applied Science, Bielefeld, Germany
| | - Shamini Koravuna
- Department of Cognitive Interaction Technology Center, Bielefeld University, Bielefeld, Germany
| | - Ulrich Rückert
- Department of Cognitive Interaction Technology Center, Bielefeld University, Bielefeld, Germany
| | - Thorsten Jungeblut
- Department of Engineering and Mathematics, Bielefeld University of Applied Science, Bielefeld, Germany
| |
Collapse
|
47
|
Li Y, Huang WC, Song PH. A face image classification method of autistic children based on the two-phase transfer learning. Front Psychol 2023; 14:1226470. [PMID: 37720633 PMCID: PMC10501480 DOI: 10.3389/fpsyg.2023.1226470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2023] [Accepted: 07/17/2023] [Indexed: 09/19/2023] Open
Abstract
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that seriously affects children's normal lives. Screening potentially autistic children before professional diagnosis supports early detection and early intervention. Autistic children have some facial features that differ from those of non-autistic children, so potentially autistic children can be screened by capturing children's facial images and analyzing them on a mobile phone. The area under the curve (AUC) is a more robust metric than accuracy for evaluating the performance of a two-category classification model, and the AUC of the mobile-friendly deep learning models in existing research can be further improved. Moreover, the input image sizes used previously are large and ill-suited to a mobile phone. This research proposes a deep transfer learning method that can use smaller images and improve on the AUC of existing studies. The proposed method combines a two-phase transfer learning mode with a multi-classifier integration mode. For MobileNetV2 and MobileNetV3-Large, which are suitable for mobile phones, the two-phase transfer learning mode is used to improve their classification performance, and the multi-classifier integration mode then integrates them to further improve performance. A multi-classifier integrating calculation method is also proposed to compute the final classification results from the outputs of the participating models. The experimental results show that, compared with one-phase transfer learning, two-phase transfer learning significantly improves the classification performance of MobileNetV2 and MobileNetV3-Large, and the integrated classifier outperforms any of its participating classifiers. The accuracy of the integrated classifier in this research is 90.5%, and its AUC is 96.32%, which is 3.51% greater than the AUC (92.81%) reported in previous studies.
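The multi-classifier integration step can be illustrated with the common soft-voting scheme, averaging the class probabilities of the participating models. The paper defines its own integrating calculation, so the sketch below is a generic stand-in rather than the authors' method.

```python
import torch

def integrate_classifiers(prob_list, weights=None):
    """Soft voting: weighted average of per-model class probabilities."""
    probs = torch.stack(prob_list)                 # (n_models, batch, classes)
    if weights is None:
        weights = torch.ones(len(prob_list)) / len(prob_list)
    return (weights.view(-1, 1, 1) * probs).sum(dim=0)

# Hypothetical outputs of two fine-tuned models on a batch of 4 face images.
p_v2 = torch.softmax(torch.randn(4, 2), dim=1)   # MobileNetV2 probabilities
p_v3 = torch.softmax(torch.randn(4, 2), dim=1)   # MobileNetV3-Large probabilities
print(integrate_classifiers([p_v2, p_v3]).argmax(dim=1))
```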
Collapse
Affiliation(s)
- Ying Li
- Guangxi Key Laboratory of Human-machine Interaction and Intelligent Decision, School of Logistics Management and Engineering, Nanning Normal University, Nanning, China
| | - Wen-Cong Huang
- Department of Sports and Health, Guangxi College for Preschool Education, Nanning, China
| | - Pei-Hua Song
- Guangxi Key Laboratory of Human-machine Interaction and Intelligent Decision, School of Logistics Management and Engineering, Nanning Normal University, Nanning, China
| |
Collapse
|
48
|
Baena E, Fortes S, Muro F, Baena C, Barco R. Beyond REM: A New Approach to the Use of Image Classifiers for the Management of 6G Networks. Sensors (Basel) 2023; 23:7494. [PMID: 37687951 PMCID: PMC10490823 DOI: 10.3390/s23177494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/28/2023] [Revised: 08/05/2023] [Accepted: 08/15/2023] [Indexed: 09/10/2023]
Abstract
The management of cellular networks, particularly in the rapid advance toward 6G, presents considerable challenges due to the highly dynamic radio environment. Traditional tools such as Radio Environment Maps (REMs) have proven inadequate for tracking real-time network changes, underlining the need for more sophisticated solutions. In response to these challenges, this work introduces a novel approach that harnesses the power of state-of-the-art image classifiers for network management. The method involves the generation of Network Synthetic Images (NSIs), enriched heat maps that precisely reflect varying cellular network operating states. Created from user location traces linked with Key Performance Indicators (KPIs), NSIs are designed to meet the intricate demands of 6G networks. This research provides a comprehensive analysis of the diverse factors that could impact the successful application of this methodology in 6G. The results of this investigation, coupled with a comparative assessment against traditional REM usage, demonstrate the superior performance of the method. Additionally, a case study involving an automatic network diagnosis scenario validates the effectiveness of the approach. The findings reveal that a generic Convolutional Neural Network (CNN), one of the most powerful tools among modern image classifiers, delivers enhanced performance even with reduced positioning-accuracy requirements. This contributes significantly to the real-time, robust management of cellular networks in the transition to the 6G era.
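The NSI construction can be approximated as rasterizing KPI-annotated location traces onto a grid, averaging the KPI within each cell to form a heat-map channel for the CNN. A NumPy sketch under that assumption follows; the paper's exact NSI encoding may differ.

```python
import numpy as np

def build_nsi(x, y, kpi, grid=(64, 64), area=((0.0, 1.0), (0.0, 1.0))):
    """Rasterize user location traces into a KPI-weighted heat map (mean per cell)."""
    kpi_sum, _, _ = np.histogram2d(x, y, bins=grid, range=area, weights=kpi)
    counts, _, _ = np.histogram2d(x, y, bins=grid, range=area)
    with np.errstate(invalid="ignore"):
        return np.where(counts > 0, kpi_sum / counts, 0.0)

# Hypothetical traces: 1,000 user positions with an SINR-like KPI that
# peaks near the center of the area, plus measurement noise.
rng = np.random.default_rng(0)
x, y = rng.random(1000), rng.random(1000)
kpi = 20 * np.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2) / 0.1) + rng.normal(0, 1, 1000)
print(build_nsi(x, y, kpi).shape)   # (64, 64), ready to feed a CNN classifier
```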
Collapse
Affiliation(s)
- Eduardo Baena
- Instituto de Telecomunicación (TELMA), CEI Andalucía TECH, E.T.S. Ingeniería de Telecomunicación, Universidad de Málaga, 29010 Málaga, Spain
| | | | | | | | | |
Collapse
|
49
|
Fan X, Zhang H, Zhang Y. IDSNN: Towards High-Performance and Low-Latency SNN Training via Initialization and Distillation. Biomimetics (Basel) 2023; 8:375. [PMID: 37622980 PMCID: PMC10452895 DOI: 10.3390/biomimetics8040375] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Revised: 08/03/2023] [Accepted: 08/15/2023] [Indexed: 08/26/2023] Open
Abstract
Spiking neural networks (SNNs) are widely recognized for their biomimetic and efficient computing features. They use spikes to encode and transmit information. Despite their many advantages, SNNs suffer from low accuracy and high inference latency, caused, respectively, by direct training and by conversion from artificial neural network (ANN) training methods. To address these limitations, we propose a novel training pipeline (called IDSNN) based on parameter initialization and knowledge distillation, using an ANN as both a parameter source and a teacher. IDSNN maximizes the knowledge extracted from ANNs and achieves competitive top-1 accuracy on CIFAR10 (94.22%) and CIFAR100 (75.41%) with low latency. More importantly, it converges 14x faster than directly trained SNNs under limited training resources, which demonstrates its practical value in applications.
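The distillation half of such a pipeline is typically the standard Hinton-style objective, blending hard-label cross-entropy with a temperature-softened KL term against the ANN teacher. A generic PyTorch sketch under that assumption follows; IDSNN's precise loss and initialization scheme are not given in the abstract.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Generic knowledge-distillation objective (Hinton-style).

    Blends cross-entropy on the hard labels with KL divergence between the
    temperature-softened teacher (ANN) and student (SNN) distributions.
    """
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    return alpha * hard + (1 - alpha) * soft

# Toy batch: 8 samples, 10 classes (e.g. CIFAR10).
s = torch.randn(8, 10, requires_grad=True)   # student (SNN readout) logits
t = torch.randn(8, 10)                       # teacher (ANN) logits
y = torch.randint(0, 10, (8,))
print(distillation_loss(s, t, y))
```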
Collapse
Affiliation(s)
- Xiongfei Fan
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China; (X.F.); (H.Z.)
| | - Hong Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China; (X.F.); (H.Z.)
| | - Yu Zhang
- State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China; (X.F.); (H.Z.)
- Key Laboratory of Collaborative Sensing and Autonomous Unmanned Systems of Zhejiang Province, Hangzhou 310027, China
| |
Collapse
|
50
|
Madusanka N, Jayalath P, Fernando D, Yasakethu L, Lee BI. Impact of H&E Stain Normalization on Deep Learning Models in Cancer Image Classification: Performance, Complexity, and Trade-Offs. Cancers (Basel) 2023; 15:4144. [PMID: 37627172 PMCID: PMC10452714 DOI: 10.3390/cancers15164144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2023] [Revised: 07/28/2023] [Accepted: 08/02/2023] [Indexed: 08/27/2023] Open
Abstract
Accurate classification of cancer images plays a crucial role in diagnosis and treatment planning. Deep learning (DL) models have shown promise in achieving high accuracy, but their performance can be influenced by variations in Hematoxylin and Eosin (H&E) staining techniques. In this study, we investigate the impact of H&E stain normalization on the performance of DL models in cancer image classification. We evaluate the performance of VGG19, VGG16, ResNet50, MobileNet, Xception, and InceptionV3 on a dataset of H&E-stained cancer images. Our findings reveal that while VGG16 exhibits strong performance, VGG19 and ResNet50 demonstrate limitations in this context. Notably, stain normalization techniques significantly improve the performance of less complex models such as MobileNet and Xception, which emerge as competitive alternatives with lower computational complexity and resource requirements. The results highlight the importance of optimizing less complex models through stain normalization to achieve accurate and reliable cancer image classification. This research holds tremendous potential for advancing the development of computationally efficient cancer classification systems, ultimately benefiting cancer diagnosis and treatment.
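As one concrete example of a stain normalization technique, the Reinhard method matches each LAB channel's mean and standard deviation to a reference slide. The sketch below shows that method generically; it is not necessarily the normalization the authors evaluated (Macenko and others are also common).

```python
import numpy as np
from skimage import color

def reinhard_normalize(src_rgb, ref_rgb):
    """Reinhard color normalization: match LAB statistics to a reference.

    One common stain-normalization technique, shown as a generic example.
    Inputs are float RGB images in [0, 1].
    """
    src, ref = color.rgb2lab(src_rgb), color.rgb2lab(ref_rgb)
    for c in range(3):   # shift/scale each LAB channel to the reference stats
        src[..., c] = (src[..., c] - src[..., c].mean()) / (src[..., c].std() + 1e-8)
        src[..., c] = src[..., c] * ref[..., c].std() + ref[..., c].mean()
    return np.clip(color.lab2rgb(src), 0, 1)

# Hypothetical usage with two random "tiles" standing in for H&E images.
rng = np.random.default_rng(0)
a, b = rng.random((64, 64, 3)), rng.random((64, 64, 3))
print(reinhard_normalize(a, b).shape)   # (64, 64, 3)
```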
Collapse
Affiliation(s)
- Nuwan Madusanka
- Digital Healthcare Research Center, Pukyong National University, Busan 48513, Republic of Korea;
| | - Pramudini Jayalath
- Institute of Biochemistry, Faculty of Mathematics and Natural Science, University of Cologne, 50923 Cologne, Germany;
| | - Dileepa Fernando
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore;
| | - Lasith Yasakethu
- Department of Software Engineering, Sri Lanka Technological Campus (SLTC), Padukka 10500, Sri Lanka;
| | - Byeong-Il Lee
- Digital Healthcare Research Center, Pukyong National University, Busan 48513, Republic of Korea;
- Division of Smart Healthcare, College of Information Technology and Convergence, Pukyong National University, Busan 48513, Republic of Korea
- Department of Industry 4.0 Convergence Bionics Engineering, Pukyong National University, Busan 48513, Republic of Korea
| |
Collapse
|