1
Gómez-Martínez V, Chushig-Muzo D, Veierød MB, Granja C, Soguero-Ruiz C. Ensemble feature selection and tabular data augmentation with generative adversarial networks to enhance cutaneous melanoma identification and interpretability. BioData Min 2024; 17:46. [PMID: 39478549 PMCID: PMC11526724 DOI: 10.1186/s13040-024-00397-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Accepted: 10/09/2024] [Indexed: 11/02/2024] [Open access]
Abstract
BACKGROUND: Cutaneous melanoma is the most aggressive form of skin cancer, responsible for most skin cancer-related deaths. Recent advances in artificial intelligence, together with the availability of public dermoscopy image datasets, have made it possible to assist dermatologists in melanoma identification. While image feature extraction holds potential for melanoma detection, it often leads to high-dimensional data. Furthermore, most image datasets suffer from class imbalance, where a few classes have numerous samples whereas others are under-represented.
METHODS: In this paper, we propose combining ensemble feature selection (FS) methods and data augmentation with conditional tabular generative adversarial networks (CTGAN) to enhance melanoma identification in imbalanced datasets. We employed dermoscopy images from two public datasets, PH2 and Derm7pt, which contain melanoma and not-melanoma lesions. To capture intrinsic information from skin lesions, we conducted two feature extraction (FE) approaches: handcrafted and embedding features. For the former, color, geometric, and first-, second-, and higher-order texture features were extracted, whereas for the latter, embeddings were obtained using ResNet-based models. To alleviate the high dimensionality resulting from FE, ensemble FS with filter methods was used and evaluated. For data augmentation, we conducted a progressive analysis of the imbalance ratio (IR), which is related to the number of synthetic samples created, and evaluated its impact on the predictive results. To gain interpretability of the predictive models, we used SHAP, bootstrap resampling statistical tests, and UMAP visualizations.
RESULTS: The combination of ensemble FS, CTGAN, and linear models achieved the best predictive results, with AUCROC values of 87% (support vector machine, IR=0.9) and 76% (LASSO, IR=1.0) for PH2 and Derm7pt, respectively. We also identified that melanoma lesions were mainly characterized by features related to color, while not-melanoma lesions were characterized by texture features.
CONCLUSIONS: Our results demonstrate the effectiveness of ensemble FS and synthetic data in the development of models that accurately identify melanoma. This research advances skin lesion analysis, contributing to both melanoma detection and the interpretation of the main features for its identification.
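The progressive imbalance-ratio analysis described in this abstract can be illustrated with a short calculation. The sketch below assumes that IR is defined as the minority-to-majority class ratio (so IR=1.0 means balanced classes), which the abstract does not state explicitly. The function names and the class counts are hypothetical and chosen only for illustration. It shows how many synthetic minority samples a generator such as CTGAN would need to produce to reach a given target IR; it does not train or call any generative model.

```python
import math


def synthetic_samples_needed(n_minority: int, n_majority: int, target_ir: float) -> int:
    """Number of synthetic minority samples needed to reach target_ir.

    Assumption: IR = n_minority / n_majority, so 0 < IR <= 1 and IR = 1.0
    corresponds to perfectly balanced classes.
    """
    if n_minority <= 0 or n_majority <= 0:
        raise ValueError("class counts must be positive")
    if not 0.0 < target_ir <= 1.0:
        raise ValueError("target_ir must be in (0, 1]")
    # Small tolerance guards against floating-point noise before rounding up.
    required_minority = math.ceil(target_ir * n_majority - 1e-9)
    # If the data already meet the target, nothing needs to be generated.
    return max(0, required_minority - n_minority)


def progressive_schedule(n_minority, n_majority, targets):
    """Map each target IR to the number of synthetic samples it requires."""
    return {ir: synthetic_samples_needed(n_minority, n_majority, ir) for ir in targets}


if __name__ == "__main__":
    # Hypothetical counts: 40 minority (melanoma) and 160 majority lesions.
    for ir, n_new in progressive_schedule(40, 160, [0.25, 0.5, 0.9, 1.0]).items():
        print(f"target IR={ir}: generate {n_new} synthetic samples")
```

Sweeping the target IR in this way makes it possible to retrain and evaluate a classifier at each augmentation level, which is the kind of progressive comparison the abstract reports.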
Affiliation(s)
- Vanesa Gómez-Martínez
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, 28943, Spain
- David Chushig-Muzo
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, 28943, Spain
- Marit B Veierød
  - Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, Institute of Basic Medical Sciences, University of Oslo, Oslo, Norway
- Conceição Granja
  - Norwegian Centre for E-health Research, University Hospital of North Norway, Tromsø, 9019, Norway
- Cristina Soguero-Ruiz
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid, 28943, Spain
2
Lyakhova UA, Lyakhov PA. Systematic review of approaches to detection and classification of skin cancer using artificial intelligence: Development and prospects. Comput Biol Med 2024; 178:108742. [PMID: 38875908 DOI: 10.1016/j.compbiomed.2024.108742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 06/03/2024] [Accepted: 06/08/2024] [Indexed: 06/16/2024]
Abstract
In recent years, there has been a significant improvement in the accuracy of the classification of pigmented skin lesions using artificial intelligence algorithms. Intelligent analysis and classification systems are significantly superior to visual diagnostic methods used by dermatologists and oncologists. However, the application of such systems in clinical practice is severely limited due to a lack of generalizability and risks of potential misclassification. Successful implementation of artificial intelligence-based tools into clinicopathological practice requires a comprehensive study of the effectiveness and performance of existing models, as well as further promising areas for potential research development. The purpose of this systematic review is to investigate and evaluate the accuracy of artificial intelligence technologies for detecting malignant forms of pigmented skin lesions. For the study, 10,589 scientific research and review articles were selected from electronic scientific publishers, of which 171 articles were included in the presented systematic review. All selected scientific articles are distributed according to the proposed neural network algorithms from machine learning to multimodal intelligent architectures and are described in the corresponding sections of the manuscript. This research aims to explore automated skin cancer recognition systems, from simple machine learning algorithms to multimodal ensemble systems based on advanced encoder-decoder models, visual transformers (ViT), and generative and spiking neural networks. In addition, as a result of the analysis, future directions of research, prospects, and potential for further development of automated neural network systems for classifying pigmented skin lesions are discussed.
Affiliation(s)
- U A Lyakhova
  - Department of Mathematical Modeling, North-Caucasus Federal University, 355017, Stavropol, Russia
- P A Lyakhov
  - Department of Mathematical Modeling, North-Caucasus Federal University, 355017, Stavropol, Russia; North-Caucasus Center for Mathematical Research, North-Caucasus Federal University, 355017, Stavropol, Russia
3
Debelee TG. Skin Lesion Classification and Detection Using Machine Learning Techniques: A Systematic Review. Diagnostics (Basel) 2023; 13:3147. [PMID: 37835889 PMCID: PMC10572538 DOI: 10.3390/diagnostics13193147] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/30/2023] [Revised: 09/22/2023] [Accepted: 09/24/2023] [Indexed: 10/15/2023] [Open access]
Abstract
Skin lesions are essential for the early detection and management of a number of dermatological disorders. Learning-based methods for skin lesion analysis have drawn much attention lately because of improvements in computer vision and machine learning techniques. A review of the most-recent methods for skin lesion classification, segmentation, and detection is presented in this survey paper. The significance of skin lesion analysis in healthcare and the difficulties of physical inspection are discussed in this survey paper. The review of state-of-the-art papers targeting skin lesion classification is then covered in depth with the goal of correctly identifying the type of skin lesion from dermoscopic, macroscopic, and other lesion image formats. The contribution and limitations of various techniques used in the selected study papers, including deep learning architectures and conventional machine learning methods, are examined. The survey then looks into study papers focused on skin lesion segmentation and detection techniques that aimed to identify the precise borders of skin lesions and classify them accordingly. These techniques make it easier to conduct subsequent analyses and allow for precise measurements and quantitative evaluations. The survey paper discusses well-known segmentation algorithms, including deep-learning-based, graph-based, and region-based ones. The difficulties, datasets, and evaluation metrics particular to skin lesion segmentation are also discussed. Throughout the survey, notable datasets, benchmark challenges, and evaluation metrics relevant to skin lesion analysis are highlighted, providing a comprehensive overview of the field. The paper concludes with a summary of the major trends, challenges, and potential future directions in skin lesion classification, segmentation, and detection, aiming to inspire further advancements in this critical domain of dermatological research.
Affiliation(s)
- Taye Girma Debelee
  - Ethiopian Artificial Intelligence Institute, Addis Ababa 40782, Ethiopia
  - Department of Electrical and Computer Engineering, Addis Ababa Science and Technology University, Addis Ababa 16417, Ethiopia
4
Zhang Q, Sheng J, Zhang Q, Wang L, Yang Z, Xin Y. Enhanced Harris hawks optimization-based fuzzy k-nearest neighbor algorithm for diagnosis of Alzheimer's disease. Comput Biol Med 2023; 165:107392. [PMID: 37669585 DOI: 10.1016/j.compbiomed.2023.107392] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/23/2023] [Revised: 07/30/2023] [Accepted: 08/25/2023] [Indexed: 09/07/2023]
Abstract
In order to stop deterioration and give patients with Alzheimer's disease (AD) early therapy, it is crucial to correctly diagnose AD and its early stage, mild cognitive impairment (MCI). A framework for diagnosing AD is presented in this paper, which includes magnetic resonance imaging (MRI) image preprocessing, feature extraction, and the Fuzzy k-nearest neighbor algorithm (FKNN) model. In particular, the framework's novelty lies in the use of an improved Harris Hawks Optimization (HHO) algorithm named SSFSHHO, which integrates the Sobol sequence and Stochastic Fractal Search (SFS) mechanisms for optimizing the parameters of FKNN. The HHO method improves the quality of the initial population overall by incorporating the Sobol sequence, and the SFS mechanism increases the algorithm's capacity to get out of the local optimum solution. Comparisons with other classical meta-heuristic algorithms, state-of-the-art HHO variants in low and high dimensions, and enhanced meta-heuristic algorithms on 30 typical IEEE CEC2014 benchmark test problems show that the overall performance of SSFSHHO is significantly better than other comparative algorithms. Moreover, the created framework based on the SSFSHHO-FKNN model is employed to classify AD and MCI using MRI scans from the ADNI dataset, achieving high classification performance for 6 representative cases. Experimental findings indicate that the proposed algorithm performs better than a number of high-performance optimization algorithms and classical machine learning algorithms, thus offering a promising approach for AD classification. Additionally, the proposed strategy can successfully identify relevant features and enhance classification performance for AD diagnosis.
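To make the classifier at the core of this framework concrete, the following is a minimal sketch of the classic fuzzy k-nearest neighbor membership rule with crisp training labels. It is not the authors' SSFSHHO-FKNN implementation: the optimizer is not reproduced, the function names and toy data are hypothetical, and treating the neighborhood size `k` and the fuzzy strength `m` as the tuned parameters is an assumption, since the abstract says only that the parameters of FKNN are optimized.

```python
import math
from collections import defaultdict


def fuzzy_knn_memberships(train_x, train_y, query, k=3, m=2.0, eps=1e-12):
    """Return {label: membership} for `query`; memberships sum to 1.

    Each of the k nearest neighbors votes for its own label with weight
    1 / d**(2 / (m - 1)), where d is its Euclidean distance to the query.
    """
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's distance with its label, then keep the k closest.
    neighbors = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    for distance, label in neighbors:
        # eps avoids division by zero when the query coincides with a neighbor.
        weights[label] += 1.0 / (max(distance, eps) ** exponent)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


def fuzzy_knn_predict(train_x, train_y, query, k=3, m=2.0):
    """Predict the label with the highest fuzzy membership."""
    memberships = fuzzy_knn_memberships(train_x, train_y, query, k=k, m=m)
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors with two illustrative class labels.
    xs = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
    ys = ["CN", "CN", "AD", "AD"]
    print(fuzzy_knn_memberships(xs, ys, (0.5, 0.5), k=3, m=2.0))
    print(fuzzy_knn_predict(xs, ys, (0.5, 0.5), k=3, m=2.0))
```

A metaheuristic such as the one described in the abstract would search over candidate `(k, m)` pairs and score each by validation performance; any search routine could be substituted for that step in this sketch.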
Affiliation(s)
- Qian Zhang
  - College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China; School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, Zhejiang, 325035, China
- Jinhua Sheng
  - College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China
- Qiao Zhang
  - Beijing Hospital, Beijing, 100730, China; National Center of Gerontology, Beijing, 100730, China; Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, 100730, China
- Luyun Wang
  - College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China
- Ze Yang
  - College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China
- Yu Xin
  - College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang, 310018, China
5
Wang L, Jiang Z, Shao A, Liu Z, Gu R, Ge R, Jia G, Wang Y, Ye J. Self-supervised learning mechanism for identification of eyelid malignant melanoma in pathologic slides with limited annotation. Front Med (Lausanne) 2022; 9:976467. [PMID: 36237543 PMCID: PMC9550873 DOI: 10.3389/fmed.2022.976467] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/23/2022] [Accepted: 08/26/2022] [Indexed: 11/13/2022] [Open access]
Abstract
Purpose: The lack of finely annotated pathologic data has limited the application of deep learning systems (DLS) to the automated interpretation of pathologic slides. Therefore, this study develops a robust self-supervised learning (SSL) pathology diagnostic system to automatically detect malignant melanoma (MM) in the eyelid with limited annotation.
Design: Development of a self-supervised diagnosis pipeline based on a public dataset, then refined and tested on a private, real-world clinical dataset.
Subjects: (A) PatchCamelyon (PCam), a publicly accessible dataset for patch-level classification of histopathologic images. (B) The Second Affiliated Hospital, Zhejiang University School of Medicine (ZJU-2) dataset: 524,307 patches (small sections cut from pathologic slide images) from 192 H&E-stained whole-slide images (WSIs); only 72 WSIs were labeled by pathologists.
Methods: PatchCamelyon was used to select a convolutional neural network (CNN) as the backbone for our SSL-based model. This model was further developed on the ZJU-2 dataset for patch-level classification with both labeled and unlabeled images to test its diagnostic ability. The algorithm then aggregated the patch-level predictions to generate WSI-level classification results using a random forest. A heatmap was computed to visualize the decision-making process.
Main outcome measure(s): The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity were used to evaluate the performance of the algorithm in identifying MM.
Results: ResNet50 was selected as the backbone of the SSL-based model using the PCam dataset. This algorithm then achieved an AUC of 0.981, with an accuracy, sensitivity, and specificity of 90.9%, 85.2%, and 96.3%, respectively, for the patch-level classification of the ZJU-2 dataset. For WSI-level diagnosis, the AUC, accuracy, sensitivity, and specificity were 0.974, 93.8%, 75.0%, and 100%, respectively. For every WSI, a heatmap was generated based on the malignancy probability.
Conclusion: Our diagnostic system, which is based on SSL and trained with a dataset of limited annotation, can automatically identify MM in pathologic slides and highlight MM areas in WSIs with a probabilistic heatmap. In addition, this labor-saving and cost-efficient model has the potential to be refined to help diagnose other ophthalmic and non-ophthalmic malignancies.
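The outcome measures named in this abstract follow directly from confusion-matrix counts. The sketch below is illustrative only: the function name is hypothetical, and the counts in the usage example are invented. They were picked because they are arithmetically consistent with the reported WSI-level accuracy, sensitivity, and specificity, but the abstract does not give the actual confusion matrix or test-set size.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn count malignant cases predicted correctly/incorrectly;
    tn/fp count benign cases predicted correctly/incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
    }


if __name__ == "__main__":
    # Hypothetical WSI-level counts, not taken from the paper.
    metrics = binary_metrics(tp=3, fn=1, tn=12, fp=0)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

With few positive slides, a single missed case moves sensitivity by a large step, which is worth keeping in mind when comparing the patch-level and WSI-level figures.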
Affiliation(s)
- Linyan Wang
  - Department of Ophthalmology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zijing Jiang
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
- An Shao
  - Department of Ophthalmology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Zhengyun Liu
  - Department of Pathology, Lishui Municipal Central Hospital, Lishui, China
- Renshu Gu
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
- Ruiquan Ge
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
- Gangyong Jia
  - School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
- Yaqi Wang
  - College of Media Engineering, The Communication University of Zhejiang, Hangzhou, China
- Juan Ye
  - Department of Ophthalmology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - Correspondence: Juan Ye