1. Li Z. Image analysis and teaching strategy optimization of folk dance training based on the deep neural network. Sci Rep 2024; 14:10909. PMID: 38740903. DOI: 10.1038/s41598-024-61134-y.
Abstract
To improve the recognition performance of a folk dance image recognition model and to inform teachers' teaching strategies, this study introduces a deep neural network (DNN) to optimize folk dance training image recognition and proposes a corresponding teaching strategy optimization scheme based on the experimental results. First, the DNN's image preprocessing and feature extraction are optimized. Second, classification and target detection models are established to analyze folk dance training images, using the C-dance dataset for experiments. Finally, the results are compared with those of Naive Bayes, K-nearest neighbor, decision tree, support vector machine, and logistic regression models. The optimized classification model shows a significant improvement in classification accuracy across action complexity, dance types, movement speed, dance styles, body dynamics, and rhythm: accuracy, precision, recall, and F1 score increase by approximately 14.7%, 11.8%, 13.2%, and 17.4%, respectively. Under varying training images, viewpoint changes, lighting conditions, and noise interference, the optimized model demonstrates substantially higher recognition accuracy and robustness. These findings suggest that, compared with traditional models, the optimized model better identifies various dances and movements, improving the accuracy and stability of classification. Based on the experimental results, strategies for optimizing real-time feedback and assessment mechanisms in folk dance teaching, as well as for designing personalized learning paths, are proposed. This study therefore has potential applications in folk dance education, promoting its development and innovation.
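As context for the baseline comparison described above, the following is a minimal, hypothetical sketch of scoring the five classical baselines (Naive Bayes, K-nearest neighbor, decision tree, SVM, logistic regression) with the four reported metrics; the feature pipeline, class count, and data are placeholder assumptions, not the paper's C-dance setup.

```python
# Sketch: compare classical baselines on placeholder image features.
# The random features/labels stand in for frames from a dance dataset.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 128))   # placeholder features extracted from frames
y = rng.integers(0, 6, size=600)  # six hypothetical dance classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
baselines = {
    "naive_bayes": GaussianNB(),
    "knn": KNeighborsClassifier(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": SVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
for name, clf in baselines.items():
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    p, r, f1, _ = precision_recall_fscore_support(
        y_te, pred, average="macro", zero_division=0)
    print(f"{name}: acc={accuracy_score(y_te, pred):.3f} "
          f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```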
Affiliation(s)
- Zhou Li: Art College of Shaanxi University of Technology, Hanzhong 723001, Shaanxi, China; Universidad Católica San Antonio de Murcia, 30335 Murcia, Spain.
2. Ayana G, Barki H, Choe SW. Pathological Insights: Enhanced Vision Transformers for the Early Detection of Colorectal Cancer. Cancers (Basel) 2024; 16:1441. PMID: 38611117. PMCID: PMC11010958. DOI: 10.3390/cancers16071441.
Abstract
Endoscopic pathological findings of the gastrointestinal tract are crucial for the early diagnosis of colorectal cancer (CRC). Previous deep learning work aimed at improving CRC detection performance and reducing subjective analysis errors has been limited to polyp segmentation: pathological findings were not considered, and only convolutional neural networks (CNNs), which cannot capture global image feature information, were utilized. This work introduces a novel vision transformer (ViT)-based approach for early CRC detection. Its core components are ViTCol, a boosted vision transformer for classifying endoscopic pathological findings, and PUTS, a vision transformer-based model for polyp segmentation. Results demonstrate the superiority of this vision transformer-based CRC detection method over existing CNN and vision transformer models. ViTCol exhibited outstanding performance in classifying pathological findings, with an area under the receiver operating characteristic curve (AUC) of 0.9999 ± 0.001 on the Kvasir dataset. PUTS provided outstanding polyp segmentation results, with mean intersection over union (mIoU) of 0.8673 and 0.9092 on the Kvasir-SEG and CVC-Clinic datasets, respectively. This work underscores the value of spatial transformers in localizing input images; they can be seamlessly integrated into the main vision transformer network, enhancing the automated identification of critical image features for early CRC detection.
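For readers unfamiliar with the segmentation metric reported above, the following sketch implements mean intersection over union (mIoU) on synthetic masks; it illustrates the metric only, not the PUTS model or the Kvasir-SEG evaluation code.

```python
# Sketch: mean intersection-over-union (mIoU) for segmentation masks.
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int = 2) -> float:
    """Average IoU across classes present in the union (binary by default)."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

rng = np.random.default_rng(0)
pred = rng.integers(0, 2, size=(256, 256))    # placeholder predicted mask
target = rng.integers(0, 2, size=(256, 256))  # placeholder ground truth
print(f"mIoU: {mean_iou(pred, target):.4f}")
```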
Affiliation(s)
- Gelan Ayana: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Republic of Korea; School of Biomedical Engineering, Jimma University, Jimma 378, Ethiopia
- Hika Barki: Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, Republic of Korea
- Se-woon Choe: Department of Medical IT Convergence Engineering and Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Republic of Korea; Emerging Pathogens Institute, University of Florida, Gainesville, FL 32608, USA
3. Ayana G, Lee E, Choe SW. Vision Transformers for Breast Cancer Human Epidermal Growth Factor Receptor 2 Expression Staging without Immunohistochemical Staining. Am J Pathol 2024; 194:402-414. PMID: 38096984. DOI: 10.1016/j.ajpath.2023.11.015.
Abstract
Accurate staging of human epidermal growth factor receptor 2 (HER2) expression is vital for evaluating breast cancer treatment efficacy, but it typically involves costly and complex immunohistochemical staining in addition to hematoxylin and eosin staining. This work presents customized vision transformers for staging HER2 expression in breast cancer using only hematoxylin and eosin-stained images. The proposed algorithm comprises three modules: a localization module that weakly localizes critical image features using spatial transformers, an attention module for global learning via vision transformers, and a loss module that calculates an ordinal loss to score a prediction's proximity to the true HER2 expression level. Results, reported with 95% CIs over fivefold cross-validation, show the approach's success in HER2 expression staging: area under the receiver operating characteristic curve, 0.9202 ± 0.01; precision, 0.922 ± 0.01; sensitivity, 0.876 ± 0.01; and specificity, 0.959 ± 0.02. The approach significantly outperformed conventional vision transformer models and state-of-the-art convolutional neural network models (P < 0.001) and surpassed existing methods on an independent test data set. This work can aid HER2 expression staging in breast cancer treatment while circumventing the costly and time-consuming immunohistochemical staining procedure, thereby addressing diagnostic disparities in low-resource settings and low-income countries.
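The ordinal loss named above can be illustrated with a common cumulative binary decomposition; this sketch shows one standard formulation, and the paper's exact loss may differ.

```python
# Sketch: ordinal loss via cumulative binary targets. Level k of K is
# encoded as k leading ones over K-1 thresholds, e.g. level 2 of 4 -> [1,1,0],
# so mistakes farther from the true level incur more penalized bits.
import torch
import torch.nn.functional as F

def ordinal_targets(levels: torch.Tensor, num_levels: int) -> torch.Tensor:
    thresholds = torch.arange(num_levels - 1, device=levels.device)
    return (levels.unsqueeze(1) > thresholds).float()

def ordinal_loss(logits: torch.Tensor, levels: torch.Tensor) -> torch.Tensor:
    targets = ordinal_targets(levels, logits.shape[1] + 1)
    return F.binary_cross_entropy_with_logits(logits, targets)

logits = torch.randn(8, 3)          # 4 HER2 levels -> 3 cumulative logits
levels = torch.randint(0, 4, (8,))  # placeholder ground-truth levels
print(ordinal_loss(logits, levels).item())
```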
Affiliation(s)
- Gelan Ayana: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi, Republic of Korea; School of Biomedical Engineering, Jimma University, Jimma, Ethiopia
- Eonjin Lee: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi, Republic of Korea
- Se-Woon Choe: Department of Medical IT Convergence Engineering and Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi, Republic of Korea
4. Jabeen K, Khan MA, Hameed MA, Alqahtani O, Alouane MTH, Masood A. A novel fusion framework of deep bottleneck residual convolutional neural network for breast cancer classification from mammogram images. Front Oncol 2024; 14:1347856. PMID: 38454931. PMCID: PMC10917916. DOI: 10.3389/fonc.2024.1347856.
Abstract
With over 2.1 million new cases diagnosed annually, breast cancer poses a severe global health burden for women, and early identification is the most practical way to reduce its impact. Numerous studies have developed automated breast cancer (BC) identification methods using different medical imaging modalities, but the precision of each strategy varies with the available resources, the nature of the problem, and the dataset used. We propose a novel deep bottleneck convolutional neural network with a quantum optimization algorithm for breast cancer classification and diagnosis from mammogram images. Two novel deep architectures, a three-residual-block bottleneck and a four-residual-block bottleneck, are proposed with parallel and single paths. Bayesian Optimization (BO) is employed to initialize hyperparameter values and train the architectures on the selected dataset. Deep features are extracted from the global average pooling layer of both models, and a kernel-based canonical correlation analysis and entropy technique is proposed to fuse them. The fused feature set is further refined using quantum generalized normal distribution optimization. The selected features are finally classified using several neural network classifiers, such as bi-layered and wide neural networks. Experiments on the publicly available INbreast mammogram dataset yielded a maximum accuracy of 96.5%, with a sensitivity of 96.45%, precision of 96.5%, F1 score of 96.64%, MCC of 92.97%, and Kappa of 92.97%. The proposed architectures are further utilized to diagnose infected regions, and a detailed comparison with recent techniques shows the framework's higher accuracy and precision.
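As an illustration of the building block named above, here is a generic three-layer bottleneck residual block in PyTorch; the channel widths, reduction factor, and the paper's parallel-path arrangement are assumptions, not the authors' exact design.

```python
# Sketch: a 1x1 -> 3x3 -> 1x1 bottleneck residual block with an identity skip.
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        mid = channels // reduction  # bottleneck width
        self.body = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, mid, 3, padding=1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The identity skip keeps gradients flowing through deep stacks.
        return self.act(x + self.body(x))

x = torch.randn(1, 64, 56, 56)
print(BottleneckBlock(64)(x).shape)  # same shape as the input
```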
Affiliation(s)
- Kiran Jabeen: Department of Computer Science, HITEC University, Taxila, Pakistan
- Muhammad Attique Khan: Department of Computer Science, HITEC University, Taxila, Pakistan; Department of Computer Science and Mathematics, Lebanese American University, Beirut, Lebanon
- Mohamed Abdel Hameed: Department of Computer Science, Faculty of Computers and Information, Luxor University, Luxor, Egypt
- Omar Alqahtani: College of Computer Science, King Khalid University, Abha, Saudi Arabia
- Anum Masood: Department of Physics, Norwegian University of Science and Technology, Trondheim, Norway
5. Mudeng V, Farid MN, Ayana G, Choe SW. Domain and Histopathology Adaptations-Based Classification for Malignancy Grading System. Am J Pathol 2023; 193:2080-2098. PMID: 37673327. DOI: 10.1016/j.ajpath.2023.07.007.
Abstract
Accurate proliferation rate quantification can be used to devise an appropriate treatment for breast cancer. Pathologists obtain grading information from breast tissue biopsy glass slides stained with hematoxylin and eosin, but this manual evaluation can be costly and inconsistent because diagnosis depends on the facility and on the pathologists' insight and experience. A convolutional neural network can act as a computer-based observer to improve clinicians' capacity to grade breast cancer. This study therefore proposes a novel scheme for automatic breast cancer malignancy grading from invasive ductal carcinoma. The proposed classifiers implement multistage transfer learning incorporating domain and histopathologic transformations. Domain adaptation using pretrained models, such as InceptionResNetV2, InceptionV3, NASNet-Large, ResNet50, ResNet101, VGG19, and Xception, was applied to classify the ×40 magnification BreaKHis data set into eight classes. Subsequently, InceptionV3 and Xception, which carry both the domain and histopathology pretrained weights, were determined to be the best performers and were used to categorize the Databiox database into grades 1, 2, or 3. To provide a comprehensive report, this study offers a patchless automated grading system for both magnification-dependent and magnification-independent classification. With an overall accuracy (means ± SD) of 90.17% ± 3.08% to 97.67% ± 1.09% and an F1 score of 0.9013 to 0.9760 for magnification-dependent classification, the classifiers achieved outstanding performance. The proposed approach could be used for breast cancer grading in clinical settings.
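The multistage transfer learning described above can be sketched as two successive re-heading steps on an ImageNet-pretrained backbone; the backbone choice and class counts below are illustrative assumptions, and the training loops are omitted.

```python
# Sketch: two-stage transfer learning. Stage 0: ImageNet (domain) weights.
# Stage 1: adapt the head to an intermediate histopathology task.
# Stage 2: re-head and fine-tune for the final grading task.
import torch.nn as nn
from torchvision import models

def rehead(model: nn.Module, num_classes: int) -> nn.Module:
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model = rehead(model, 8)   # e.g. 8 histopathology classes (BreaKHis-style)
# ... train on the intermediate histopathology dataset here ...
model = rehead(model, 3)   # grades 1-3 on the target grading dataset
# ... fine-tune on the grading dataset here ...
print(model.fc)
```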
Affiliation(s)
- Vicky Mudeng: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi, Republic of Korea; Department of Electrical Engineering, Institut Teknologi Kalimantan, Balikpapan, Indonesia
- Mifta Nur Farid: Department of Electrical Engineering, Institut Teknologi Kalimantan, Balikpapan, Indonesia
- Gelan Ayana: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi, Republic of Korea
- Se-Woon Choe: Department of Medical IT Convergence Engineering and Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi, Republic of Korea
6. Reis HC, Turk V, Khoshelham K, Kaya S. MediNet: transfer learning approach with MediNet medical visual database. Multimed Tools Appl 2023; 82:1-44. PMID: 37362724. PMCID: PMC10025796. DOI: 10.1007/s11042-023-14831-1.
Abstract
The rapid development of machine learning has increased interest in deep learning methods in medical research, where they are used for disease detection and classification in the clinical decision-making process. Training deep neural networks typically requires large labeled datasets; in the medical field, however, the lack of sufficiently large image datasets and the difficulties encountered during data collection are among the main obstacles. In this study, we propose MediNet, a new 10-class visual dataset consisting of Röntgen (X-ray), computed tomography (CT), magnetic resonance imaging (MRI), ultrasound, and histopathological images, with classes such as calcaneal normal, calcaneal tumor, colon benign, colon adenocarcinoma, brain normal, brain tumor, breast benign, breast malignant, chest normal, and chest pneumonia. AlexNet, VGG19-BN, Inception V3, DenseNet 121, ResNet 101, EfficientNet B0, Nested-LSTM + CNN, and the proposed RdiNet deep learning algorithms are used for pre-training and classification with transfer learning, which aims to apply previously learned knowledge to a new task. Seven algorithms were trained on the MediNet dataset, and the resulting models, namely their feature vectors, were recorded. These pre-trained models were then used, via transfer learning, for classification studies on chest X-ray, diabetic retinopathy, and Covid-19 datasets. On the Chest X-Ray Images dataset, the InceptionV3 model achieved 94.84% accuracy in the traditional classification study, which increased to 98.71% after transfer learning was applied. On the Covid-19 dataset, the classification accuracy of the DenseNet121 model rose from 88% before pre-training to 92% after transfer with MediNet. On the diabetic retinopathy dataset, the accuracy of the Nested-LSTM + CNN model rose from 79.35% to 81.52% after transfer with MediNet. Comparison of the experimental results shows that the proposed method produces more successful outcomes.
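A minimal sketch of the record-the-feature-vectors step described above: strip the head of a pretrained backbone and expose its pooled features. The DenseNet121 backbone and random input are stand-ins, not the MediNet training code.

```python
# Sketch: extract a reusable feature vector from a pretrained backbone.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
backbone.classifier = nn.Identity()  # drop the head to expose pooled features
backbone.eval()

with torch.no_grad():
    feats = backbone(torch.randn(1, 3, 224, 224))  # placeholder image tensor
print(feats.shape)  # torch.Size([1, 1024]): feature vector to store and reuse
```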
Affiliation(s)
- Hatice Catal Reis: Department of Geomatics Engineering, Gumushane University, Gumushane 2900, Turkey
- Veysel Turk: Department of Computer Engineering, University of Harran, Sanliurfa, Turkey
- Kourosh Khoshelham: Department of Infrastructure Engineering, The University of Melbourne, Parkville 3052, Australia
- Serhat Kaya: Department of Mining Engineering, Dicle University, Diyarbakir, Turkey
7. Goceri E. Medical image data augmentation: techniques, comparisons and interpretations. Artif Intell Rev 2023; 56:1-45. PMID: 37362888. PMCID: PMC10027281. DOI: 10.1007/s10462-023-10453-z.
Abstract
Designing deep learning based methods with medical images has always been an attractive area of research to assist clinicians in rapid examination and accurate diagnosis. Such methods need large datasets covering all relevant variations during training. However, medical images are always scarce for several reasons: too few patients for some diseases, patients unwilling to allow their images to be used, a lack of medical equipment, or an inability to obtain images that meet the desired criteria. This scarcity leads to biased datasets, overfitting, and inaccurate results. Data augmentation is a common solution, and various augmentation techniques have been applied to different image types in the literature. However, it is not clear which data augmentation technique is most effective for which image type, because the literature handles different diseases, uses different network architectures, and trains and tests these architectures on different numbers of images. Therefore, this work examines the augmentation techniques used to improve deep learning based diagnosis of diseases in different organs (brain, lung, breast, and eye) from different imaging modalities (MR, CT, mammography, and fundoscopy). The most commonly used augmentation methods were also implemented, and their effectiveness in classification with a deep network is discussed based on quantitative performance evaluations. Experiments indicate that augmentation techniques should be chosen carefully according to image type.
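For concreteness, a typical augmentation pipeline of the kind compared in this survey might look like the following torchvision sketch; the specific transforms and parameters are illustrative, not the paper's recommended settings.

```python
# Sketch: a per-epoch augmentation pipeline. Each call yields a slightly
# different view of the same image, which mitigates small-dataset overfitting.
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])

img = Image.new("RGB", (256, 256))  # blank placeholder for a medical image
print(augment(img).shape)           # torch.Size([3, 224, 224])
```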
Affiliation(s)
- Evgin Goceri: Department of Biomedical Engineering, Engineering Faculty, Akdeniz University, Antalya, Turkey
8. Reis HC, Turk V. Transfer Learning Approach and Nucleus Segmentation with MedCLNet Colon Cancer Database. J Digit Imaging 2023; 36:306-325. PMID: 36127531. PMCID: PMC9984669. DOI: 10.1007/s10278-022-00701-z.
Abstract
Machine learning has recently seen wide use in the medical field. In the diagnosis of serious diseases such as cancer, deep learning techniques can reduce the workload of experts and produce quick solutions, and the nuclei found in histopathology images are an essential parameter in disease detection. In this study, nucleus segmentation was performed on the colorectal histology MNIST dataset using graph theory, particle swarm optimization (PSO), watershed, and random walker algorithms. In addition, we present the 10-class MedCLNet visual dataset, consisting of the NCT-CRC-HE-100K, LC25000, and GlaS datasets, which can be used in deep learning transfer learning studies, and we propose a transfer learning technique using the MedCLNet database. Deep neural networks pre-trained with the proposed method were used for classification on the colorectal histology MNIST dataset. DenseNet201, DenseNet169, InceptionResNetV2, InceptionV3, ResNet152V2, ResNet101V2, and Xception deep learning algorithms were used in the transfer learning and classification studies, and the approach was analyzed before and after transfer learning with different methods (DenseNet169 + SVM, DenseNet169 + GRU). On the colorectal histology MNIST dataset, the DenseNet169 model initialized with random weights achieved 94.29% accuracy in the multi-class study, rising to 95.00% after transfer learning was applied. Comparison with the results of empirical studies demonstrated that the proposed method produces satisfactory outcomes. The application is expected to provide a secondary evaluation for physicians in colon cancer detection and segmentation.
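Of the four segmentation algorithms listed above, the watershed approach can be sketched briefly with scikit-image; the synthetic blob image below stands in for histology nuclei, and the parameters are illustrative.

```python
# Sketch: marker-based watershed segmentation of touching blobs ("nuclei").
import numpy as np
from scipy import ndimage as ndi
from skimage.draw import disk
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

image = np.zeros((120, 120), dtype=bool)
for center in [(40, 40), (60, 70), (90, 50)]:  # three fake nuclei
    rr, cc = disk(center, 15, shape=image.shape)
    image[rr, cc] = True

distance = ndi.distance_transform_edt(image)                  # peaks at blob centers
coords = peak_local_max(distance, labels=image, min_distance=10)
markers = np.zeros(image.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)      # one seed per peak
labels = watershed(-distance, markers, mask=image)            # flood from seeds
print(f"nuclei found: {labels.max()}")
```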
Affiliation(s)
- Hatice Catal Reis: Department of Geomatics Engineering, Gumushane University, Gumushane 2900, Turkey
- Veysel Turk: Department of Computer Engineering, University of Harran, Sanliurfa, Turkey
9. Ayana G, Dese K, Dereje Y, Kebede Y, Barki H, Amdissa D, Husen N, Mulugeta F, Habtamu B, Choe SW. Vision-Transformer-Based Transfer Learning for Mammogram Classification. Diagnostics (Basel) 2023; 13:178. PMID: 36672988. PMCID: PMC9857963. DOI: 10.3390/diagnostics13020178.
Abstract
Breast mass identification is a crucial procedure during mammogram-based early breast cancer diagnosis, but it is difficult to determine whether a breast lump is benign or cancerous at early stages. Convolutional neural networks (CNNs) have been used to address this problem and have provided useful advancements. However, CNNs focus only on a certain portion of the mammogram while ignoring the rest, and their multiple convolutions introduce computational complexity. Vision transformers have recently been developed to overcome such limitations of CNNs, ensuring better or comparable performance in natural image classification, but their utility has not been thoroughly investigated in the medical image domain. In this study, we developed a transfer learning technique based on vision transformers to classify breast mass mammograms. The area under the receiver operating characteristic curve of the new model was estimated as 1 ± 0, outperforming CNN-based transfer learning models and vision transformer models trained from scratch. The technique can, hence, be applied in clinical settings to improve the early diagnosis of breast cancer.
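A ViT-based transfer learning setup of the kind described can be sketched with the timm library: load an ImageNet-pretrained ViT and replace its head for a two-class mammogram task. The model name and class count are assumptions, and fine-tuning is omitted.

```python
# Sketch: ImageNet-pretrained ViT re-headed for benign/malignant classification.
import timm
import torch

model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=2)
logits = model(torch.randn(1, 3, 224, 224))  # placeholder mammogram tensor
print(logits.shape)  # torch.Size([1, 2])
# Fine-tuning on mammograms would follow, typically with a small learning rate.
```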
Affiliation(s)
- Gelan Ayana: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Republic of Korea; School of Biomedical Engineering, Jimma University, Jimma 378, Ethiopia
- Kokeb Dese: School of Biomedical Engineering, Jimma University, Jimma 378, Ethiopia
- Yisak Dereje: Department of Information Engineering, Marche Polytechnic University, 60121 Ancona, Italy
- Yonas Kebede: Biomedical Engineering Unit, Black Lion Hospital, Addis Ababa University, Addis Ababa 1000, Ethiopia
- Hika Barki: Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, Republic of Korea
- Dechassa Amdissa: Department of Basic and Applied Science for Engineering, Sapienza University of Rome, 00161 Roma, Italy
- Nahimiya Husen: Department of Bioengineering and Robotics, Campus Bio-Medico University of Rome, 00128 Roma, Italy
- Fikadu Mulugeta: Center of Biomedical Engineering, Addis Ababa Institute of Technology, Addis Ababa University, Addis Ababa 1000, Ethiopia
- Bontu Habtamu: School of Biomedical Engineering, Jimma University, Jimma 378, Ethiopia
- Se-Woon Choe: Department of Medical IT Convergence Engineering and Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Republic of Korea. Correspondence: ; Tel.: +82-54-478-7781; Fax: +82-54-462-1049
10. Multi-Stage Classification-Based Deep Learning for Gleason System Grading Using Histopathological Images. Cancers (Basel) 2022; 14:5897. PMID: 36497378. PMCID: PMC9738124. DOI: 10.3390/cancers14235897.
Abstract
In this work, we introduce an automated diagnostic system for Gleason system grading and grade group (GG) classification using whole slide images (WSIs) of digitized prostate biopsy specimens (PBSs). Our system first classifies the Gleason pattern (GP) from PBSs and then identifies the Gleason score (GS) and GG. Unlike current research, which treats GP identification as a segmentation problem, we developed a comprehensive deep learning based grading pipeline that treats it as a classification problem. A multilevel binary classification was implemented to enhance accuracy for GP, and three levels of analysis (pyramidal levels) were created to extract different types of features. Each level has four shallow binary CNNs to classify the five GP labels. Majority fusion is then applied per pixel across a total of 39 labeled images to create the final GP output. The proposed framework was trained, validated, and tested on 3080 WSIs of PBSs. The diagnostic accuracy of each CNN is evaluated using several metrics, precision (PR), recall (RE), and accuracy, documented by confusion matrices. The results prove the system's potential for classifying all five GPs and, thus, GGs. GG performance ranges from 50% to 92% for both RE and PR. A comparison between our CNN architecture and a standard CNN (ResNet50) highlights our system's advantage. Finally, our deep learning system achieved agreement with the consensus grade groups.
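The per-pixel majority fusion step can be illustrated as follows; five synthetic label maps stand in for the 39 labeled images described above.

```python
# Sketch: per-pixel majority vote across several label maps of one region.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
label_maps = rng.integers(0, 5, size=(5, 64, 64))  # 5 maps, 5 pattern labels

# Most frequent label at each pixel becomes the fused prediction.
fused, _ = stats.mode(label_maps, axis=0, keepdims=False)
print(fused.shape)  # (64, 64) fused label map
```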
11. Ayana G, Choe SW. BUViTNet: Breast Ultrasound Detection via Vision Transformers. Diagnostics (Basel) 2022; 12:2654. PMID: 36359497. PMCID: PMC9689470. DOI: 10.3390/diagnostics12112654.
Abstract
Convolutional neural networks (CNNs) have enhanced ultrasound image-based early breast cancer detection, but vision transformers (ViTs) have recently surpassed CNNs as the most effective method for natural image analysis. ViTs incorporate more global information than CNNs at lower layers, and their skip connections are more powerful than those of CNNs, which endows ViTs with superior performance. However, the effectiveness of ViTs in breast ultrasound imaging has not yet been investigated. Here, we present BUViTNet, breast ultrasound detection via ViTs, in which ViT-based multistage transfer learning is performed using ImageNet and cancer cell image datasets prior to transfer learning for classifying breast ultrasound images. We utilized two publicly available ultrasound breast image datasets, Mendeley and breast ultrasound images (BUSI), to train and evaluate our algorithm. The proposed method achieved the highest area under the receiver operating characteristic curve (AUC) of 1 ± 0, Matthews correlation coefficient (MCC) of 1 ± 0, and kappa score of 1 ± 0 on the Mendeley dataset, and the highest AUC of 0.968 ± 0.02, MCC of 0.961 ± 0.01, and kappa score of 0.959 ± 0.02 on the BUSI dataset. BUViTNet outperformed ViT trained from scratch, ViT-based conventional transfer learning, and CNN-based transfer learning in classifying breast ultrasound images (p < 0.01 in all cases). Our findings indicate that improved transformers are effective in analyzing breast images and can improve diagnosis if used in clinical settings. Future work will consider a wider range of datasets and parameters for optimized performance.
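The three reported metrics (AUC, MCC, kappa) can be computed with scikit-learn as in this sketch; the labels and scores below are synthetic placeholders, not BUViTNet outputs.

```python
# Sketch: AUC, Matthews correlation coefficient, and Cohen's kappa.
import numpy as np
from sklearn.metrics import roc_auc_score, matthews_corrcoef, cohen_kappa_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)                   # placeholder labels
scores = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, size=200), 0, 1)
y_pred = (scores > 0.5).astype(int)                     # thresholded decisions

print(f"AUC:   {roc_auc_score(y_true, scores):.3f}")    # uses continuous scores
print(f"MCC:   {matthews_corrcoef(y_true, y_pred):.3f}")
print(f"kappa: {cohen_kappa_score(y_true, y_pred):.3f}")
```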
Affiliation(s)
- Gelan Ayana: Department of Medical IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Korea
- Se-woon Choe: Department of Medical IT Convergence Engineering and Department of IT Convergence Engineering, Kumoh National Institute of Technology, Gumi 39253, Korea. Correspondence: ; Tel.: +82-54-478-7781; Fax: +82-54-462-1049