1
Shankari N, Kudva V, Hegde RB. Breast Mass Detection and Classification Using Machine Learning Approaches on Two-Dimensional Mammogram: A Review. Crit Rev Biomed Eng 2024; 52:41-60. [PMID: 38780105 DOI: 10.1615/critrevbiomedeng.2024051166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Breast cancer is a leading cause of mortality among women, both in India and globally. Breast masses are notably common in women aged 20 to 60. According to the Breast Imaging Reporting and Data System (BI-RADS) standard, these masses are classified into categories such as fibroadenoma, breast cysts, benign masses, and malignant masses. Imaging plays a vital role in the diagnosis of breast disorders, and mammography has long been the most widely used modality for detecting breast abnormalities. However, identifying breast diseases through mammograms can be time-consuming, requiring experienced radiologists to review a significant volume of images. Early detection of breast masses is crucial for effective disease management and ultimately reduces mortality rates. To address this challenge, advances in image processing, specifically those using artificial intelligence (AI) and machine learning (ML), have paved the way for decision support systems that assist radiologists in accurately identifying and classifying breast disorders. This paper reviews studies in which diverse machine learning approaches have been applied to digital mammograms to identify breast masses and classify them into subclasses such as normal, benign, and malignant. The paper also highlights the advantages and limitations of existing techniques, offering insights for future research in medical imaging and breast health.
Affiliation(s)
- N Shankari
  - NITTE (Deemed to be University), Department of Electronics and Communication Engineering, NMAM Institute of Technology, Nitte 574110, Karnataka, India
- Vidya Kudva
  - School of Information Sciences, Manipal Academy of Higher Education, Manipal 576104, India; Nitte Mahalinga Adyanthaya Memorial Institute of Technology, Nitte 574110, India
- Roopa B Hegde
  - NITTE (Deemed to be University), Department of Electronics and Communication Engineering, NMAM Institute of Technology, Nitte 574110, Karnataka, India
2
Ejiyi CJ, Qin Z, Monday H, Ejiyi MB, Ukwuoma C, Ejiyi TU, Agbesi VK, Agu A, Orakwue C. Breast cancer diagnosis and management guided by data augmentation, utilizing an integrated framework of SHAP and random augmentation. Biofactors 2024; 50:114-134. [PMID: 37695269 DOI: 10.1002/biof.1995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Accepted: 07/18/2023] [Indexed: 09/12/2023]
Abstract
Recent research indicates that early detection of breast cancer (BC) is critical to achieving favorable treatment outcomes and reducing the associated mortality rate. Because balanced datasets for diagnosing the disease are difficult to obtain, many researchers have relied on data augmentation techniques, producing datasets of varying quality and results. The dataset in this study was built with SHapley Additive exPlanations (SHAP)-augmentation and random augmentation (RA) approaches to handling imbalanced data. The approach was applied to the Wisconsin BC dataset, and its effectiveness for BC diagnosis was checked using six machine-learning algorithms. RA synthetically generated parts of the dataset, while SHAP helped assess the quality of the attributes, which were then selected and used to train the models. Our analysis shows that the performance of most models increased by more than 3% when using the dataset obtained by integrating SHAP and RA. Additionally, after diagnosis, it is important to focus on providing quality care to ensure the best possible outcomes for patients. Proper management of the disease is crucial to reduce recurrence and other associated complications. Thus, the interpretability provided by SHAP informs the management strategies in this study, which focus on the quality and timeliness of the care given to the patient.
Affiliation(s)
- Chukwuebuka Joseph Ejiyi
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Zhen Qin
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Happy Monday
  - Department of Computer Science, Oxford Brookes University and Chengdu University of Technology of China, Chengdu, China
- Chiagoziem Ukwuoma
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Thomas Ugochukwu Ejiyi
  - Department of Pure and Industrial Chemistry, University of Nigeria Nsukka, Enugu, Nigeria
- Victor Kwaku Agbesi
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Amarachi Agu
  - Department of Public Health, University of Nigeria Enugu Campus, Enugu, Nigeria
- Chiduzie Orakwue
  - Department of Agricultural and Bio-Resources Engineering, College of Engineering, Federal University of Agriculture, Abeokuta, Nigeria
3
Reshan MSA, Amin S, Zeb MA, Sulaiman A, Alshahrani H, Azar AT, Shaikh A. Enhancing Breast Cancer Detection and Classification Using Advanced Multi-Model Features and Ensemble Machine Learning Techniques. Life (Basel) 2023; 13:2093. [PMID: 37895474 PMCID: PMC10608611 DOI: 10.3390/life13102093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2023] [Revised: 10/18/2023] [Accepted: 10/19/2023] [Indexed: 10/29/2023] Open
Abstract
Breast cancer (BC) is the most common cancer among women, making an accurate and dependable system for diagnosing benign or malignant tumors essential. Early detection is needed to inform subsequent treatment. Currently, fine needle aspiration (FNA) cytology and machine learning (ML) models can be used to detect and diagnose this cancer more accurately. Consequently, an effective and dependable approach needs to be developed to enhance the clinical capacity to diagnose this illness. This study aims to classify BC into two categories using the Wisconsin Diagnostic Breast Cancer (WDBC) benchmark feature set and to select the fewest features that attain the highest accuracy. To this end, it explores automated BC prediction using multi-model features and ensemble machine learning (EML) techniques. We propose an advanced ensemble technique that incorporates voting, bagging, stacking, and boosting as combination techniques for the classifiers in the proposed EML methods to distinguish benign breast tumors from malignant cancers. In the feature extraction process, we use recursive feature elimination to find the WDBC features most pertinent to BC detection and classification. Furthermore, we conducted cross-validation experiments, and the comparative results demonstrated that our method can effectively enhance classification performance and attain the highest value in six evaluation metrics: precision, sensitivity, area under the curve (AUC), specificity, accuracy, and F1-score. Overall, the stacking model achieved the best average accuracy, at 99.89%, and its sensitivity, specificity, F1-score, precision, and AUC/ROC were 1.00, 0.999, 1.00, 1.00, and 1.00, respectively. The findings of this study can be used to establish a reliable clinical detection system, enabling experts to make more precise and operative decisions in the future. Additionally, the proposed technology might be used to detect a variety of cancers.
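For readers unfamiliar with the evaluation metrics named in this abstract, the sketch below shows how five of them follow from a binary confusion matrix. It is a minimal illustration, not the authors' code: the class `ConfusionCounts`, the function `binary_metrics`, and the example counts are all hypothetical, and AUC is omitted because it needs per-sample scores rather than counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary benign/malignant classifier (malignant = positive)."""
    tp: int  # malignant cases predicted malignant
    fp: int  # benign cases predicted malignant
    tn: int  # benign cases predicted benign
    fn: int  # malignant cases predicted benign


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a class is absent from the sample.
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard metrics; every value lies in the range [0, 1]."""
    sensitivity = _ratio(c.tp, c.tp + c.fn)  # also called recall
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Illustrative counts only; they are not taken from the study.
example = ConfusionCounts(tp=40, fp=2, tn=55, fn=3)
metrics = binary_metrics(example)
```

Because each metric is a ratio of counts, it is a fraction between 0 and 1 unless explicitly multiplied by 100 to give a percentage.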
Affiliation(s)
- Mana Saleh Al Reshan
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
- Samina Amin
  - Institute of Computing, Kohat University of Science and Technology, Kohat 26000, Pakistan
- Muhammad Ali Zeb
  - Institute of Computing, Kohat University of Science and Technology, Kohat 26000, Pakistan
- Adel Sulaiman
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
- Hani Alshahrani
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
- Ahmad Taher Azar
  - College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
  - Automated Systems and Soft Computing Lab (ASSCL), Prince Sultan University, Riyadh 11586, Saudi Arabia
- Asadullah Shaikh
  - Department of Information Systems, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia
4
Ensemble Deep Learning Ultimate Tensile Strength Classification Model for Weld Seam of Asymmetric Friction Stir Welding. Processes (Basel) 2023. [DOI: 10.3390/pr11020434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023] Open
Abstract
Friction stir welding is a material processing technique used to combine dissimilar and similar materials. Ultimate tensile strength (UTS) is one of the most common objectives of welding, especially friction stir welding (FSW). Typically, destructive testing is utilized to measure the UTS of a welded seam. Testing for the UTS of a weld seam typically involves cutting the specimen and utilizing a machine capable of testing for UTS. In this study, an ensemble deep learning model was developed to classify the UTS of the FSW weld seam. Consequently, the model could classify the quality of the weld seam in relation to its UTS using only an image of the weld seam. Five distinct convolutional neural networks (CNNs) were employed to form the heterogeneous ensemble deep learning model in the proposed model. In addition, image segmentation, image augmentation, and an efficient decision fusion approach were implemented in the proposed model. To test the model, 1664 pictures of weld seams were created and tested using the model. The weld seam UTS quality was divided into three categories: below 70% (low quality), 70–85% (moderate quality), and above 85% (high quality) of the base material. AA5083 and AA5061 were the base materials used for this study. The computational results demonstrate that the accuracy of the suggested model is 96.23%, which is 0.35% to 8.91% greater than the accuracy of the literature’s most advanced CNN model.
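The three quality bands and the idea of fusing several CNN outputs can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the fusion rule, so unweighted averaging of class probabilities stands in for it here, the treatment of the exact 70% and 85% boundaries as "moderate" is a guess, and the names `uts_category`, `fuse_by_averaging`, and the probability values are hypothetical.

```python
from typing import List, Sequence

LABELS = ("low", "moderate", "high")


def uts_category(uts_percent_of_base: float) -> str:
    """Map weld UTS, as a percentage of base-material UTS, to a quality band.

    Bands follow the abstract: below 70% is low, 70-85% is moderate,
    above 85% is high. Boundary handling is an assumption.
    """
    if uts_percent_of_base < 70.0:
        return "low"
    if uts_percent_of_base <= 85.0:
        return "moderate"
    return "high"


def fuse_by_averaging(model_probs: Sequence[Sequence[float]]) -> str:
    """Average per-class probabilities across models and pick the top class.

    Unweighted averaging is an assumed stand-in for the paper's fusion step.
    """
    n_models = len(model_probs)
    averaged: List[float] = [
        sum(probs[k] for probs in model_probs) / n_models
        for k in range(len(LABELS))
    ]
    best = max(range(len(LABELS)), key=averaged.__getitem__)
    return LABELS[best]


# Hypothetical softmax outputs from five CNNs for one weld-seam image,
# ordered as (low, moderate, high).
five_cnn_outputs = [
    (0.1, 0.3, 0.6),
    (0.2, 0.5, 0.3),
    (0.1, 0.2, 0.7),
    (0.3, 0.4, 0.3),
    (0.0, 0.4, 0.6),
]
fused_label = fuse_by_averaging(five_cnn_outputs)
```

Averaging lets a confident minority of models outvote a hesitant majority, which plain majority voting would not allow; which behavior is preferable depends on how well calibrated the individual networks are.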
5
Jakhar AK, Gupta A, Singh M. SELF: a stacked-based ensemble learning framework for breast cancer classification. Evolutionary Intelligence 2023. [DOI: 10.1007/s12065-023-00824-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
6
Classification and diagnostic prediction of breast cancer metastasis on clinical data using machine learning algorithms. Sci Rep 2023; 13:485. [PMID: 36627367 PMCID: PMC9831019 DOI: 10.1038/s41598-023-27548-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/03/2022] [Accepted: 01/04/2023] [Indexed: 01/12/2023] Open
Abstract
Metastatic Breast Cancer (MBC) is one of the primary causes of cancer-related deaths in women. Despite several limitations, histopathological information about the malignancy is used for the classification of cancer. The objective of our study is to develop a non-invasive breast cancer classification system for the diagnosis of cancer metastases. Jupyter Notebook under Anaconda was used to develop Python modules for text mining, data processing, and Machine Learning (ML) methods. The prediction performance of the ML models was assessed using classification cross-validation criteria, including accuracy, AUC, and ROC. Welch's unpaired t-test was used to ascertain the statistical significance of differences in the datasets. A text mining framework applied to the Electronic Medical Records (EMR) made it easier to separate the blood profile data and identify MBC patients. Monocytes showed a noticeable mean difference between MBC patients and healthy individuals. The accuracy of the ML models was dramatically improved by removing outliers from the blood profile data. A Decision Tree (DT) classifier achieved an accuracy of 83% with an AUC of 0.87. Next, we deployed DT classifiers using Flask to create a web application for robust diagnosis of MBC patients. Taken together, we conclude that ML models based on blood profile data may assist physicians in selecting intensive-care MBC patients to enhance the overall survival outcome.
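Welch's unpaired t-test, mentioned in this abstract, compares two group means without assuming equal variances. The sketch below computes the test statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The function name `welch_t` and the two sample lists are hypothetical and not taken from the study; turning the statistic into a p-value needs the t-distribution CDF, which is left out here.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's unpaired t-test."""
    n_a, n_b = len(a), len(b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / n_a
    se2_b = variance(b) / n_b
    t_stat = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


# Made-up values standing in for one blood-profile feature in two groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(group_a, group_b)
```

The degrees of freedom are generally not an integer, which is expected for this approximation.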
7
Data augmentation guided breast cancer diagnosis and prognosis using an integrated deep-generative framework based on breast tumor’s morphological information. Informatics in Medicine Unlocked 2023. [DOI: 10.1016/j.imu.2023.101171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023] Open
8
Han L, Yin Z. A hybrid breast cancer classification algorithm based on meta-learning and artificial neural networks. Front Oncol 2022; 12:1042964. [DOI: 10.3389/fonc.2022.1042964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 10/13/2022] [Indexed: 11/13/2022] Open
Abstract
Breast cancer in women has surpassed lung cancer as the world's leading cause of new cancer cases. Regular screening is an effective way to catch breast cancer early and provides a good foundation for later treatment, so women should receive regular checkups in the hospital after reaching a certain age. Computer-aided technology can improve the accuracy and efficiency of physicians' decision-making. Data pre-processing is required before data analysis, and 16 features are selected using a correlation-based feature selection method. In this paper, meta-learning and Artificial Neural Networks (ANN) are combined to create a hybrid algorithm. The proposed hybrid algorithm for predicting breast cancer achieved 98.74% accuracy and a 98.02% F1-score by combining various meta-learning models whose outputs were used as input features for the ANN models. The hybrid algorithm proposed in this paper can therefore obtain better prediction results than a single model.
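The idea of feeding base-model outputs into a neural network as input features can be sketched in a few lines. This is not the authors' pipeline: the two base scorers `base_a` and `base_b` are hypothetical stand-ins for trained models, the toy samples are invented, and a single logistic unit trained by gradient descent is assumed as the smallest possible "ANN".

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]


# Hypothetical base learners: each maps a raw sample to a score in [0, 1].
def base_a(x: Sample) -> float:
    return x[0]


def base_b(x: Sample) -> float:
    return 1.0 - x[1]


def meta_features(models: Sequence[Callable[[Sample], float]], x: Sample) -> List[float]:
    """Stacking step: the base-model outputs become the meta-model's inputs."""
    return [model(x) for model in models]


def predict_proba(w: Sequence[float], b: float, f: Sequence[float]) -> float:
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, b, feats, labels) -> float:
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for f, y in zip(feats, labels):
        p = min(max(predict_proba(w, b, f), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def train_meta(feats, labels, lr: float = 0.5, epochs: int = 2000):
    """Fit one logistic unit by batch gradient descent on the meta-features."""
    w, b, n = [0.0] * len(feats[0]), 0.0, len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for f, y in zip(feats, labels):
            err = predict_proba(w, b, f) - y
            grad_w = [g + err * fj for g, fj in zip(grad_w, f)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data for illustration only (1 = malignant, 0 = benign).
X = [(0.9, 0.2), (0.8, 0.1), (0.7, 0.4), (0.2, 0.9), (0.1, 0.7), (0.3, 0.8)]
y = [1, 1, 1, 0, 0, 0]
F = [meta_features([base_a, base_b], x) for x in X]
loss_before = log_loss([0.0, 0.0], 0.0, F, y)
w, b = train_meta(F, y)
loss_after = log_loss(w, b, F, y)
preds = [int(predict_proba(w, b, f) >= 0.5) for f in F]
```

In practice the meta-features would be produced on held-out folds so that the meta-model does not learn from base-model predictions made on their own training data.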
9
Mavrogiorgou A, Kiourtis A, Kleftakis S, Mavrogiorgos K, Zafeiropoulos N, Kyriazis D. A Catalogue of Machine Learning Algorithms for Healthcare Risk Predictions. Sensors (Basel) 2022; 22:8615. [PMID: 36433212 PMCID: PMC9695983 DOI: 10.3390/s22228615] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/01/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 05/27/2023]
Abstract
Extracting useful knowledge through proper data analysis for efficient and timely decision-making is a very challenging task. A plethora of machine learning (ML) algorithms exist for this purpose, and in healthcare the complexity increases because of the domain's requirements for analytics-based risk predictions. This manuscript proposes a data analysis mechanism, tested in diverse healthcare scenarios, for constructing a catalogue of the most efficient ML algorithms to use, depending on a scenario's requirements and datasets, for efficiently predicting the onset of a disease. In this context, seven different ML algorithms (Naïve Bayes, K-Nearest Neighbors, Decision Tree, Logistic Regression, Random Forest, Neural Networks, Stochastic Gradient Descent) were executed on diverse healthcare scenarios (stroke, COVID-19, diabetes, breast cancer, kidney disease, heart failure). Based on a variety of performance metrics (accuracy, recall, precision, F1-score, specificity, confusion matrix), a subset of ML algorithms was found to be more efficient for timely predictions under specific healthcare scenarios, which is why the envisioned ML catalogue prioritizes the algorithms to be used depending on each scenario's nature and the metrics needed. Further evaluation must be performed on additional scenarios, involving state-of-the-art techniques (e.g., cloud deployment, federated ML), to improve the mechanism's efficiency.
Affiliation(s)
- Argyro Mavrogiorgou
  - Department of Digital Systems, University of Piraeus, 185 34 Piraeus, Greece
10
Predicting Breast Cancer from Risk Factors Using SVM and Extra-Trees-Based Feature Selection Method. Computers 2022. [DOI: 10.3390/computers11090136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Developing a prediction model from risk factors can provide an efficient method to recognize breast cancer. Machine learning (ML) algorithms have been applied to increase the efficiency of diagnosis at the early stage. This paper studies a support vector machine (SVM) combined with an extremely randomized trees classifier (extra-trees) to diagnose breast cancer at the early stage based on risk factors. The extra-trees classifier was used to remove irrelevant features, while the SVM was used to diagnose breast cancer status. A breast cancer dataset of 116 subjects was used by the machine learning models to predict breast cancer, and stratified 10-fold cross-validation was employed for model evaluation. Our proposed combined SVM and extra-trees model reached the highest accuracy, 80.23%, which was significantly better than the other ML models. The experimental results demonstrated that applying extra-trees-based feature selection improved the average ML prediction accuracy by up to 7.29% compared with ML without feature selection. Our proposed model is expected to increase the efficiency of breast cancer diagnosis based on risk factors. In addition, we showed how the proposed prediction model could be employed for web-based breast cancer prediction. The proposed model is expected to improve diagnostic decision-support systems by predicting breast cancer accurately.
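Stratified k-fold cross-validation, as used in this study, splits the data so that every fold keeps roughly the same class proportions as the full dataset. The sketch below builds such folds with the standard library only. The function name `stratified_kfold` is hypothetical, and the 64/52 class split of the 116 subjects is an illustrative assumption, since the abstract gives only the total.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_kfold(labels: Sequence[Hashable], k: int, seed: int = 0) -> List[List[int]]:
    """Return k lists of sample indices with near-equal class proportions."""
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal this class's samples round-robin, continuing from where the
        # previous class stopped so that overall fold sizes stay balanced.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


# Illustrative labels: 116 subjects with an assumed 64/52 class split.
labels = [1] * 64 + [0] * 52
folds = stratified_kfold(labels, k=10)

# Each fold serves once as the test set; the rest form the training set.
for test_indices in folds:
    held_out = set(test_indices)
    train_indices = [i for i in range(len(labels)) if i not in held_out]
```

With only 116 subjects, each test fold holds 11 or 12 samples, so per-fold accuracy varies in coarse steps and the average over the ten folds is the more meaningful figure.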
11
Arooj S, Atta-ur-Rahman, Zubair M, Khan MF, Alissa K, Khan MA, Mosavi A. Breast Cancer Detection and Classification Empowered With Transfer Learning. Front Public Health 2022; 10:924432. [PMID: 35859776 PMCID: PMC9289190 DOI: 10.3389/fpubh.2022.924432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 05/31/2022] [Indexed: 11/29/2022] Open
Abstract
Cancer is a major public health issue in the modern world. Breast cancer starts in the breast and can spread to other parts of the body, and it is one of the most common cancers that kill women. Cancer develops when cells grow uncontrollably. There are various types of breast cancer; the proposed model addresses benign and malignant cases. In computer-aided diagnosis systems, the identification and classification of breast cancer using histopathology and ultrasound images are critical steps. Over the last few decades, investigators have demonstrated the ability to automate the initial identification and classification of tumors. Early detection of breast cancer allows patients to obtain proper therapy and thereby increases their chances of survival. Deep learning (DL), machine learning (ML), and transfer learning (TL) techniques are used to solve many medical problems. Previous studies have categorized and identified cancer tumors using various types of models, but with some limitations, and research is hampered by the lack of datasets. The proposed methodology is designed to help with the automatic identification and diagnosis of breast cancer. Our main contribution is that the proposed model applies transfer learning to three datasets, A, B, and C, and to A2, which is dataset A with two classes. Ultrasound and histopathology images are used in this study. The model is a customized CNN-AlexNet trained according to the requirements of the datasets, which is a further contribution of this work. The results show that the proposed system empowered with transfer learning achieved higher accuracy than the existing models on datasets A, B, C, and A2.
12
Performance evaluation of machine learning for breast cancer diagnosis: A case study. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.101009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open