1
Du H, Yang Q, Ge A, Zhao C, Ma Y, Wang S. Explainable machine learning models for early gastric cancer diagnosis. Sci Rep 2024; 14:17457. [PMID: 39075116 PMCID: PMC11286780 DOI: 10.1038/s41598-024-67892-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Accepted: 07/17/2024] [Indexed: 07/31/2024] Open
Abstract
Gastric cancer remains a significant global health concern, with a notably high incidence in East Asia. This paper explores the potential of explainable machine learning models in enhancing the early diagnosis of gastric cancer. Through comprehensive evaluations, various machine learning models, including WeightedEnsemble, CatBoost, and RandomForest, demonstrated high potential in accurately diagnosing early gastric cancer. The study emphasizes the importance of model explainability in medical diagnostics, showing how transparent, explainable models can increase trust and clinical acceptance, thereby improving diagnostic accuracy and patient outcomes. This research not only highlights key biomarkers and clinical features critical for early detection but also presents a versatile approach that could be applied to other medical diagnostics, promoting broader adoption of machine learning in clinical settings.
Affiliation(s)
- Hongyang Du
  - Heze Administrative Approval Guarantee Center, 3443 Huanghe East Road, Heze City, 274000, Shandong Province, China
- Qingfen Yang
  - Heze Municipal Hospital, 2888 Caozhou West Road, Heze City, 274031, Shandong Province, China
- Aimin Ge
  - Heze Municipal Hospital, 2888 Caozhou West Road, Heze City, 274031, Shandong Province, China
- Chenhao Zhao
  - Heze Municipal Hospital, 2888 Caozhou West Road, Heze City, 274031, Shandong Province, China
- Yunhua Ma
  - Heze Municipal Hospital, 2888 Caozhou West Road, Heze City, 274031, Shandong Province, China
- Shuyu Wang
  - Heze Municipal Hospital, 2888 Caozhou West Road, Heze City, 274031, Shandong Province, China
2
Mondal S, Choudhary P, Rathee P. Detection of cardiac abnormalities from 12-lead ECG using complex wavelet sub-band features. Biomed Phys Eng Express 2024; 10:035023. [PMID: 38316022 DOI: 10.1088/2057-1976/ad2631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 02/05/2024] [Indexed: 02/07/2024]
Abstract
Aim of the study: This research seeks to optimize cardiac anomaly detection by introducing a method for selecting the most effective Daubechies wavelet family. The principal aim is to differentiate between normal and abnormal cardiac states using longer electrocardiogram (ECG) signal episodes from the Apnea ECG dataset. Apnea ECG data are often used to detect sleep apnea, a sleep disorder characterized by repeated interruptions in breathing during sleep. By combining Principal Component Analysis (PCA) with different machine learning classifiers, the goal is to improve the precision of cardiac irregularity identification. Method: To extract important statistical and sub-band information from lengthy ECG signal episodes, the study uses a novel method that combines the discrete wavelet transform with PCA for dimension reduction. The methodology categorizes ECG signals using several classifiers, including a multilayer perceptron (MLP) neural network, Ensemble Subspace K-Nearest Neighbour (KNN), and Ensemble Bagged Trees, together with several Daubechies wavelet families (db2, db3, db4, db5, db6). Results: The results emphasize the importance of the chosen Daubechies wavelet, db5, and its superiority in ECG representation. The method distinguishes normal and abnormal ECG signals well on the PhysioNet Apnea ECG database. The neural-network-based method correctly recognizes 100% of healthy signals and 97.8% of abnormal ones, with 98.6% overall accuracy. Findings: The Ensemble Subspace KNN and Ensemble Bagged Trees methods achieved 87.1% accuracy on this dataset, with area-under-the-curve (AUC) values of 0.89 and 0.87 respectively. Precision values of 0.96, 0.86, and 0.86 for the MLP neural network, KNN Subspace, and Ensemble Bagged Trees, respectively, confirm their robustness. These findings suggest that wavelet families and machine learning can improve cardiac abnormality detection and categorization.
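The sub-band feature extraction step described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions: the Haar wavelet stands in for db5 because its filters are trivial to write out, the signal is synthetic, and the PCA and classifier stages are omitted. The function names `haar_dwt` and `subband_features` are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    n = len(signal) - (len(signal) % 2)  # drop a trailing odd sample
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, n, 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, n, 2)]
    return approx, detail


def subband_features(signal, levels=3):
    """Mean, standard deviation and energy of each sub-band, finest detail first."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features += [mean(detail), pstdev(detail), sum(d * d for d in detail)]
    # The final approximation band closes the feature vector.
    features += [mean(current), pstdev(current), sum(a * a for a in current)]
    return features


# Hypothetical 64-sample episode: a sine wave with a periodic spike.
episode = [math.sin(2 * math.pi * i / 16) + (1.0 if i % 16 == 0 else 0.0)
           for i in range(64)]
features = subband_features(episode, levels=3)
```

Each decomposition level contributes three statistics, so three levels plus the final approximation band give a 12-element vector. In a pipeline like the one described, such vectors would presumably be reduced by PCA before classification.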
Affiliation(s)
- Sourav Mondal
  - Department of Computer Science and Engineering, National Institute of Technology Hamirpur, Hamirpur, Himachal Pradesh-177005, India
- Prakash Choudhary
  - Department of Computer Science and Engineering, Central University of Rajasthan, NH-8, Bandar Sindri, Kishangarh, Rajasthan 305817, India
- Priyanka Rathee
  - Department of Computer Science and Engineering, National Institute of Technology Hamirpur, Hamirpur, Himachal Pradesh-177005, India
3
V JP, S AAV, P GK, N K K. A novel attention-based cross-modal transfer learning framework for predicting cardiovascular disease. Comput Biol Med 2024; 170:107977. [PMID: 38217974 DOI: 10.1016/j.compbiomed.2024.107977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/19/2023] [Accepted: 01/08/2024] [Indexed: 01/15/2024]
Abstract
Cardiovascular disease (CVD) remains a leading cause of death globally, presenting significant challenges in early detection and treatment. The complexity of CVD arises from its multifaceted nature, influenced by a combination of genetic, environmental, and lifestyle factors. Traditional diagnostic approaches often struggle to effectively integrate and interpret the heterogeneous data associated with CVD. Addressing this challenge, we introduce a novel Attention-Based Cross-Modal (ABCM) transfer learning framework. This framework innovatively merges diverse data types, including clinical records, medical imagery, and genetic information, through an attention-driven mechanism. This mechanism adeptly identifies and focuses on the most pertinent attributes from each data source, thereby enhancing the model's ability to discern intricate interrelationships among various data types. Our extensive testing and validation demonstrate that the ABCM framework significantly surpasses traditional single-source models and other advanced multi-source methods in predicting CVD. Specifically, our approach achieves an accuracy of 93.5%, precision of 92.0%, recall of 94.5%, and an impressive area under the curve (AUC) of 97.2%. These results not only underscore the superior predictive capability of our model but also highlight its potential in offering more accurate and early detection of CVD. The integration of cross-modal data through attention-based mechanisms provides a deeper understanding of the disease, paving the way for more informed clinical decision-making and personalized patient care.
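The idea of an attention mechanism weighting several data sources can be sketched generically. This is not the authors' ABCM architecture, which the abstract does not specify. It assumes that each modality has already been encoded into an embedding of the same length, and that a query vector, normally learned, is available. The names `attention_fuse`, `modalities` and `query`, and all the numbers, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Weight each modality embedding by its scaled dot-product score against the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, emb)) / scale
              for emb in embeddings.values()]
    weights = softmax(scores)
    # The fused vector is the attention-weighted sum of the modality embeddings.
    fused = [
        sum(w * emb[k] for w, emb in zip(weights, embeddings.values()))
        for k in range(len(query))
    ]
    return dict(zip(embeddings, weights)), fused


# Hypothetical, already-encoded modality embeddings of equal length.
modalities = {
    "clinical": [0.9, 0.1, 0.0],
    "imaging": [0.2, 0.8, 0.1],
    "genetic": [0.0, 0.3, 0.7],
}
query = [1.0, 0.5, 0.0]  # assumed learned query vector; fixed here for illustration
weights, fused = attention_fuse(modalities, query)
```

The weights sum to one, so the fused vector stays on the same scale as the inputs, and the modality that aligns best with the query dominates the result.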
Affiliation(s)
- Jothi Prakash V
  - Karpagam College of Engineering, Myleripalayam Village, Coimbatore, 641032, Tamil Nadu, India
- Arul Antran Vijay S
  - Karpagam College of Engineering, Myleripalayam Village, Coimbatore, 641032, Tamil Nadu, India
- Ganesh Kumar P
  - College of Engineering, Guindy, Anna University, Chennai, 600025, Tamil Nadu, India
- Karthikeyan N K
  - Coimbatore Institute of Technology, Peelamedu, Coimbatore, 641014, Tamil Nadu, India
4
Subramanian AAV, Venugopal JP. A deep ensemble network model for classifying and predicting breast cancer. Comput Intell 2022. [DOI: 10.1111/coin.12563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
5
Qin Q, Yang X, Zhang R, Liu M, Ma Y. An Application of Deep Belief Networks in Early Warning for Cerebrovascular Disease Risk. J Organ End User Com 2022. [DOI: 10.4018/joeuc.287574] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]
Abstract
To reduce the incidence and mortality of cerebrovascular disease, it is important to identify risks in advance and take preventive measures. This article aims to investigate the risk factors of cerebrovascular disease (CVD) in primary prevention and to build an early warning model based on existing technology. The authors use the information entropy algorithm of rough set theory to establish an index system suitable for the early warning model. Then, using restricted Boltzmann machines (RBMs) and the back-propagation algorithm, a deep belief network is built by stacking RBMs, and back propagation is used to fine-tune the network parameters at the top layer. In a comparison with the LM-BP early-warning model, the deep belief network model is more effective than the traditional artificial neural network; it can help identify the risk of cerebrovascular disease in advance and promote primary prevention.
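Building a deep belief network by stacking RBMs, as this abstract describes, usually means greedy layer-wise pre-training. The following is a minimal sketch of that step only. It assumes Bernoulli units and one-step contrastive divergence (CD-1), neither of which the abstract confirms, and it omits the supervised back-propagation fine-tuning stage. The class `TinyRBM`, the function `pretrain_stack` and the patient data are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Bernoulli RBM trained with one step of contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(b + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j, b in enumerate(self.b_h)]

    def visible_probs(self, h):
        return [sigmoid(b + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i, b in enumerate(self.b_v)]

    def cd1_step(self, v0, lr=0.1):
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)  # reconstruction of the input
        h1 = self.hidden_probs(v1)
        # Move weights toward the data statistics and away from the reconstruction.
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_h[j] += lr * (h0[j] - h1[j])


def pretrain_stack(data, layer_sizes, epochs=20):
    """Greedy layer-wise pre-training: each RBM learns from the layer below."""
    stack, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = TinyRBM(len(inputs[0]), n_hidden)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_step(v)
        stack.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return stack, inputs


# Hypothetical binary risk indicators for four patients.
patients = [[1, 0, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0],
            [0, 0, 0, 1, 1, 1], [0, 1, 0, 0, 1, 1]]
stack, top_features = pretrain_stack(patients, layer_sizes=[4, 2])
```

In a full deep belief network, a classifier layer would then be placed on top of `top_features` and the whole stack fine-tuned with back propagation.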
Affiliation(s)
- Xing Yang
  - China Unicom Research Institute, China
- Manlu Liu
  - Rochester Institute of Technology, USA
- Yuhan Ma
  - Beijing Jiaotong University, China
6
Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Khanna NN, Ruzsa Z, Sharma AM, Saxena S, Faa G, Laird JR, Johri AM, Kalra MK, Paraskevas KI, Saba L. A Powerful Paradigm for Cardiovascular Risk Stratification Using Multiclass, Multi-Label, and Ensemble-Based Machine Learning Paradigms: A Narrative Review. Diagnostics (Basel) 2022; 12:diagnostics12030722. [PMID: 35328275 PMCID: PMC8947682 DOI: 10.3390/diagnostics12030722] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 02/21/2022] [Revised: 03/10/2022] [Accepted: 03/13/2022] [Indexed: 12/16/2022] Open
Abstract
Background and Motivation: Cardiovascular disease (CVD) causes the highest mortality globally. With escalating healthcare costs, early non-invasive CVD risk assessment is vital. Conventional methods have shown poor performance compared to more recent and fast-evolving Artificial Intelligence (AI) methods. The proposed study reviews the three most recent paradigms for CVD risk assessment, namely multiclass, multi-label, and ensemble-based methods in (i) office-based and (ii) stress-test laboratories. Methods: A total of 265 CVD-based studies were selected using the preferred reporting items for systematic reviews and meta-analyses (PRISMA) model. The study analyzed the above three paradigms using machine learning (ML) frameworks, owing to their popularity and recent development. We comprehensively review these three methods using attributes such as architecture, applications, pros and cons, scientific validation, clinical evaluation, and AI risk-of-bias (RoB) in the CVD framework. These ML techniques were then extended under mobile and cloud-based infrastructure. Findings: The most popular biomarkers used were office-based, laboratory-based, image-based phenotypes, and medication usage. Surrogate carotid scanning for coronary artery risk prediction has shown promising results. Ground truth (GT) selection for AI-based training, along with scientific and clinical validation, is very important for CVD stratification to avoid RoB. The most popular classification paradigm was observed to be multiclass, followed by ensemble and multi-label. The use of deep learning techniques in CVD risk stratification is at a very early stage of development. Mobile and cloud-based AI technologies are more likely to be the future. Conclusions: AI-based methods for CVD risk assessment are most promising and successful. Choice of GT is most vital in AI-based models to prevent RoB. The amalgamation of image-based strategies with conventional risk factors provides the highest stability when using the three CVD paradigms in non-cloud and cloud-based frameworks.
Affiliation(s)
- Jasjit S. Suri
  - Stroke Diagnostic and Monitoring Division, AtheroPoint™, Roseville, CA 95661, USA
  - Correspondence: Tel.: +1-(916)-749-5628
- Mrinalini Bhagawati
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Sudip Paul
  - Department of Biomedical Engineering, North-Eastern Hill University, Shillong 793022, India
- Athanasios D. Protogerou
  - Research Unit Clinic, Laboratory of Pathophysiology, Department of Cardiovascular Prevention, National and Kapodistrian University of Athens, 11527 Athens, Greece
- Petros P. Sfikakis
  - Rheumatology Unit, National Kapodistrian University of Athens, 11527 Athens, Greece
- George D. Kitas
  - Arthritis Research UK Centre for Epidemiology, Manchester University, Manchester 46962, UK
- Narendra N. Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi 110020, India
- Zoltan Ruzsa
  - Department of Internal Medicines, Invasive Cardiology Division, University of Szeged, 6720 Szeged, Hungary
- Aditya M. Sharma
  - Division of Cardiovascular Medicine, University of Virginia, Charlottesville, VA 22903, USA
- Sanjay Saxena
  - Department of CSE, International Institute of Information Technology, Bhubaneswar 751003, India
- Gavino Faa
  - Department of Pathology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
- John R. Laird
  - Cardiology Department, St. Helena Hospital, St. Helena, CA 94574, USA
- Amer M. Johri
  - Department of Medicine, Division of Cardiology, Queen’s University, Kingston, ON K7L 3N6, Canada
- Manudeep K. Kalra
  - Department of Radiology, Massachusetts General Hospital, Boston, MA 02114, USA
- Kosmas I. Paraskevas
  - Department of Vascular Surgery, Central Clinic of Athens, N. Iraklio, 14122 Athens, Greece
- Luca Saba
  - Department of Radiology, A.O.U., di Cagliari-Polo di Monserrato s.s., 09045 Cagliari, Italy
7
Prasanna SL, Challa NP. Heart Disease Prediction Using Optimal Mayfly Technique with Ensemble Models. International Journal of Swarm Intelligence Research 2022. [DOI: 10.4018/ijsir.313665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
This paper proposes a two-phase methodology: attribute selection, followed by classification based on the selected attributes. Phase 1 introduces a new feature selection algorithm, the optimal mayfly algorithm (OMA), to solve the feature selection problem. The mayfly algorithm selects features of physiological and anatomical relevance, such as ST depression, maximum heart rate, cholesterol, chest pain, and heart vessels. In the second phase, the selected attributes are passed to ensemble classifiers such as random subspace, bagging, and boosting. True disease, false disease, accuracy, and specificity are measured to evaluate the proposed system's efficiency. The proposed method, which combines feature selection with ensemble techniques, performs well: OMA combined with the boosting ensemble classifier reached a model accuracy of 97.12%, the highest accuracy compared to any single model.
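The evaluation measures named in this abstract can be computed from a confusion matrix, as in the short sketch below. It assumes that "true disease" and "false disease" correspond to true-positive and false-positive counts, which the abstract does not state explicitly. The labels and predictions are invented for illustration, and `evaluate` is a hypothetical helper.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Accuracy, sensitivity and specificity; a ratio is None when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "true_disease": tp,
        "false_disease": fp,
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Hypothetical labels and ensemble predictions for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
report = evaluate(y_true, y_pred)
```

Reporting specificity alongside accuracy matters for a screening model, because accuracy alone can hide a high false-alarm rate when one class is much more common than the other.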
8
Computational Learning Model for Prediction of Heart Disease Using Machine Learning Based on a New Regularizer. Computational Intelligence and Neuroscience 2021; 2021:8628335. [PMID: 34804150 PMCID: PMC8601816 DOI: 10.1155/2021/8628335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/19/2021] [Accepted: 10/25/2021] [Indexed: 11/17/2022]
Abstract
Heart diseases are characterized as heterogeneous diseases comprising multiple subtypes. Early diagnosis and prognosis of heart disease are essential to facilitate the clinical management of patients. In this research, a new computational model for predicting early heart disease is proposed. The predictive model is embedded in a new regularization based on decaying the weights according to the weight matrices' standard deviation and comparing the results against its parents (RSD-ANN). The performance of RSD-ANN is far better than that of the existing methods. Based on our experiments, the average validation accuracy computed was 96.30% using either the tenfold cross-validation or holdout method.
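The abstract does not give the exact form of the regularizer. One possible reading, shown here purely as an assumption, is a penalty proportional to the standard deviation of each weight matrix, added to the data loss. The functions `std_penalty` and `regularized_loss`, the `strength` parameter and the example weights are all hypothetical and may differ from the RSD-ANN formulation.

```python
from statistics import pstdev


def std_penalty(weight_matrices, strength=0.01):
    """Sum of per-matrix weight standard deviations, scaled by `strength`."""
    return strength * sum(
        pstdev([w for row in matrix for w in row]) for matrix in weight_matrices
    )


def regularized_loss(data_loss, weight_matrices, strength=0.01):
    """Data loss plus the assumed standard-deviation penalty."""
    return data_loss + std_penalty(weight_matrices, strength)


# Hypothetical two-layer network weights.
layers = [
    [[1.0, -1.0], [1.0, -1.0]],  # spread-out weights: standard deviation 1.0
    [[0.5, 0.5]],                # identical weights: standard deviation 0.0
]
loss = regularized_loss(0.30, layers, strength=0.01)
```

Under this reading, a layer whose weights are widely spread is penalized more than a layer with uniform weights. Plain L2 weight decay differs in that it penalizes weight magnitude regardless of spread.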