1
Saleem MA, Javeed A, Akarathanawat W, Chutinet A, Suwanwela NC, Kaewplung P, Chaitusaney S, Deelertpaiboon S, Srisiri W, Benjapolakul W. An intelligent learning system based on electronic health records for unbiased stroke prediction. Sci Rep 2024; 14:23052. [PMID: 39367027 PMCID: PMC11452373 DOI: 10.1038/s41598-024-73570-x] [Received: 02/01/2024] [Accepted: 09/18/2024] [Indexed: 10/06/2024]
Abstract
Stroke has a negative impact on people's lives and is one of the leading causes of death and disability worldwide. Early detection of symptoms can significantly help predict stroke and promote a healthy lifestyle. Researchers have developed several methods to predict strokes using machine learning (ML) techniques. However, the proposed systems suffer from two main problems. The first is that ML models are biased because of the uneven distribution of classes in the dataset. Recent research has not adequately addressed this problem, and no preventive measures have been taken. We used the Synthetic Minority Oversampling Technique (SMOTE) to remove this bias and balance the training of the proposed ML model. The second problem is the low classification accuracy of ML models. We propose a learning system that combines an autoencoder with a linear discriminant analysis (LDA) model to increase the accuracy of stroke prediction. Relevant features are extracted from the feature space using the autoencoder, and the extracted subset is then fed into the LDA model for stroke classification. The hyperparameters of the LDA model are found using a grid search strategy. Because the conventional accuracy metric alone does not truly reflect the performance of ML models, we employed several evaluation metrics to validate the proposed model: accuracy, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). The experimental results show that the proposed model achieves a sensitivity and specificity of 98.51% and 97.56%, respectively, with an accuracy of 99.24% and a balanced accuracy of 98.00%.
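The oversampling idea mentioned in this abstract can be illustrated with a minimal sketch. It is a generic, standard-library illustration of SMOTE-style interpolation, not the authors' implementation; the function name `smote`, the parameters `n_new`, `k` and `seed`, and the toy data are all assumptions made for the example.

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points (illustrative sketch only).

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

# Hypothetical two-feature minority class (toy values, not study data).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.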
Affiliation(s)
- Muhammad Asim Saleem
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Ashir Javeed
  - Aging Research Center, Karolinska Institutet, 171 65, Stockholm, Sweden
- Wasan Akarathanawat
  - Division of Neurology, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
  - Chulalongkorn Stroke Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
  - Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Aurauma Chutinet
  - Division of Neurology, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
  - Chulalongkorn Stroke Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
  - Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Nijasri Charnnarong Suwanwela
  - Division of Neurology, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
  - Chulalongkorn Stroke Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
  - Chula Neuroscience Center, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, 10330, Thailand
- Pasu Kaewplung
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Surachai Chaitusaney
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Sunchai Deelertpaiboon
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Wattanasak Srisiri
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
- Watit Benjapolakul
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, 10330, Thailand
2
Nyholm J, Ghazi AN, Ghazi SN, Sanmartin Berglund J. Prediction of dementia based on older adults' sleep disturbances using machine learning. Comput Biol Med 2024; 171:108126. [PMID: 38342045 DOI: 10.1016/j.compbiomed.2024.108126] [Received: 11/01/2023] [Revised: 12/14/2023] [Accepted: 02/06/2024] [Indexed: 02/13/2024]
Abstract
BACKGROUND Dementia is the most common degenerative condition in older adults; it can be predicted using a number of indicators, and its progression can be slowed. One indicator of an increased risk of dementia is sleep disturbance. This study examines whether machine learning can predict dementia and which sleep disturbance factors affect it. METHODS This study uses five machine learning algorithms (gradient boosting, logistic regression, Gaussian naive Bayes, random forest and support vector machine) and data on the older population (60+) in Sweden from the Swedish National Study on Ageing and Care - Blekinge (n=4175). Each algorithm is evaluated with 10-fold stratified cross-validation; the results consist of the Brier score, used to check accuracy, and the feature importance, used to examine the factors that affect dementia. The algorithms use 16 features covering personal and sleep disturbance factors. RESULTS Logistic regression found an association between dementia and sleep disturbances, although it is slight for the features in the study. Gradient boosting was the most accurate algorithm, with 92.9% accuracy, a 0.926 F1-score, a 0.974 ROC AUC and a 0.056 Brier score. The significant factors differed between machine learning algorithms. Sleeping more than two hours during the day, sex, education level, age, waking up during the night and snoring are the variables that most consistently have the highest feature importance across all algorithms. CONCLUSION There is an association between sleep disturbances and dementia, which machine learning algorithms can predict. Furthermore, the risk factors for dementia are different across the algorithms, but sleep disturbances can predict dementia.
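The two evaluation ingredients named in this abstract, the Brier score and stratified cross-validation folds, can be sketched in a few lines. This is a generic standard-library illustration under assumed toy data, not the study's pipeline; the helper names `brier_score` and `stratified_folds` are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome.

    0.0 is a perfect score; lower is better.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def stratified_folds(labels, n_folds):
    """Assign sample indices to folds so each fold keeps the class ratio.

    Indices of each class are dealt round-robin across the folds.
    """
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        for position, i in enumerate(indices):
            folds[position % n_folds].append(i)
    return folds

# Toy predictions (assumed values, not study data).
score = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
print(round(score, 3))  # 0.055

# Imbalanced toy labels: 8 negatives, 2 positives, split into 2 folds.
labels = [0] * 8 + [1] * 2
folds = stratified_folds(labels, n_folds=2)
print(folds)
```

Stratification matters for imbalanced outcomes such as dementia: without it, a fold could end up with no positive cases at all, which would make per-fold metrics unstable.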
Affiliation(s)
- Joel Nyholm
  - Department of Computer Science, Blekinge Institute of Technology, Karlskrona, 37179, Blekinge, Sweden
- Ahmad Nauman Ghazi
  - Department of Software Engineering, Blekinge Institute of Technology, Karlskrona, 37179, Blekinge, Sweden
- Sarah Nauman Ghazi
  - Department of Health, Blekinge Institute of Technology, Karlskrona, 37179, Blekinge, Sweden
3
Tang Y, Xiong X, Tong G, Yang Y, Zhang H. Multimodal diagnosis model of Alzheimer's disease based on improved Transformer. Biomed Eng Online 2024; 23:8. [PMID: 38243275 PMCID: PMC10799436 DOI: 10.1186/s12938-024-01204-4] [Received: 04/05/2023] [Accepted: 01/08/2024] [Indexed: 01/21/2024]
Abstract
PURPOSE Recent technological advancements in data acquisition tools have allowed neuroscientists to acquire data from different modalities to diagnose Alzheimer's disease (AD). However, how to fuse this enormous amount of multimodal data to improve the recognition rate and find significant brain regions is still challenging. METHODS The algorithm used multimodal medical images [structural magnetic resonance imaging (sMRI) and positron emission tomography (PET)] as experimental data. Deep feature representations of sMRI and PET images are extracted by a 3D convolutional neural network (3DCNN). An improved Transformer is then used to progressively learn global correlation information among features. Finally, the information from different modalities is fused for identification. A model-based visualization method is used to explain the decisions of the model and identify brain regions related to AD. RESULTS The model attained a classification accuracy of 98.1% for AD using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Upon examining the visualization results, distinct brain regions associated with AD diagnosis were observed across different image modalities. Notably, the left parahippocampal region emerged consistently as a prominent and significant brain area. CONCLUSIONS A large number of comparative experiments were carried out, and the experimental results verify the reliability of the model. In addition, the model adopts a visualization analysis method based on its own characteristics, which improves its interpretability. Some disease-related brain regions were found in the visualization results, which provides reliable information for AD clinical research.
Affiliation(s)
- Yan Tang
  - School of Electronic Information, Central South University, Changsha, 410008, Hunan, People's Republic of China
  - Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, 541004, Guangxi, People's Republic of China
- Xing Xiong
  - School of Computer Science and Engineering, Central South University, Changsha, 410008, Hunan, People's Republic of China
- Gan Tong
  - School of Computer Science and Engineering, Central South University, Changsha, 410008, Hunan, People's Republic of China
- Yuan Yang
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Grainger College of Engineering, Urbana, IL, USA
- Hao Zhang
  - School of Electronic Information, Central South University, Changsha, 410008, Hunan, People's Republic of China
4
Javeed A, Anderberg P, Ghazi AN, Noor A, Elmståhl S, Berglund JS. Breaking barriers: a statistical and machine learning-based hybrid system for predicting dementia. Front Bioeng Biotechnol 2024; 11:1336255. [PMID: 38260734 PMCID: PMC10801181 DOI: 10.3389/fbioe.2023.1336255] [Received: 11/10/2023] [Accepted: 12/05/2023] [Indexed: 01/24/2024]
Abstract
Introduction: Dementia is a condition (a collection of related signs and symptoms) that causes a continuing deterioration in cognitive function, and millions of people are affected by dementia every year as the world population continues to rise. Conventional approaches for determining dementia rely primarily on clinical examinations, analysis of medical records, and cognitive and neuropsychological testing. However, these methods are time-consuming and costly. Therefore, this study presents a noninvasive method for the early prediction of dementia so that preventive steps can be taken. Methods: We developed a hybrid diagnostic system based on statistical and machine learning (ML) methods that used patient electronic health records to predict dementia. The dataset used for this study was obtained from the Swedish National Study on Aging and Care (SNAC), with a sample size of 43040 and 75 features. The newly constructed diagnostic system extracts a subset of useful features from the dataset through a statistical method (F-score). For the classification, we developed an ensemble voting classifier based on five different ML models: decision tree (DT), naive Bayes (NB), logistic regression (LR), support vector machine (SVM), and random forest (RF). To address the problem of ML model overfitting, we used a cross-validation approach to evaluate the performance of the proposed diagnostic system. Various assessment measures, such as accuracy, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the Matthews correlation coefficient (MCC), were used to thoroughly validate the devised diagnostic system's efficiency. Results: According to the experimental results, the proposed diagnostic method achieved the best accuracy of 98.25%, as well as sensitivity of 97.44%, specificity of 95.744%, and MCC of 0.7535.
Discussion: The effectiveness of the proposed diagnostic approach is compared to various cutting-edge feature selection techniques and baseline ML models. From experimental results, it is evident that the proposed diagnostic system outperformed the prior feature selection strategies and baseline ML models regarding accuracy.
Affiliation(s)
- Ashir Javeed
  - Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
- Peter Anderberg
  - Department of Health, Blekinge Institute of Technology, Karlskrona, Sweden
  - School of Health Sciences, University of Skövde, Skövde, Sweden
- Ahmad Nauman Ghazi
  - Department of Software Engineering, Blekinge Institute of Technology, Karlskrona, Sweden
- Adeeb Noor
  - Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Sölve Elmståhl
  - EpiHealth: Epidemiology for Health, Lund University, SUS Malmö, Malmö, Sweden
5
Saleem MA, Thien Le N, Asdornwised W, Chaitusaney S, Javeed A, Benjapolakul W. Sooty Tern Optimization Algorithm-Based Deep Learning Model for Diagnosing NSCLC Tumours. SENSORS (BASEL, SWITZERLAND) 2023; 23:2147. [PMID: 36850744 PMCID: PMC9959990 DOI: 10.3390/s23042147] [Received: 12/23/2022] [Revised: 02/05/2023] [Accepted: 02/08/2023] [Indexed: 06/18/2023]
Abstract
Lung cancer is one of the most common causes of cancer deaths in the modern world. Screening of lung nodules is essential for early recognition to facilitate treatment that improves the rate of patient rehabilitation. Although several research works have been conducted in this domain, increasing the accuracy of lung cancer detection remains vital for sustaining the rate of patient persistence. Moreover, classical systems fail to segment cancer cells of different sizes accurately and reliably. This paper proposes a sooty tern optimization algorithm-based deep learning (DL) model for diagnosing non-small cell lung cancer (NSCLC) tumours with increased accuracy. We discuss various algorithms for diagnosing models that adopt the Otsu segmentation method to isolate the lung nodules. Then, the sooty tern optimization algorithm (SHOA) is adopted for partitioning the cancer nodules by defining the best characteristics, which aids in improving diagnostic accuracy. The model further utilizes a local binary pattern (LBP) for feature retrieval from the lung nodules. In addition, it adopts CNN and GRU-based classifiers to identify whether the lung nodules are malignant or non-malignant, depending on the features retrieved during the diagnosing process. In the experiments, this SHOA-optimized DNN model achieved an accuracy of 98.32%, better than the baseline schemes used for comparison.
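The Otsu segmentation step mentioned in this abstract selects a grey-level threshold that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal standard-library sketch of the classic method on a flat list of 8-bit intensities; it is not the paper's pipeline, and the function name `otsu_threshold` and the toy pixel values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form the background class, the rest the foreground.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # all pixels are background from here on
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy "image": four dark pixels and four bright pixels (assumed values).
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
threshold = otsu_threshold(pixels)
mask = [1 if value > threshold else 0 for value in pixels]
print(threshold, mask)
```

On this toy input the threshold falls at the top of the dark cluster, so the mask separates the four bright pixels from the four dark ones. Real CT slices would first be converted to a fixed integer intensity range before a histogram-based method like this applies.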
Affiliation(s)
- Muhammad Asim Saleem
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
- Ngoc Thien Le
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
- Widhyakorn Asdornwised
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
- Surachai Chaitusaney
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
- Ashir Javeed
  - Aging Research Center, Karolinska Institutet, 171 65 Stockholm, Sweden
- Watit Benjapolakul
  - Center of Excellence in Artificial Intelligence, Machine Learning and Smart Grid Technology, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
6
Early Prediction of Dementia Using Feature Extraction Battery (FEB) and Optimized Support Vector Machine (SVM) for Classification. Biomedicines 2023; 11:biomedicines11020439. [PMID: 36830975 PMCID: PMC9953011 DOI: 10.3390/biomedicines11020439] [Received: 01/15/2023] [Revised: 01/30/2023] [Accepted: 01/31/2023] [Indexed: 02/05/2023]
Abstract
Dementia is a cognitive disorder that mainly targets older adults. At present, dementia has no cure or prevention available. Scientists have found that dementia symptoms might emerge as early as ten years before the onset of the disease itself. As a result, machine learning (ML) scientists have developed various techniques for the early prediction of dementia using dementia symptoms. However, these methods have fundamental limitations, such as low accuracy and bias in ML models. To resolve the issue of bias in the proposed ML model, we deployed the adaptive synthetic sampling (ADASYN) technique, and to improve accuracy, we have proposed a novel feature extraction technique, the feature extraction battery (FEB), together with an optimized support vector machine (SVM) using a radial basis function (RBF) kernel for the classification of the disease. The hyperparameters of the SVM are calibrated by employing the grid search approach. It is evident from the experimental results that the newly proposed model (FEB-SVM) improves the dementia prediction accuracy of the conventional SVM by 6%. The proposed model (FEB-SVM) obtained 98.28% accuracy on training data and a testing accuracy of 93.92%. Along with accuracy, the proposed model obtained a precision of 91.80%, recall of 86.59%, F1-score of 89.12%, and Matthews correlation coefficient (MCC) of 0.4987. Moreover, the newly proposed model (FEB-SVM) outperforms the 12 state-of-the-art ML models that researchers have recently presented for dementia prediction.
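The F1-score quoted in this abstract is the harmonic mean of precision and recall, so the three reported figures can be checked against each other. The short sketch below does that arithmetic; the helper name `f1_score` is hypothetical, and the inputs are the precision and recall values stated in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(91.80, 86.59)
print(f"{f1:.2f}")  # 89.12
```

The result agrees with the reported F1-score of 89.12%, so the precision, recall and F1 figures are mutually consistent.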