1
Hassan SU, Abdulkadir SJ, Zahid MSM, Al-Selwi SM. Local interpretable model-agnostic explanation approach for medical imaging analysis: A systematic literature review. Comput Biol Med 2024; 185:109569. [PMID: 39705792 DOI: 10.1016/j.compbiomed.2024.109569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 10/30/2024] [Accepted: 12/10/2024] [Indexed: 12/23/2024]
Abstract
BACKGROUND The interpretability and explainability of machine learning (ML) and artificial intelligence systems are critical for generating trust in their outcomes in fields such as medicine and healthcare. Errors generated by these systems, such as inaccurate diagnoses or treatments, can have serious and even life-threatening effects on patients. Explainable Artificial Intelligence (XAI) is emerging as an increasingly significant area of research nowadays, focusing on the black-box aspect of sophisticated and difficult-to-interpret ML algorithms. XAI techniques such as Local Interpretable Model-Agnostic Explanations (LIME) can give explanations for these models, raising confidence in the systems and improving trust in their predictions. Numerous works have been published that respond to medical problems through the use of ML models in conjunction with XAI algorithms to give interpretability and explainability. The primary objective of the study is to evaluate the performance of the newly emerging LIME techniques within healthcare domains that require more attention in the realm of XAI research. METHOD A systematic search was conducted in numerous databases (Scopus, Web of Science, IEEE Xplore, ScienceDirect, MDPI, and PubMed) that identified 1614 peer-reviewed articles published between 2019 and 2023. RESULTS 52 articles were selected for detailed analysis that showed a growing trend in the application of LIME techniques in healthcare, with significant improvements in the interpretability of ML models used for diagnostic and prognostic purposes. CONCLUSION The findings suggest that the integration of XAI techniques, particularly LIME, enhances the transparency and trustworthiness of AI systems in healthcare, thereby potentially improving patient outcomes and fostering greater acceptance of AI-driven solutions among medical professionals.
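As a rough illustration of the idea behind LIME mentioned in this abstract, the sketch below perturbs a single instance, weights the perturbed samples by their proximity to it, and fits a weighted linear surrogate whose coefficients act as the local explanation. It is not the `lime` package's implementation and is not taken from the reviewed studies; the `black_box` function, the Gaussian perturbation scale, the exponential kernel, and the sample count are all assumptions chosen for illustration.

```python
import math
import random


def black_box(x):
    # Hypothetical opaque model: a logistic score over two features.
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 1.0 * x[1])))


def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lime_like_explanation(model, instance, n_samples=500, scale=0.5,
                          kernel_width=0.75, seed=0):
    # Assumed setup: Gaussian perturbations and an exponential proximity kernel.
    rng = random.Random(seed)
    d = len(instance)
    # Accumulate the weighted normal equations (X^T W X) beta = X^T W y.
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [v + rng.gauss(0.0, scale) for v in instance]
        dist2 = sum((p - q) ** 2 for p, q in zip(z, instance))
        weight = math.exp(-dist2 / kernel_width ** 2)
        row = [1.0] + z  # leading 1.0 models the intercept
        y = model(z)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    beta = solve(xtwx, xtwy)
    return beta[1:]  # per-feature local weights, intercept dropped


local_weights = lime_like_explanation(black_box, [0.2, 0.1])
```

The sign and relative size of each entry in `local_weights` indicate how the surrogate attributes the prediction to each feature near the chosen instance; the explanation is only valid locally.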
Affiliation(s)
- Shahab Ul Hassan
  - Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Centre for Intelligent Signal & Imaging Research (CISIR), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
- Said Jadid Abdulkadir
  - Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Center for Research in Data Science (CeRDaS), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
- M Soperi Mohd Zahid
  - Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Centre for Intelligent Signal & Imaging Research (CISIR), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
- Safwan Mahmood Al-Selwi
  - Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia; Center for Research in Data Science (CeRDaS), Universiti Teknologi PETRONAS, Seri Iskandar, 32610, Perak, Malaysia.
2
Kaur BP, Singh H, Hans R, Sharma SK, Sharma C, Hassan MM. A Genetic algorithm aided hyper parameter optimization based ensemble model for respiratory disease prediction with Explainable AI. PLoS One 2024; 19:e0308015. [PMID: 39621641 PMCID: PMC11611116 DOI: 10.1371/journal.pone.0308015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 07/16/2024] [Indexed: 12/09/2024] Open
Abstract
In the current era, a lot of research is being done in the domain of disease diagnosis using machine learning. In recent times, one of the deadliest respiratory diseases, COVID-19, which causes serious damage to the lungs, has claimed a lot of lives globally. Machine learning-based systems can assist clinicians in the early diagnosis of the disease, which can reduce its deadly effects. For the successful deployment of these machine learning-based systems, hyperparameter optimization and feature selection are important issues. Motivated by the above, we design an improved model to predict the existence of respiratory disease among patients by incorporating hyperparameter optimization and feature selection. To optimize the parameters of the machine learning algorithms, hyperparameter optimization with a genetic algorithm is proposed, and to reduce the size of the feature set, feature selection is performed using the binary grey wolf optimization algorithm. Moreover, to enhance the efficacy of the predictions made by hyperparameter-optimized machine learning models, an ensemble model is proposed using a stacking classifier. Also, explainable AI was incorporated to define the feature importance by making use of SHapley Additive exPlanations (SHAP) values. For the experimentation, the publicly accessible Mexico clinical dataset of COVID-19 was used. The results obtained show that the proposed model has superior prediction accuracy in comparison to its counterparts. Moreover, among all the hyperparameter-optimized algorithms, AdaBoost outperformed the others. Various performance assessment metrics, including accuracy, precision, recall, AUC, and F1-score, were used to assess the results.
Affiliation(s)
- Balraj Preet Kaur
  - Department of Computer Science and Engineering, DAV University, Jalandhar, Punjab, India
- Harpreet Singh
  - Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, India
- Rahul Hans
  - Department of Computer Science and Engineering, DAV University, Jalandhar, Punjab, India
- Sanjeev Kumar Sharma
  - Department of Computer Science and Applications, DAV University, Jalandhar, Punjab, India
- Chetna Sharma
  - Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India
- Md. Mehedi Hassan
  - Computer Science and Engineering Discipline, Khulna University, Khulna, Bangladesh
3
Talib MA, Afadar Y, Nasir Q, Nassif AB, Hijazi H, Hasasneh A. A tree-based explainable AI model for early detection of Covid-19 using physiological data. BMC Med Inform Decis Mak 2024; 24:179. [PMID: 38915001 PMCID: PMC11194929 DOI: 10.1186/s12911-024-02576-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 06/13/2024] [Indexed: 06/26/2024] Open
Abstract
With the outbreak of COVID-19 in 2020, countries worldwide faced significant concerns and challenges. Various studies have emerged utilizing Artificial Intelligence (AI) and Data Science techniques for disease detection. Although COVID-19 cases have declined, there are still cases and deaths around the world. Therefore, early detection of COVID-19 before the onset of symptoms has become crucial in reducing its extensive impact. Fortunately, wearable devices such as smartwatches have proven to be valuable sources of physiological data, including Heart Rate (HR) and sleep quality, enabling the detection of inflammatory diseases. In this study, we utilize an already-existing dataset that includes individual step counts and heart rate data to predict the probability of COVID-19 infection before the onset of symptoms. We train three main model architectures: the Gradient Boosting classifier (GB), CatBoost trees, and TabNet classifier to analyze the physiological data and compare their respective performances. We also add an interpretability layer to our best-performing model, which clarifies prediction results and allows a detailed assessment of effectiveness. Moreover, we created a private dataset by gathering physiological data from Fitbit devices to guarantee reliability and avoid bias. The same pre-trained models were then applied to this private dataset, and the results were documented. Using the CatBoost tree-based method, our best-performing model outperformed previous studies with an accuracy rate of 85% on the publicly available dataset. Furthermore, this identical pre-trained CatBoost model produced an accuracy of 81% when applied to the private dataset. The source code is available at https://github.com/OpenUAE-LAB/Covid-19-detection-using-Wearable-data.git .
Affiliation(s)
- Manar Abu Talib
  - Department of Computer Science, College of Computing and Informatics, University of Sharjah, P.O. Box 27272, Sharjah, UAE.
- Yaman Afadar
  - Department of Computer Engineering, College of Computing and Informatics, University of Sharjah, Sharjah, UAE
- Qassim Nasir
  - Department of Computer Engineering, College of Computing and Informatics, University of Sharjah, Sharjah, UAE
- Ali Bou Nassif
  - Department of Computer Engineering, College of Computing and Informatics, University of Sharjah, Sharjah, UAE
- Haytham Hijazi
  - Centre for Informatics and Systems of the University of Coimbra (CISUC), University of Coimbra, Coimbra, P-3030-290, Portugal
  - Intelligent Systems Department, Ahliya University, Bethlehem, P-150-199, Palestine
- Ahmad Hasasneh
  - Department of Natural, Engineering and Technology Sciences, Faculty of Graduate Studies, Arab American University, P.O. Box 240, Ramallah, Palestine
4
Khalili H, Wimmer MA. Towards Improved XAI-Based Epidemiological Research into the Next Potential Pandemic. Life (Basel) 2024; 14:783. [PMID: 39063538 PMCID: PMC11278356 DOI: 10.3390/life14070783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2024] [Revised: 06/16/2024] [Accepted: 06/19/2024] [Indexed: 07/28/2024] Open
Abstract
By applying AI techniques to a variety of pandemic-relevant data, artificial intelligence (AI) has substantially supported the control of the spread of the SARS-CoV-2 virus. Along with this, epidemiological machine learning studies of SARS-CoV-2 have been frequently published. While these models can be perceived as precise and policy-relevant to guide governments towards optimal containment policies, their black box nature can hamper building trust and relying confidently on the prescriptions proposed. This paper focuses on interpretable AI-based epidemiological models in the context of the recent SARS-CoV-2 pandemic. We systematically review existing studies, which jointly incorporate AI, SARS-CoV-2 epidemiology, and explainable AI approaches (XAI). First, we propose a conceptual framework by synthesizing the main methodological features of the existing AI pipelines of SARS-CoV-2. Upon the proposed conceptual framework and by analyzing the selected epidemiological studies, we reflect on current research gaps in epidemiological AI toolboxes and how to fill these gaps to generate enhanced policy support in the next potential pandemic.
Affiliation(s)
- Hamed Khalili
  - Research Group E-Government, Faculty of Computer Science, University of Koblenz, D-56070 Koblenz, Germany
5
Winalai C, Anupong S, Modchang C, Chadsuthi S. LSTM-Powered COVID-19 prediction in central Thailand incorporating meteorological and particulate matter data with a multi-feature selection approach. Heliyon 2024; 10:e30319. [PMID: 38711630 PMCID: PMC11070856 DOI: 10.1016/j.heliyon.2024.e30319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 04/23/2024] [Accepted: 04/23/2024] [Indexed: 05/08/2024] Open
Abstract
The COVID-19 pandemic has significantly impacted public health and necessitated urgent actions to mitigate its spread. Monitoring and predicting the outbreak's progression have become vital to devise effective strategies and allocate resources efficiently. This study presents a novel approach utilizing Multivariate Long Short-Term Memory (LSTM) to analyze and predict COVID-19 trends in Central Thailand, particularly emphasizing the multi-feature selection process. To consider a comprehensive view of the pandemic's dynamics, our research dataset encompasses epidemiological, meteorological, and particulate matter features, which were gathered from reliable sources. We propose a multi-feature selection technique to identify the most relevant and influential features that significantly impact the spread of COVID-19 in the region to enhance the model's performance. Our results highlight that relative humidity is the key factor driving COVID-19 transmission in Central Thailand. The proposed multi-feature selection technique significantly improves the model's accuracy, ensuring that only the most informative variables contribute to the predictions, avoiding the potential noise or redundancy from less relevant features. The proposed LSTM model demonstrates its capability to forecast COVID-19 cases, facilitating informed decision-making for public health authorities and policymakers.
Affiliation(s)
- Chanidapa Winalai
  - Department of Physics, Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand
- Suparinthon Anupong
  - Department of Chemistry, Mahidol Wittayanusorn School (MWIT), Salaya, Nakhon Pathom 73170, Thailand
- Charin Modchang
  - Biophysics Group, Department of Physics, Faculty of Science, Mahidol University, Bangkok 10400, Thailand
  - Centre of Excellence in Mathematics, CHE, Bangkok 10400, Thailand
  - Thailand Center of Excellence in Physics, CHE, 328 Si Ayutthaya Road, Bangkok 10400, Thailand
- Sudarat Chadsuthi
  - Department of Physics, Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand
6
Roy S, Singh J, Ray SS. Weighted Combination of Łukasiewicz implication and Fuzzy Jaccard similarity in Hybrid Ensemble Framework (WCLFJHEF) for Gene Selection. Comput Biol Med 2024; 170:107981. [PMID: 38262204 DOI: 10.1016/j.compbiomed.2024.107981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 01/02/2024] [Accepted: 01/12/2024] [Indexed: 01/25/2024]
Abstract
A framework is developed for gene expression analysis by introducing fuzzy Jaccard similarity (FJS) and combining it with Łukasiewicz implication through weights in a hybrid ensemble framework for gene selection in cancer. The method is called weighted combination of Łukasiewicz implication and fuzzy Jaccard similarity in hybrid ensemble framework (WCLFJHEF). While the fuzziness in Jaccard similarity is incorporated by using the existing Gödel fuzzy logic, the weights are obtained by maximizing the average F-score of selected genes in classifying the cancer patients. The patients are first divided into different clusters, based on the number of patient groups, using average linkage agglomerative clustering and a new score, called WCLFJ (weighted combination of Łukasiewicz implication and fuzzy Jaccard similarity). The genes are then selected from each cluster separately using filter-based Relief-F and wrapper-based SVMRFE (Support Vector Machine with Recursive Feature Elimination). A gene (feature) pool is created by considering the union of selected features for all the clusters. A set of informative genes is selected from the pool using the sequential backward floating search (SBFS) algorithm. Patients are then classified using Naïve Bayes (NB) and Support Vector Machine (SVM) separately, using the selected genes, and the related F-scores are calculated. The weights in WCLFJ are then updated iteratively to maximize the average F-score obtained from the results of the classifier. The effectiveness of WCLFJHEF is demonstrated on six gene expression datasets. The average values of accuracy, F-score, recall, precision and MCC over all the datasets are 95%, 94%, 94%, 94%, and 90%, respectively. The explainability of the selected genes is shown using SHapley Additive exPlanations (SHAP) values and this information is further used to rank them.
The relevance of the selected gene set is biologically validated using the KEGG Pathway, Gene Ontology (GO), and existing literature. It is seen that the genes that are selected by WCLFJHEF are candidates for genomic alterations in the various cancer types. The source code of WCLFJHEF is available at http://www.isical.ac.in/~shubhra/WCLFJHEF.html.
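The two ingredients named in this abstract can be sketched in a few lines. The abstract does not give the exact formulas, so everything below is an assumption: expression values are taken as already scaled to [0, 1], the fuzzy Jaccard similarity uses the Gödel operators (minimum as intersection, maximum as union), the Łukasiewicz implication is the standard min(1, 1 - a + b), and the weighted combination in the hypothetical `wclfj_like_score` is a simple convex mix with one weight `w`. The authors' actual score may differ; their source code is linked above.

```python
def lukasiewicz_implication(a, b):
    # Standard Lukasiewicz implication on truth values in [0, 1].
    return min(1.0, 1.0 - a + b)


def fuzzy_jaccard(x, y):
    # Goedel-style fuzzy Jaccard: sum of minima over sum of maxima.
    numerator = sum(min(a, b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    # Two all-zero profiles are treated as identical (an assumption).
    return numerator / denominator if denominator > 0 else 1.0


def wclfj_like_score(x, y, w=0.5):
    # Hypothetical combination: w weights a symmetric implication term,
    # (1 - w) weights the fuzzy Jaccard similarity.
    implication = sum(
        min(lukasiewicz_implication(a, b), lukasiewicz_implication(b, a))
        for a, b in zip(x, y)
    ) / len(x)
    return w * implication + (1.0 - w) * fuzzy_jaccard(x, y)


# Two made-up patient profiles over three genes, scaled to [0, 1].
patient_a = [0.2, 0.8, 0.5]
patient_b = [0.4, 0.6, 0.5]
similarity = wclfj_like_score(patient_a, patient_b, w=0.5)
```

In the framework described above, a score of this kind would feed the agglomerative clustering step, and `w` would be the quantity tuned iteratively against the average F-score.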
Affiliation(s)
- Sukriti Roy
  - Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, India.
- Joginder Singh
  - Center for Soft Computing Research, Indian Statistical Institute, Kolkata 700108, India.
- Shubhra Sankar Ray
  - Machine Intelligence Unit, Indian Statistical Institute, Kolkata 700108, India; Center for Soft Computing Research, Indian Statistical Institute, Kolkata 700108, India.
7
Liu J, Wu X, Xie Y, Tang Z, Xie Y, Gong S. Small samples-oriented intrinsically explainable machine learning using Variational Bayesian Logistic Regression: An intensive care unit readmission prediction case for liver transplantation patients. Expert Systems with Applications 2024; 235:121138. [DOI: 10.1016/j.eswa.2023.121138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
8
Chadaga K, Prabhu S, Bhat V, Sampathila N, Umakanth S, Upadya P S. COVID-19 diagnosis using clinical markers and multiple explainable artificial intelligence approaches: A case study from Ecuador. SLAS Technol 2023; 28:393-410. [PMID: 37689365 DOI: 10.1016/j.slast.2023.09.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/21/2023] [Revised: 08/16/2023] [Accepted: 09/06/2023] [Indexed: 09/11/2023]
Abstract
The COVID-19 pandemic erupted at the beginning of 2020 and proved fatal, causing many casualties worldwide. Immediate and precise screening of affected patients is critical for disease control. COVID-19 is often confused with various other respiratory disorders since the symptoms are similar. As of today, the reverse transcription-polymerase chain reaction (RT-PCR) test is utilized for diagnosing COVID-19. However, this approach is sometimes prone to producing erroneous and false negative results. Hence, finding a reliable diagnostic method that can validate the RT-PCR test results is crucial. Artificial intelligence (AI) and machine learning (ML) applications in COVID-19 diagnosis have proven to be beneficial. Hence, clinical markers have been utilized for COVID-19 diagnosis with the help of several classifiers in this study. Further, five different explainable artificial intelligence techniques have been utilized to interpret the predictions. Among all the algorithms, the k-nearest neighbor obtained the best performance, with an accuracy, precision, recall and F1-score of 84%, 85%, 84% and 84%, respectively. According to this study, the combination of clinical markers such as eosinophils, lymphocytes, red blood cells and leukocytes was significant in differentiating COVID-19. The classifiers can be utilized synchronously with the standard RT-PCR procedure, making diagnosis more reliable and efficient.
Affiliation(s)
- Krishnaraj Chadaga
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Srikanth Prabhu
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India.
- Vivekananda Bhat
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India
- Niranjana Sampathila
  - Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India.
- Shashikiran Umakanth
  - Department of Medicine, Dr. TMA Hospital, Manipal Academy of Higher Education, Manipal, India
- Sudhakara Upadya P
  - Manipal School of Information Sciences, Manipal Academy of Higher Education, Manipal, India
9
Kırboğa KK, Küçüksille EU, Naldan ME, Işık M, Gülcü O, Aksakal E. CVD22: Explainable artificial intelligence determination of the relationship of troponin to D-Dimer, mortality, and CK-MB in COVID-19 patients. Computer Methods and Programs in Biomedicine 2023; 233:107492. [PMID: 36965300 PMCID: PMC10023204 DOI: 10.1016/j.cmpb.2023.107492] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/04/2022] [Revised: 02/06/2023] [Accepted: 03/15/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND PURPOSE COVID-19, which emerged in Wuhan (China) at the end of 2019, is one of the deadliest and fastest-spreading pandemics. According to the World Health Organization (WHO), there are more than 100 million infectious cases worldwide. Therefore, research models are crucial for managing the pandemic scenario. However, because the behavior of this epidemic is so complex and difficult to understand, an effective model must not only produce accurate predictive results but must also have a clear explanation that enables human experts to act proactively. For this reason, an innovative study has been planned to diagnose Troponin levels in the COVID-19 process with explainable white box algorithms to reach a clear explanation. METHODS Using the pandemic data provided by Erzurum Training and Research Hospital (decision number: 2022/13-145), an interpretable explanation of Troponin data was provided in the COVID-19 process with SHapley Additive exPlanations (SHAP) algorithms. Five machine learning (ML) algorithms were developed. Model performances were determined based on training and test accuracies, precision, F1-score, recall, and AUC (Area Under the Curve) values. Feature importance was estimated according to Shapley values by applying the SHAP method to the model with high accuracy. The model created with Streamlit v.3.9 was integrated into the interface with the name CVD22. RESULTS Among the five machine learning (ML) models created with pandemic data, the best model was selected with the values of 1.0, 0.83, 0.86, 0.83, 0.80, and 0.91 in train and test accuracy, precision, F1-score, recall, and AUC values, respectively. As a result of feature selection and SHAP algorithms applied to the XGBoost model, it was determined that DDimer mean, mortality, CKMB (creatine kinase myocardial band), and Glucose were the features with the highest importance over the model estimation.
CONCLUSIONS Recent advances in new explainable artificial intelligence (XAI) models have successfully made it possible to predict the future using large historical datasets. Therefore, throughout the ongoing pandemic, CVD22 (https://cvd22covid.streamlitapp.com/) can be used as a guide to help authorities or medical professionals make the best decisions quickly.
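To make the Shapley-value attribution mentioned in this abstract concrete, the sketch below computes exact Shapley values for a tiny model by averaging each feature's marginal contribution over every feature ordering. This is not the `shap` library and not the study's XGBoost model: the `model` function, the three generic features, and the choice to represent an "absent" feature by a fixed baseline value are assumptions made for illustration (SHAP itself averages over background data and uses faster approximations).

```python
from itertools import permutations


def model(x):
    # Hypothetical scoring function with one interaction term.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


def shapley_values(f, instance, baseline):
    # Exact Shapley values by enumerating all feature orderings.
    # Feasible only for a handful of features (n! orderings).
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)  # start with every feature "absent"
        previous = f(current)
        for i in order:
            current[i] = instance[i]  # switch feature i on
            value = f(current)
            phi[i] += value - previous  # marginal contribution of i
            previous = value
    return [total / len(orderings) for total in phi]


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, instance, baseline)
```

The attributions in `phi` sum to `model(instance) - model(baseline)`, and the interaction term is split evenly between the two features that share it; ranking features by the magnitude of such values is the sense in which the abstract speaks of feature importance.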
Affiliation(s)
- Kevser Kübra Kırboğa
  - Bilecik Seyh Edebali University, Bioengineering Department, 11230, Bilecik, Turkey; Informatics Institute, Istanbul Technical University, Maslak, Istanbul, 34469, Turkey.
- Ecir Uğur Küçüksille
  - Süleyman Demirel University, Engineering Faculty, Department of Computer Engineering, Isparta 32260, Turkey
- Muhammet Emin Naldan
  - Bilecik Seyh Edebali University, Faculty of Medicine, Department of Anaesthesiology and Reanimation, 11230, Bilecik, Turkey
- Mesut Işık
  - Bilecik Seyh Edebali University, Bioengineering Department, 11230, Bilecik, Turkey
- Oktay Gülcü
  - Health Sciences University, Erzurum City Hospital, Department of Cardiology, Erzurum, Turkey
- Emrah Aksakal
  - Health Sciences University, Erzurum City Hospital, Department of Cardiology, Erzurum, Turkey
10
Chadaga K, Prabhu S, Bhat V, Sampathila N, Umakanth S, Chadaga R. A Decision Support System for Diagnosis of COVID-19 from Non-COVID-19 Influenza-like Illness Using Explainable Artificial Intelligence. Bioengineering (Basel) 2023; 10:439. [PMID: 37106626 PMCID: PMC10135993 DOI: 10.3390/bioengineering10040439] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/08/2023] [Revised: 03/27/2023] [Accepted: 03/29/2023] [Indexed: 04/03/2023] Open
Abstract
The coronavirus pandemic emerged in early 2020 and turned out to be deadly, killing a vast number of people all around the world. Fortunately, vaccines have been discovered, and they seem effective in controlling the severe prognosis induced by the virus. The reverse transcription-polymerase chain reaction (RT-PCR) test is the current gold standard for diagnosing different infectious diseases, including COVID-19; however, it is not always accurate. Therefore, it is crucial to find an alternative diagnosis method which can support the results of the standard RT-PCR test. Hence, a decision support system has been proposed in this study that uses machine learning and deep learning techniques to predict the COVID-19 diagnosis of a patient using clinical, demographic and blood markers. The patient data used in this research were collected from two Manipal hospitals in India, and a custom-made, stacked, multi-level ensemble classifier has been used to predict the COVID-19 diagnosis. Deep learning techniques such as deep neural networks (DNN) and one-dimensional convolutional networks (1D-CNN) have also been utilized. Further, explainable artificial intelligence (XAI) techniques such as Shapley additive explanations (SHAP), ELI5, local interpretable model-agnostic explanations (LIME), and QLattice have been used to make the models more precise and understandable. Among all of the algorithms, the multi-level stacked model obtained an excellent accuracy of 96%. The precision, recall, F1-score and AUC obtained were 94%, 95%, 94% and 98%, respectively. The models can be used as a decision support system for the initial screening of coronavirus patients and can also help ease the existing burden on medical infrastructure.
Affiliation(s)
- Krishnaraj Chadaga
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Srikanth Prabhu
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Vivekananda Bhat
  - Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Niranjana Sampathila
  - Department of Biomedical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Shashikiran Umakanth
  - Department of Medicine, Dr. TMA Hospital, Manipal Academy of Higher Education, Manipal 576104, India
- Rajagopala Chadaga
  - Department of Mechanical and Industrial Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
11
Lin MY, Chang YM, Li CC, Chao WC. Explainable Machine Learning to Predict Successful Weaning of Mechanical Ventilation in Critically Ill Patients Requiring Hemodialysis. Healthcare (Basel) 2023; 11:910. [PMID: 36981566 PMCID: PMC10048210 DOI: 10.3390/healthcare11060910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 03/18/2023] [Accepted: 03/19/2023] [Indexed: 03/30/2023] Open
Abstract
Lungs and kidneys are two vital and frequently injured organs among critically ill patients. In this study, we attempt to develop a weaning prediction model for patients with both respiratory and renal failure using an explainable machine learning (XML) approach. We used the eICU collaborative research database, which contained data from 335 ICUs across the United States. Four ML models, including XGBoost, GBM, AdaBoost, and RF, were used, with weaning prediction and feature windows, both at 48 h. The model's explanations were presented at the domain, feature, and individual levels by leveraging various techniques, including cumulative feature importance, the partial dependence plot (PDP), the Shapley additive explanations (SHAP) plot, and local explanation with the local interpretable model-agnostic explanations (LIME). We enrolled 1789 critically ill ventilated patients requiring hemodialysis, and 42.8% (765/1789) of them were weaned successfully from mechanical ventilation. The accuracies in XGBoost and GBM were better than those in the other models. The discriminative characteristics of six key features used to predict weaning were demonstrated through the application of the SHAP and PDP plots. By utilizing LIME, we were able to provide an explanation of the predicted probabilities and the associated reasoning for successful weaning on an individual level. In conclusion, we used an XML approach to establish a weaning prediction model in critically ill ventilated patients requiring hemodialysis.
Affiliation(s)
- Ming-Yen Lin
  - Department of Information Engineering and Computer Science, Feng Chia University, Taichung 407102, Taiwan
- Yuan-Ming Chang
  - Department of Information Engineering and Computer Science, Feng Chia University, Taichung 407102, Taiwan
- Chi-Chun Li
  - Department of Information Engineering and Computer Science, Feng Chia University, Taichung 407102, Taiwan
- Wen-Cheng Chao
  - Department of Critical Care Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan
  - Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung 402202, Taiwan
  - Department of Automatic Control Engineering, Feng Chia University, Taichung 407102, Taiwan
  - Big Data Center, National Chung Hsing University, Taichung 402202, Taiwan
12
Mahdi AY, Yuhaniz SS. Optimal feature selection using novel flamingo search algorithm for classification of COVID-19 patients from clinical text. Mathematical Biosciences and Engineering: MBE 2023; 20:5268-5297. [PMID: 36896545 DOI: 10.3934/mbe.2023244] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/18/2023]
Abstract
Though several AI-based models have been established for COVID-19 diagnosis, the machine-based diagnostic gap persists, making further efforts to combat this epidemic imperative. Because of the persistent need for a reliable system to choose features, we set out to create a new feature selection (FS) method and to develop a model to predict the COVID-19 virus from clinical texts. This study employs a newly developed methodology inspired by the flamingo's behavior to find a near-ideal feature subset for accurate diagnosis of COVID-19 patients. The best features are selected using a two-stage process. In the first stage, we implemented a term weighting technique, namely RTF-C-IEF, to quantify the significance of the features extracted. The second stage involves using a newly developed feature selection approach called the improved binary flamingo search algorithm (IBFSA), which chooses the most important and relevant features for COVID-19 patients. The proposed multi-strategy improvement process is at the heart of this study to improve the search algorithm. The primary objective is to broaden the algorithm's capabilities by increasing diversity and supporting exploration of the algorithm's search space. Additionally, a binary mechanism was used to improve the performance of the traditional FSA to make it appropriate for binary FS issues. Two datasets, totaling 3053 and 1446 cases, were used to evaluate the suggested model based on the Support Vector Machine (SVM) and other classifiers. The results showed that IBFSA has the best performance compared to numerous previous swarm algorithms. It was noted that the number of features in the chosen subsets was also drastically reduced, by 88%, while obtaining the best global optimal features.
Affiliation(s)
- Amir Yasseen Mahdi
  - Razak Faculty of Technology and Informatics, Universiti Teknologi Malaysia, Kuala Lumpur 54100, Malaysia
  - Computer sciences and mathematics college, University of Thi_Qar, Thi_Qar, 64000, Iraq
- Siti Sophiayati Yuhaniz
  - Razak Faculty of Technology and Informatics, Universiti Teknologi Malaysia, Kuala Lumpur 54100, Malaysia
13
Olañeta D, Morís DI, de Moura J, Marcos PJ, Rey EM, Novo J, Ortega M. Explainable learning to analyze the outcome of COVID-19 patients using clinical data. PROCEDIA COMPUTER SCIENCE 2023; 225:238-247. [DOI: 10.1016/j.procs.2023.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
14
Cut TG, Ciocan V, Novacescu D, Voicu A, Marinescu AR, Lazureanu VE, Muresan CO, Enache A, Dumache R. Autopsy Findings and Inflammatory Markers in SARS-CoV-2: A Single-Center Experience. Int J Gen Med 2022; 15:8743-8753. [PMID: 36597439 PMCID: PMC9805743 DOI: 10.2147/ijgm.s389300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 12/18/2022] [Indexed: 12/29/2022] [Open Access]
Abstract
Purpose: The systemic inflammatory response related to COVID-19 can be easily investigated in living patients. Unfortunately, not every biomarker is suitable for postmortem analysis since several factors may interfere. The aim of this study was to summarize key histopathological findings within each organ system due to COVID-19 and to assess whether inexpensive and widely available serological biomarkers such as CRP, IL-6, fibrinogen and d-Dimers, associated with adverse outcomes in COVID-19, can be implemented in a postmortem assessment. Patients and Methods: A total of 60 subjects divided into 2 groups were included. All subjects died outside a hospital setting and therefore did not receive specific or symptomatic therapies that could have modulated the inflammatory response. The first group included 45 subjects in whom mandatory autopsy was performed in order to establish the cause of death and macroscopic examination of the lungs was highly suggestive of SARS-CoV-2 infection. As controls (Group 2), 20 subjects who died from polytrauma in high velocity car accidents and suicide were selected. Bronchial fluids collected during the autopsy procedure were used for the RT-PCR diagnosis of SARS-CoV-2 and serum samples were sent for analysis of IL-6, CRP, d-Dimers and fibrinogen. Results: Compared with the control group, the subjects of the COVID-19 group were older (59±19.5 vs. 38±19.15 years, p=0.0002) and had more underlying comorbidities such as hypertension (60% vs. 35%, p=0.06) or were overweight (53.3% vs. 30%, p=0.08). The levels of CRP, IL-6, fibrinogen and d-Dimers in postmortem plasma samples were significantly higher in COVID-19 subjects than in the control group (p<0.0001). Moreover, the level of IL-6 was significantly higher in overweight patients (r=0.52, p<0.001). In all COVID-19 subjects, the histological examination revealed features corresponding to the exudative and/or proliferative phases of diffuse alveolar damage. Large pulmonary emboli were observed in 7 cases. Gross cardiac enlargement with left ventricular hypertrophy was observed in 19 cases. The most frequent pathological finding of the central nervous system was acute/early-subacute infarction. Conclusion: Due to the complexity of the inflammatory response, we postulate that a combination of biomarkers, rather than a single laboratory parameter, might be more effective in obtaining a reliable postmortem COVID-19 diagnosis.
Affiliation(s)
- Talida Georgiana Cut
  - Department of Infectious Diseases, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Victor Babes Clinical Hospital of Infectious Diseases and Pneumophtisiology Timisoara, Timisoara, Romania
  - Doctoral School Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Center for Ethics in Human Genetic Identifications, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Academy of Romanian Scientists, Bucharest, Romania
- Veronica Ciocan
  - Center for Ethics in Human Genetic Identifications, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Department of Forensic Medicine, Bioethics, Deontology and Medical Law, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Correspondence: Veronica Ciocan, Department of Forensic Medicine, Bioethics, Deontology and Medical Law, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania, Tel +40722944453, Email
- Dorin Novacescu
  - Doctoral School Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Academy of Romanian Scientists, Bucharest, Romania
- Adrian Voicu
  - Department of Medical Informatics and Biostatistics, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
- Adelina Raluca Marinescu
  - Department of Infectious Diseases, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Victor Babes Clinical Hospital of Infectious Diseases and Pneumophtisiology Timisoara, Timisoara, Romania
- Voichita Elena Lazureanu
  - Department of Infectious Diseases, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Victor Babes Clinical Hospital of Infectious Diseases and Pneumophtisiology Timisoara, Timisoara, Romania
- Camelia Oana Muresan
  - Center for Ethics in Human Genetic Identifications, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Department of Forensic Medicine, Bioethics, Deontology and Medical Law, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
- Alexandra Enache
  - Center for Ethics in Human Genetic Identifications, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Department of Forensic Medicine, Bioethics, Deontology and Medical Law, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
- Raluca Dumache
  - Center for Ethics in Human Genetic Identifications, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
  - Department of Forensic Medicine, Bioethics, Deontology and Medical Law, Victor Babes University of Medicine and Pharmacy Timisoara, Timisoara, Romania
15
Babukarthik RG, Chandramohan D, Tripathi D, Kumar M, Sambasivam G. COVID-19 identification in chest X-ray images using intelligent multi-level classification scenario. COMPUTERS & ELECTRICAL ENGINEERING : AN INTERNATIONAL JOURNAL 2022; 104:108405. [PMID: 36187137 PMCID: PMC9510091 DOI: 10.1016/j.compeleceng.2022.108405] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/30/2022] [Revised: 09/17/2022] [Accepted: 09/22/2022] [Indexed: 05/27/2023]
Abstract
COVID-19 is an evolving, transmissible respiratory disease that, as a global pandemic, has disrupted daily activity worldwide. It appeared in the city of Wuhan (China) in November 2019 and slowly started spreading to the rest of the world. The number of cases keeps increasing drastically, leading to a shortage of medical resources and testing kits worldwide. As physicians face this problem, several scientists and specialists in Artificial Intelligence (AI) are supporting healthcare professionals in the early detection of COVID-19 using chest X-ray image samples to determine the level of severity at a low cost. This paper proposes a Genetic Deep Learning Convolutional Neural Network (GDCNN) architecture that includes Huddle Particle Swarm Optimization (Huddle PSO) as an alternative to gradient descent. Huddle PSO performs better when combined with the GDCNN architecture. Chest X-ray images from publicly available datasets are used to train the model to predict and identify various pneumonia diseases. The proposed model performed better, with an accuracy of 97.23%, a sensitivity of 98.62%, a specificity of 97.0%, and a precision of 93.0%. The proposed model acts as a tool for earlier detection of COVID-19. In the future, we plan to apply the proposed model to larger datasets and to predict various lung diseases.
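The four figures reported in this abstract are all derived from the counts of a binary confusion matrix. A minimal sketch of those standard definitions follows; the counts are hypothetical and are not taken from the paper, and the function name is illustrative.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    def ratio(num, den):
        # Return 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall on positives
        "specificity": ratio(tn, tn + fp),   # recall on negatives
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical counts, chosen only for illustration.
m = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are each computed within one true class, so they are unaffected by class imbalance, whereas accuracy and precision depend on how many positives and negatives the test set contains.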
Affiliation(s)
- R G Babukarthik
  - Department of Computer Science and Engineering, Dayananda Sagar University, Bangalore 560078, India
- Dhasarathan Chandramohan
  - Department of Computer Science & Engineering, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
- Diwakar Tripathi
  - Department of Computer Science & Engineering, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
- Manish Kumar
  - Department of Computer Science & Engineering, Thapar Institute of Engineering & Technology, Patiala, Punjab, India
- G Sambasivam
  - School of Computing Science and Engineering, VIT Bhopal University, Madhya Pradesh, India
16
Smadi AA, Abugabah A, Al-Smadi AM, Almotairi S. SEL-COVIDNET: An intelligent application for the diagnosis of COVID-19 from chest X-rays and CT-scans. INFORMATICS IN MEDICINE UNLOCKED 2022; 32:101059. [PMID: 36033909 PMCID: PMC9398554 DOI: 10.1016/j.imu.2022.101059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2022] [Revised: 08/17/2022] [Accepted: 08/17/2022] [Indexed: 11/06/2022] [Open Access]
Abstract
COVID-19 detection from medical imaging is a difficult challenge that has piqued the interest of experts worldwide. Chest X-rays and computed tomography (CT) scanning are the essential imaging modalities for diagnosing COVID-19. Researchers are focusing their efforts on developing viable methods and rapid treatment procedures for this pandemic. Fast and accurate automated detection approaches have been devised to alleviate the demand on medical professionals. Deep Learning (DL) technologies have successfully recognized COVID-19 cases. This paper proposes a set of nine deep learning models for diagnosing COVID-19, based on transfer learning and implemented in a novel architecture (SEL-COVIDNET). We include a global average pooling layer, flattening, and two fully connected dense layers. The model’s effectiveness is evaluated using balanced and unbalanced COVID-19 radiography datasets. After that, our model’s performance is analyzed using six evaluation measures: accuracy, sensitivity, specificity, precision, F1-score, and Matthews correlation coefficient (MCC). Experiments demonstrated that the proposed SEL-COVIDNET with tuned DenseNet121, InceptionResNetV2, and MobileNetV3Large models outperformed comparable state-of-the-art (SOTA) methods for multi-class classification (COVID-19 vs. No-finding vs. Pneumonia) in terms of accuracy (98.52%), specificity (98.5%), sensitivity (98.5%), precision (98.7%), F1-score (98.7%), and MCC (97.5%). For the COVID-19 vs. No-finding classification, our method had an accuracy of 99.77%, a specificity of 99.85%, a sensitivity of 99.85%, a precision of 99.55%, an F1-score of 99.7%, and an MCC of 99.4%. The proposed model offers an accurate approach for detecting COVID-19 patients, which aids in the containment of the COVID-19 pandemic.
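Of the six measures listed, MCC is the one that uses all four confusion-matrix cells at once, which is why it is often reported for unbalanced datasets. The sketch below shows the standard binary formula only; how the paper computed MCC for the three-class task is not stated in the abstract, and the counts used here are hypothetical.

```python
import math


def matthews_corrcoef_binary(tp, fp, tn, fn):
    """Binary Matthews correlation coefficient from confusion counts.

    MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn)).
    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only for illustration.
mcc = matthews_corrcoef_binary(tp=90, fp=10, tn=80, fn=20)
print(f"MCC = {mcc:.4f}")
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right).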
Affiliation(s)
- Ahmad Al Smadi
  - School of Artificial Intelligence, Xidian University, No. 2 South Taibai Road, Xian, 710071, China
  - College of Technological Innovation, Zayed University, Abu Dhabi Campus, UAE
- Ahed Abugabah
  - College of Technological Innovation, Zayed University, Abu Dhabi Campus, UAE
- Ahmad Mohammad Al-Smadi
  - Department of Computer Science, Al-Balqa Applied University, Ajloun University College, Jordan
- Sultan Almotairi
  - Faculty of Community College, Majmaah University, Al Majma'ah, Saudi Arabia
17
Azadifar S, Rostami M, Berahmand K, Moradi P, Oussalah M. Graph-based relevancy-redundancy gene selection method for cancer diagnosis. Comput Biol Med 2022; 147:105766. [PMID: 35779479 DOI: 10.1016/j.compbiomed.2022.105766] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/10/2022] [Revised: 06/12/2022] [Accepted: 06/18/2022] [Indexed: 11/26/2022]
Abstract
Nowadays, microarray data processing is one of the most important applications in molecular biology for cancer diagnosis. A major task in microarray data processing is gene selection, which aims to find a subset of genes that have the least inner similarity and are most relevant to the target class. Removing unnecessary, redundant, or noisy data reduces the data dimensionality. This research advocates a graph-theoretic gene selection method for cancer diagnosis. Both unsupervised and supervised modes use well-known and successful social network approaches such as the maximum weighted clique criterion and edge centrality to rank genes. The suggested technique has two goals: (i) to maximize the relevancy of the chosen genes with the target class and (ii) to reduce their inner redundancy. In each iteration of this procedure, a maximum weighted clique is chosen. The appropriate genes are then chosen from among the existing features in this maximum clique using edge centrality and gene relevance. In the experiments, several datasets with different properties (Colon, Leukemia, SRBCT, Prostate Tumor, and Lung Cancer) are used to demonstrate the efficacy of the developed model. The method is compared with renowned filter-based gene selection approaches for cancer diagnosis, and the results demonstrate its clear superiority.
Affiliation(s)
- Saeid Azadifar
  - Department of Computer Engineering, University of Khajeh Nasir Toosi, Tehran, Iran
- Mehrdad Rostami
  - Centre for Machine Vision and Signal Processing, University of Oulu, Oulu, Finland
- Kamal Berahmand
  - School of Computer Science, Faculty of Science, Queensland University of Technology (QUT), Brisbane, Australia
- Parham Moradi
  - Department of Computer Engineering, University of Kurdistan, Sanandaj, Iran
- Mourad Oussalah
  - Centre for Machine Vision and Signal Processing, University of Oulu, Oulu, Finland
  - Research Unit of Medical Imaging, Physics, and Technology, Faculty of Medicine, University of Oulu, Finland
18
Novel Insights in Spatial Epidemiology Utilizing Explainable AI (XAI) and Remote Sensing. REMOTE SENSING 2022. [DOI: 10.3390/rs14133074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
Abstract
The COVID-19 pandemic has affected many aspects of human life around the world, due to its tremendous impact on public health and socio-economic activities. Policy makers have tried to develop efficient responses based on technologies and advanced pandemic control methodologies, to limit the spread of the virus in urban areas. However, techniques such as social isolation and lockdown are short-term solutions that minimize the spread of the pandemic in cities and do not reverse long-term issues that derive from climate change, air pollution and urban planning challenges that enhance the spreading ability. Thus, it seems crucial to understand what kind of factors assist or prevent the spread of the virus. Although AI frameworks have a very efficient predictive ability as data-driven procedures, they often struggle to identify strong correlations among multidimensional data and provide robust explanations. In this paper, we propose the fusion of a heterogeneous, spatio-temporal dataset that combines data from eight European cities spanning from 1 January 2020 to 31 December 2021 and describes atmospheric, socio-economic, health, mobility and environmental factors, all related to potential links with COVID-19. Remote sensing data are key to monitoring the availability of public green spaces across cities in the study period. We therefore use the NIR and RED bands of satellite images to calculate the NDVI and determine the percentage of vegetation cover in each city for each week of our 2-year study. This novel dataset is evaluated by a tree-based machine learning algorithm that utilizes ensemble learning and is trained to make robust predictions on daily cases and deaths. Comparisons with other machine learning techniques confirm its robustness on the regression metrics RMSE and MAE. Furthermore, the explainable frameworks SHAP and LIME are utilized to identify the potential positive or negative influence of the factors at the global and local levels, with respect to our model’s predictive ability. A variation of SHAP, namely treeSHAP, is utilized for our tree-based algorithm to make fast and accurate explanations.
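The NDVI mentioned in this abstract is defined per pixel as (NIR - RED) / (NIR + RED). A minimal sketch of that calculation, and of a vegetation-cover percentage derived from it, is given below. The 0.3 cut-off, the function names, and the reflectance values are assumptions made for illustration; the abstract does not state which threshold or processing chain the authors used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Returns 0.0 when both bands are zero to avoid division by zero.
    """
    total = nir + red
    return (nir - red) / total if total else 0.0


def vegetation_cover_percent(nir_band, red_band, threshold=0.3):
    """Percentage of pixels whose NDVI exceeds the threshold.

    The default threshold of 0.3 is a hypothetical cut-off,
    not a value taken from the paper.
    """
    values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
    vegetated = sum(1 for v in values if v > threshold)
    return 100.0 * vegetated / len(values)


# Hypothetical reflectances for four pixels (flattened image).
nir_band = [0.5, 0.2, 0.6, 0.1]
red_band = [0.1, 0.2, 0.1, 0.1]
cover = vegetation_cover_percent(nir_band, red_band)
print(f"vegetation cover: {cover:.1f}%")
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so vegetated pixels score close to 1 while bare ground and built-up surfaces score near 0.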