1
Chen Z, Liang N, Li H, Zhang H, Li H, Yan L, Hu Z, Chen Y, Zhang Y, Wang Y, Ke D, Shi N. Exploring explainable AI features in the vocal biomarkers of lung disease. Comput Biol Med 2024; 179:108844. [PMID: 38981214 DOI: 10.1016/j.compbiomed.2024.108844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 05/15/2024] [Accepted: 06/04/2024] [Indexed: 07/11/2024]
Abstract
This review delves into the burgeoning field of explainable artificial intelligence (XAI) in the detection and analysis of lung diseases through vocal biomarkers. Lung diseases, often elusive in their early stages, pose a significant public health challenge. Recent advancements in AI have ushered in innovative methods for early detection, yet the black-box nature of many AI models limits their clinical applicability. XAI emerges as a pivotal tool, enhancing transparency and interpretability in AI-driven diagnostics. This review synthesizes current research on the application of XAI in analyzing vocal biomarkers for lung diseases, highlighting how these techniques elucidate the connections between specific vocal features and lung pathology. We critically examine the methodologies employed, the types of lung diseases studied, and the performance of various XAI models. The potential for XAI to aid in early detection, monitor disease progression, and personalize treatment strategies in pulmonary medicine is emphasized. Furthermore, this review identifies current challenges, including data heterogeneity and model generalizability, and proposes future directions for research. By offering a comprehensive analysis of explainable AI features in the context of lung disease detection, this review aims to bridge the gap between advanced computational approaches and clinical practice, paving the way for more transparent, reliable, and effective diagnostic tools.
Affiliation(s)
- Zhao Chen
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Ning Liang
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Haoyuan Li
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Haili Zhang
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Huizhen Li
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Lijiao Yan
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Ziteng Hu
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Yaxin Chen
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Yujing Zhang
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Yanping Wang
  - Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Dandan Ke
  - Special Disease Clinic, Huaishuling Branch of Beijing Fengtai Hospital of Integrated Traditional Chinese and Western Medicine, Beijing, China.
- Nannan Shi
  - State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China; Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China.
2
Nasir Y, Kadian K, Sharma A, Dwivedi V. Interpretable machine learning for dermatological disease detection: Bridging the gap between accuracy and explainability. Comput Biol Med 2024; 179:108919. [PMID: 39047502 DOI: 10.1016/j.compbiomed.2024.108919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 07/15/2024] [Accepted: 07/15/2024] [Indexed: 07/27/2024]
Abstract
Disease detection using machine learning techniques has been a significant research focus. Machine learning is important for detecting critical diseases promptly and providing appropriate treatment. Disease detection is a vital and sensitive task, and while machine learning models may provide a robust solution, they can come across as complex and unintuitive. Therefore, it is important to gain a better understanding of the predictions in order to trust the results. This paper takes up the crucial task of skin disease detection and introduces a hybrid machine learning model combining SVM and XGBoost for the detection task. The proposed model outperformed the existing machine learning models (Support Vector Machine (SVM), decision tree, and XGBoost) with an accuracy of 99.26%. High accuracy is essential for detecting skin disease because the similarity of symptoms makes it challenging to differentiate between conditions. To foster trust and gain insights into the results, we turn to the promising field of Explainable Artificial Intelligence (XAI). We explore two such frameworks for local as well as global explanations of these machine learning models, namely SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME).
Affiliation(s)
- Yusra Nasir
  - CSE, Indira Gandhi Delhi Technical University for Women, Kashmere Gate, New Delhi, 110006, Delhi, India.
- Karuna Kadian
  - CSE, Indira Gandhi Delhi Technical University for Women, Kashmere Gate, New Delhi, 110006, Delhi, India.
- Arun Sharma
  - IT, Indira Gandhi Delhi Technical University for Women, Kashmere Gate, New Delhi, 110006, Delhi, India.
- Vimal Dwivedi
  - School of Computing, Engineering & Intelligent Systems, Ulster University, Londonderry, Northern Ireland, United Kingdom.
3
Karimi Alavijeh M, Lee YY, Gras SL. A perspective-driven and technical evaluation of machine learning in bioreactor scale-up: A case-study for potential model developments. Eng Life Sci 2024; 24:e2400023. [PMID: 38975020 PMCID: PMC11223373 DOI: 10.1002/elsc.202400023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 03/01/2024] [Indexed: 07/09/2024] Open
Abstract
Bioreactor scale-up and scale-down have always been a topical issue for the biopharmaceutical industry and despite considerable effort, the identification of a fail-safe strategy for bioprocess development across scales remains a challenge. With the ubiquitous growth of digital transformation technologies, new scaling methods based on computer models may enable more effective scaling. This study aimed to evaluate the potential application of machine learning (ML) algorithms for bioreactor scale-up, with a specific focus on the prediction of scaling parameters. Factors critical to the development of such models were identified and data for bioreactor scale-up studies involving CHO cell-generated mAb products collated from the literature and public sources for the development of unsupervised and supervised ML models. Comparison of bioreactor performance across scales identified similarities between the different processes and primary differences between small- and large-scale bioreactors. A series of three case studies were developed to assess the relationship between cell growth and scale-sensitive bioreactor features. An embedding layer improved the capability of artificial neural network models to predict cell growth at a large-scale, as this approach captured similarities between the processes. Further models constructed to predict scaling parameters demonstrated how ML models may be applied to assist the scaling process. The development of data sets that include more characterization data with greater variability under different gassing and agitation regimes will also assist the future development of ML tools for bioreactor scaling.
Affiliation(s)
- Masih Karimi Alavijeh
  - Department of Chemical Engineering, The University of Melbourne, Parkville, Victoria, Australia
  - The Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, Australia
- Sally L. Gras
  - Department of Chemical Engineering, The University of Melbourne, Parkville, Victoria, Australia
  - The Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Parkville, Victoria, Australia
4
Trager MH, Gordon ER, Breneman A, Weng C, Samie FH. Artificial intelligence for nonmelanoma skin cancer. Clin Dermatol 2024:S0738-081X(24)00100-7. [PMID: 38925444 DOI: 10.1016/j.clindermatol.2024.06.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2024]
Abstract
Nonmelanoma skin cancers (NMSCs) are among the top five most common cancers globally. NMSC is an area with great potential for novel application of diagnostic tools including artificial intelligence (AI). In this scoping review, we aimed to describe the applications of AI in the diagnosis and treatment of NMSC. Twenty-nine publications described AI applications to dermatopathology including lesion classification and margin assessment. Twenty-five publications discussed AI use in clinical image analysis, showing that algorithms are not superior to dermatologists and may rely on unbalanced, nonrepresentative, and nontransparent training data sets. Sixteen publications described the use of AI in cutaneous surgery for NMSC including use in margin assessment during excisions and Mohs surgery, as well as predicting procedural complexity. Eleven publications discussed spectroscopy, confocal microscopy, thermography, and the AI algorithms that analyze and interpret their data. Ten publications pertained to AI applications for the discovery and use of NMSC biomarkers. Eight publications discussed the use of smartphones and AI, specifically how they enable clinicians and patients to have increased access to instant dermatologic assessments but with varying accuracies. Five publications discussed large language models and NMSC, including how they may facilitate or hinder patient education and medical decision-making. Three publications pertaining to the skin of color and AI for NMSC discussed concerns regarding limited diverse data sets for the training of convolutional neural networks. AI demonstrates tremendous potential to improve diagnosis, patient and clinician education, and management of NMSC. Despite excitement regarding AI, data sets are often not transparently reported, may include low-quality images, and may not include diverse skin types, limiting generalizability. 
AI may serve as a tool to increase access to dermatology services for patients in rural areas and save health care dollars. These benefits can only be achieved, however, with consideration of potential ethical costs.
Affiliation(s)
- Megan H Trager
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, USA
- Emily R Gordon
  - Columbia University Vagelos College of Physicians and Surgeons, New York, NY, USA
- Alyssa Breneman
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, USA
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Faramarz H Samie
  - Department of Dermatology, Columbia University Irving Medical Center, New York, NY, USA.
5
Liu Q, Zheng J, Xie A, Chen M, Gong RY, Sheng Y, Chen HL, Qi CB. Exosome, a Rising Biomarkers in Liquid Biopsy: Advances of Label-Free and Label Strategy for Diagnosis of Cancer. Crit Rev Anal Chem 2024:1-12. [PMID: 38669199 DOI: 10.1080/10408347.2024.2339961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/28/2024]
Abstract
Cancer is commonly considered one of the most severe diseases, posing a significant threat to human health and society. Key challenges include difficulties in accurate diagnosis and a high propensity to form metastases. Tissue biopsy remains the gold standard for diagnosing and subtyping cancer. However, concerns arise from its invasive nature and the potential risk of metastasis during these complex diagnostic procedures. Meanwhile, liquid biopsy has recently witnessed rapid advancements with the emergence of three prominent detection biomarkers: circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and exosomes. However, the very low abundance of CTCs and the instability of ctDNA intensify the challenges and decrease the accuracy of these two biomarkers for cancer diagnosis. In contrast, exosomes have gained widespread recognition as a promising biomarker in liquid biopsy due to their relatively low-invasive detection method, excellent biostability, rich resources, high abundance, and ability to provide valuable information about cancer. Therefore, it is crucial to systematically summarize recent advancements, mainly in exosome-based detection methods for early cancer diagnosis. Specifically, this review will primarily focus on label-based and label-free strategies for detecting cancer using exosomes. We anticipate that this comprehensive analysis will enhance readers' understanding of the significance and value of exosomes in the fields of cancer diagnosis and therapy.
Affiliation(s)
- Qian Liu
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Jing Zheng
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- An Xie
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Min Chen
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Rui-Yue Gong
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Yuan Sheng
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
- Hong-Lei Chen
  - Department of Pathology, School of Basic Medical Sciences, Wuhan University, Wuhan, China
- Chu-Bo Qi
  - Department of Pathology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China
6
Chung CW, Chou SC, Hsiao TH, Zhang GJ, Chung YF, Chen YM. Machine learning approaches to identify systemic lupus erythematosus in anti-nuclear antibody-positive patients using genomic data and electronic health records. BioData Min 2024; 17:1. [PMID: 38183082 PMCID: PMC10770905 DOI: 10.1186/s13040-023-00352-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 12/19/2023] [Indexed: 01/07/2024] Open
Abstract
BACKGROUND Although the 2019 EULAR/ACR classification criteria for systemic lupus erythematosus (SLE) require at least a positive anti-nuclear antibody (ANA) titer (≥ 1:80), it remains challenging for clinicians to identify patients with SLE. This study aimed to develop a machine learning (ML) approach to assist in the detection of SLE patients using genomic data and electronic health records. METHODS Participants with a positive ANA (≥ 1:80) were enrolled from the Taiwan Precision Medicine Initiative cohort. The Taiwan Biobank version 2 array was used to detect single nucleotide polymorphism (SNP) data. Six ML models, Logistic Regression, Random Forest (RF), Support Vector Machine, Light Gradient Boosting Machine, Gradient Tree Boosting, and Extreme Gradient Boosting (XGB), were used to identify SLE patients. The importance of the clinical and genetic features was determined by Shapley Additive Explanation (SHAP) values. A logistic regression model was applied to identify genetic variations associated with SLE in the subset of patients with an ANA equal to or exceeding 1:640. RESULTS A total of 946 SLE and 1,892 non-SLE controls were included in this analysis. Among the six ML models, RF and XGB demonstrated superior performance in the differentiation of SLE from non-SLE. The leading features in the SHAP diagram were anti-double strand DNA antibodies, ANA titers, AC4 ANA pattern, polygenic risk scores, complement levels, and SNPs. Additionally, in the subgroup with a high ANA titer (≥ 1:640), six SNPs positively associated with SLE and five SNPs negatively correlated with SLE were discovered. CONCLUSIONS ML approaches offer the potential to assist in diagnosing SLE and uncovering novel SNPs in a group of patients with autoimmunity.
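Note: The abstract ranks features by SHAP values, which are based on Shapley values from cooperative game theory. As a minimal sketch of that underlying idea, the block below computes exact Shapley values for a three-feature toy score by enumerating every coalition. It is not the authors' pipeline and does not use the `shap` library; `shapley_values`, `toy_score`, the feature names, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            for subset in combinations(others, k):
                # Shapley weight for a coalition of size k out of n players
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                s = frozenset(subset)
                # Marginal contribution of f when it joins coalition s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_score(present):
    """Hypothetical risk score; the numbers are invented, not taken from the study."""
    score = 0.0
    if "anti_dsDNA" in present:
        score += 0.4
    if "ANA_titer" in present:
        score += 0.2
    if "anti_dsDNA" in present and "complement" in present:
        score += 0.1  # interaction term shared between the two features
    return score


phi = shapley_values(toy_score, ["anti_dsDNA", "ANA_titer", "complement"])
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the score with all features present and the score with none, which is the additivity property that makes such values convenient for ranking features. Exact enumeration grows exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms.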
Affiliation(s)
- Chih-Wei Chung
  - Department of Information Management, National Taiwan University, Taipei, Taiwan
- Seng-Cho Chou
  - Department of Information Management, National Taiwan University, Taipei, Taiwan
- Tzu-Hung Hsiao
  - Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan
  - Department of Public Health, Fu Jen Catholic University, New Taipei City, Taiwan
  - Institute of Genomics and Bioinformatics, National Chung Hsing University, Taichung, Taiwan
- Grace Joyce Zhang
  - Department of Cellular and Physiological Sciences, The University of British Columbia, Vancouver, BC, Canada
- Yu-Fang Chung
  - Department of Electrical Engineering, Tunghai University, Taichung, Taiwan
- Yi-Ming Chen
  - Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan.
  - Division of Allergy, Immunology and Rheumatology, Department of Internal Medicine, Taichung Veterans General Hospital, 1650, Section 4, Taiwan Boulevard, Xitun Dist., Taichung City, 407, Taiwan.
  - Department of Post-Baccalaureate Medicine, College of Medicine, National Chung Hsing University, Taichung, Taiwan.
  - School of Medicine, College of Medicine, National Yang Ming Chiao Tung University, Taipei, Taiwan.
  - Rong Hsing Research Center for Translational Medicine & Ph.D. Program in Translational Medicine, National Chung Hsing University, Taichung, Taiwan.
  - Precision Medicine Research Center, College of Medicine, National Chung Hsing University, Taichung, Taiwan.
7
Sharma K, Saini N, Hasija Y. Identifying the mitochondrial metabolism network by integration of machine learning and explainable artificial intelligence in skeletal muscle in type 2 diabetes. Mitochondrion 2024; 74:101821. [PMID: 38040172 DOI: 10.1016/j.mito.2023.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 10/04/2023] [Accepted: 11/26/2023] [Indexed: 12/03/2023]
Abstract
Imbalance in glucose metabolism and insulin resistance are two primary features of type 2 diabetes/diabetes mellitus. Its etiology is linked to mitochondrial dysfunction in skeletal muscle tissue. The mitochondria are vital organelles involved in ATP synthesis and metabolism. Identifying the underlying biological pathways leading to mitochondrial dysfunction in type 2 diabetes can help us understand the pathophysiology of the disease. In this study, mitochondrial gene expression datasets were retrieved from GSE22309, GSE25462, and GSE18732 using Mitocarta 3.0, focusing specifically on genes associated with mitochondrial function in type 2 diabetes. Feature selection was performed on the expression dataset of skeletal muscle tissue from 107 control patients and 70 type 2 diabetes patients using the XGBoost algorithm, which had the highest accuracy. The results were interpreted by examining the feature importance deduced from the model using SHAP (SHapley Additive exPlanations). Next, to understand the biological connections, protein-protein and mRNA-miRNA networks were studied using String and Mienturnet, respectively. The analysis revealed BDH1, YARS2, AKAP10, RARS2, and MRPS31 as potential mitochondrial target genes among the other twenty genes. These genes are mainly involved in the transport and organization of mitochondria, regulation of their membrane potential, and intrinsic apoptotic signaling, among other processes. The mRNA-miRNA interaction network revealed a significant role of miR-375, miR-30a-5p, miR-16-5p, miR-129-5p, miR-1229-3p, and miR-1224-3p in the regulation of mitochondrial function, and these miRNAs exhibited strong associations with type 2 diabetes. These results might aid in the identification of novel therapeutic targets and biomarkers for type 2 diabetes.
Affiliation(s)
- Kritika Sharma
  - CSIR-Institute of Genomics and Integrative Biology, Mall Road, New Delhi 110007, India; Department of Biotechnology, Delhi Technological University, Delhi 110042, India
- Neeru Saini
  - CSIR-Institute of Genomics and Integrative Biology, Mall Road, New Delhi 110007, India; Academy of Scientific & Innovative Research (AcSIR), Ghaziabad 201002, India
- Yasha Hasija
  - Department of Biotechnology, Delhi Technological University, Delhi 110042, India.
8
Yagin FH, Yasar S, Gormez Y, Yagin B, Pinar A, Alkhateeb A, Ardigò LP. Explainable Artificial Intelligence Paves the Way in Precision Diagnostics and Biomarker Discovery for the Subclass of Diabetic Retinopathy in Type 2 Diabetics. Metabolites 2023; 13:1204. [PMID: 38132885 PMCID: PMC10745306 DOI: 10.3390/metabo13121204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/11/2023] [Accepted: 12/16/2023] [Indexed: 12/23/2023] Open
Abstract
Diabetic retinopathy (DR), a common ocular microvascular complication of diabetes, contributes significantly to diabetes-related vision loss. This study addresses the imperative need for early diagnosis of DR and precise treatment strategies based on the explainable artificial intelligence (XAI) framework. The study integrated clinical, biochemical, and metabolomic biomarkers associated with the following classes: non-DR (NDR), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR) in type 2 diabetes (T2D) patients. To create machine learning (ML) models, 10% of the data was divided into validation sets and 90% into discovery sets. The validation dataset was used for hyperparameter optimization and feature selection stages, while the discovery dataset was used to measure the performance of the models. A 10-fold cross-validation technique was used to evaluate the performance of ML models. Biomarker discovery was performed using minimum redundancy maximum relevance (mRMR), Boruta, and explainable boosting machine (EBM). The predictive proposed framework compares the results of eXtreme Gradient Boosting (XGBoost), natural gradient boosting for probabilistic prediction (NGBoost), and EBM models in determining the DR subclass. The hyperparameters of the models were optimized using Bayesian optimization. Combining EBM feature selection with XGBoost, the optimal model achieved (91.25 ± 1.88) % accuracy, (89.33 ± 1.80) % precision, (91.24 ± 1.67) % recall, (89.37 ± 1.52) % F1-Score, and (97.00 ± 0.25) % the area under the ROC curve (AUROC). According to the EBM explanation, the six most important biomarkers in determining the course of DR were tryptophan (Trp), phosphatidylcholine diacyl C42:2 (PC.aa.C42.2), butyrylcarnitine (C4), tyrosine (Tyr), hexadecanoyl carnitine (C16) and total dimethylarginine (DMA). 
The identified biomarkers may provide a better understanding of the progression of DR, paving the way for more precise and cost-effective diagnostic and treatment strategies.
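Note: The abstract reports accuracy, precision, recall, and F1-score for a three-class problem (NDR, NPDR, PDR). As a minimal sketch of how such scores are computed from predicted and true labels, the block below uses macro-averaging over classes. The averaging scheme is an assumption, since the abstract does not state which one was used; the function name `macro_metrics` and the toy labels are hypothetical and are not data from the study.

```python
def macro_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1 over the given labels."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n_labels = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n_labels,
        "recall": sum(recalls) / n_labels,
        "f1": sum(f1s) / n_labels,
    }


# Toy labels for illustration only (hypothetical, not study data)
y_true = ["NDR", "NDR", "NPDR", "NPDR", "PDR", "PDR"]
y_pred = ["NDR", "NPDR", "NPDR", "NPDR", "PDR", "NDR"]

scores = macro_metrics(y_true, y_pred, ["NDR", "NPDR", "PDR"])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

In a 10-fold cross-validation such as the one described, these scores would be computed once per fold and then summarized as mean ± standard deviation, which matches the form of the figures reported in the abstract.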
Affiliation(s)
- Fatma Hilal Yagin
  - Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey
- Seyma Yasar
  - Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey
- Yasin Gormez
  - Department of Management Information Systems, Faculty of Economics and Administrative Sciences, Sivas Cumhuriyet University, Sivas 58140, Turkey
- Burak Yagin
  - Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey
- Abdulvahap Pinar
  - Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey
- Luca Paolo Ardigò
  - Department of Teacher Education, NLA University College, Linstows Gate 3, 0166 Oslo, Norway
9
Lombardi A, Arezzo F, Di Sciascio E, Ardito C, Mongelli M, Di Lillo N, Fascilla FD, Silvestris E, Kardhashi A, Putino C, Cazzolla A, Loizzi V, Cazzato G, Cormio G, Di Noia T. A human-interpretable machine learning pipeline based on ultrasound to support leiomyosarcoma diagnosis. Artif Intell Med 2023; 146:102697. [PMID: 38042596 DOI: 10.1016/j.artmed.2023.102697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 10/08/2023] [Accepted: 10/29/2023] [Indexed: 12/04/2023]
Abstract
The preoperative evaluation of myometrial tumors is essential to avoid delayed treatment and to establish the appropriate surgical approach. Specifically, the differential diagnosis of leiomyosarcoma (LMS) is particularly challenging due to the overlapping of clinical, laboratory and ultrasound features between fibroids and LMS. In this work, we present a human-interpretable machine learning (ML) pipeline to support the preoperative differential diagnosis of LMS from leiomyomas, based on both clinical data and gynecological ultrasound assessment of 68 patients (8 with LMS diagnosis). The pipeline provides the following novel contributions: (i) end-users have been involved both in the definition of the ML tasks and in the evaluation of the overall approach; (ii) clinical specialists get a full understanding of both the decision-making mechanisms of the ML algorithms and the impact of the features on each automatic decision. Moreover, the proposed pipeline addresses some of the problems concerning both the imbalance of the two classes by analyzing and selecting the best combination of the synthetic oversampling strategy of the minority class and the classification algorithm among different choices, and the explainability of the features at global and local levels. The results show very high performance of the best strategy (AUC = 0.99, F1 = 0.87) and the strong and stable impact of two ultrasound-based features (i.e., tumor borders and consistency of the lesions). Furthermore, the SHAP algorithm was exploited to quantify the impact of the features at the local level and a specific module was developed to provide a template-based natural language (NL) translation of the explanations for enhancing their interpretability and fostering the use of ML in the clinical setting.
Affiliation(s)
- Angela Lombardi
  - Department of Electrical and Information Engineering (DEI), Politecnico di Bari, Bari, Italy.
- Francesca Arezzo
  - Gynecologic Oncology Unit, Interdisciplinar Department of Medicine, IRCCS Istituto Tumori "Giovanni Paolo II", Bari, Italy
- Eugenio Di Sciascio
  - Department of Electrical and Information Engineering (DEI), Politecnico di Bari, Bari, Italy
- Carmelo Ardito
  - Department of Engineering, LUM "Giuseppe Degennaro" University, Casamassima, Bari, Italy
- Michele Mongelli
  - Obstetrics and Gynecology Unit, Department of Biomedical Sciences and Human Oncology, University of Bari "Aldo Moro", Bari, Italy
- Nicola Di Lillo
  - Obstetrics and Gynecology Unit, Department of Biomedical Sciences and Human Oncology, University of Bari "Aldo Moro", Bari, Italy
- Erica Silvestris
  - Gynecologic Oncology Unit, Interdisciplinar Department of Medicine, IRCCS Istituto Tumori "Giovanni Paolo II", Bari, Italy
- Anila Kardhashi
  - Gynecologic Oncology Unit, Interdisciplinar Department of Medicine, IRCCS Istituto Tumori "Giovanni Paolo II", Bari, Italy
- Carmela Putino
  - Obstetrics and Gynecology Unit, Department of Biomedical Sciences and Human Oncology, University of Bari "Aldo Moro", Bari, Italy
- Ambrogio Cazzolla
  - Gynecologic Oncology Unit, Interdisciplinar Department of Medicine, IRCCS Istituto Tumori "Giovanni Paolo II", Bari, Italy
- Vera Loizzi
  - Gynecologic Oncology Unit, Interdisciplinar Department of Medicine, IRCCS Istituto Tumori "Giovanni Paolo II", Bari, Italy; Interdisciplinar Department of Medicine, University of Bari "Aldo Moro", Bari, Italy
- Gerardo Cazzato
  - Section of Pathology, Department of Emergency and Organ Transplantation (DETO), University of Bari "Aldo Moro", Bari, Italy
- Gennaro Cormio
  - Gynecologic Oncology Unit, Interdisciplinar Department of Medicine, IRCCS Istituto Tumori "Giovanni Paolo II", Bari, Italy; Interdisciplinar Department of Medicine, University of Bari "Aldo Moro", Bari, Italy
- Tommaso Di Noia
  - Department of Electrical and Information Engineering (DEI), Politecnico di Bari, Bari, Italy
10
Morabito F, Adornetto C, Monti P, Amaro A, Reggiani F, Colombo M, Rodriguez-Aldana Y, Tripepi G, D’Arrigo G, Vener C, Torricelli F, Rossi T, Neri A, Ferrarini M, Cutrona G, Gentile M, Greco G. Genes selection using deep learning and explainable artificial intelligence for chronic lymphocytic leukemia predicting the need and time to therapy. Front Oncol 2023; 13:1198992. [PMID: 37719021 PMCID: PMC10501728 DOI: 10.3389/fonc.2023.1198992] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/02/2023] [Accepted: 07/31/2023] [Indexed: 09/19/2023] Open
Abstract
Analyzing gene expression profiles (GEP) through artificial intelligence provides meaningful insight into cancer disease. This study introduces DeepSHAP Autoencoder Filter for Genes Selection (DSAF-GS), a novel deep learning and explainable artificial intelligence-based approach for feature selection in genomics-scale data. DSAF-GS exploits the autoencoder's reconstruction capabilities without changing the original feature space, enhancing the interpretation of the results. Explainable artificial intelligence is then used to select the informative genes for chronic lymphocytic leukemia prognosis of 217 cases from a GEP database comprising roughly 20,000 genes. The model for prognosis prediction achieved an accuracy of 86.4%, a sensitivity of 85.0%, and a specificity of 87.5%. According to the proposed approach, predictions were strongly influenced by CEACAM19 and PIGP, moderately influenced by MKL1 and GNE, and poorly influenced by other genes. The 10 most influential genes were selected for further analysis. Among them, FADD, FIBP, FIBP, GNE, IGF1R, MKL1, PIGP, and SLC39A6 were identified in the Reactome pathway database as involved in signal transduction, transcription, protein metabolism, immune system, cell cycle, and apoptosis. Moreover, according to the network model of the 3D protein-protein interaction (PPI) explored using the NetworkAnalyst tool, FADD, FIBP, IGF1R, QTRT1, GNE, SLC39A6, and MKL1 appear coupled into a complex network. Finally, all 10 selected genes showed a predictive power on time to first treatment (TTFT) in univariate analyses on a basic prognostic model including IGHV mutational status, del(11q) and del(17p), NOTCH1 mutations, β2-microglobulin, Rai stage, and B-lymphocytosis known to predict TTFT in CLL. 
However, only IGF1R [hazard ratio (HR) 1.41, 95% CI 1.08-1.84, P=0.013], COL28A1 (HR 0.32, 95% CI 0.10-0.97, P=0.045), and QTRT1 (HR 7.73, 95% CI 2.48-24.04, P<0.001) genes were significantly associated with TTFT in multivariable analyses when combined with the prognostic factors of the basic model, ultimately increasing the Harrell's c-index and the explained variation to 78.6% (versus 76.5% of the basic prognostic model) and 52.6% (versus 42.2% of the basic prognostic model), respectively. Also, the goodness of model fit was enhanced (χ2 = 20.1, P=0.002), indicating its improved performance above the basic prognostic model. In conclusion, DSAF-GS identified a group of significant genes for CLL prognosis, suggesting future directions for bio-molecular research.
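The abstract above reports Harrell's c-index as its headline measure of prognostic discrimination. As a rough illustration of what that number means, the following sketch counts concordant patient pairs in plain Python. It is not the authors' code: the function name, the toy follow-up times, and the risk scores are all hypothetical, and pairs with tied times are simply skipped, which is one of several conventions in use.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Share of comparable pairs in which the patient with the shorter
    observed time also has the higher predicted risk (ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # time is an observed event (not a censored follow-up).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: time to first treatment (months), event flag, model risk.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]  # 0 marks a censored patient
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)  # 4 of 5 comparable pairs agree
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the reported move from 76.5% to 78.6% reads as a modest gain in the share of correctly ordered pairs.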
Affiliation(s)
- Carlo Adornetto
- Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
- Paola Monti
- Mutagenesis and Cancer Prevention Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Adriana Amaro
- Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Francesco Reggiani
- Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Monica Colombo
- Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Giovanni Tripepi
- Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
- Graziella D’Arrigo
- Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
- Claudia Vener
- Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
- Federica Torricelli
- Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
- Teresa Rossi
- Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
- Antonino Neri
- Scientific Directorate, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
- Manlio Ferrarini
- Unità Operativa (UO) Molecular Pathology, Ospedale Policlinico San Martino Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Genoa, Italy
- Giovanna Cutrona
- Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Massimo Gentile
- Hematology Unit, Department of Onco-Hematology, Azienda Ospedaliera (A.O.) of Cosenza, Cosenza, Italy
- Department of Pharmacy and Health and Nutritional Sciences, University of Calabria, Cosenza, Italy
- Gianluigi Greco
- Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
11
Lalithadevi B, Krishnaveni S, Gnanadurai JSC. A Feasibility Study of Diabetic Retinopathy Detection in Type II Diabetic Patients Based on Explainable Artificial Intelligence. J Med Syst 2023; 47:85. [PMID: 37552340 DOI: 10.1007/s10916-023-01976-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/05/2022] [Accepted: 07/12/2023] [Indexed: 08/09/2023]
Abstract
Diabetic retinopathy (DR) is vision impairment and a life-threatening condition for diabetic patients. Especially type II diabetic people have higher chances of getting retinal problems. Hence, early prediction of DR is necessary for preventing the diabetic patients from vision impairment. The main aim of this feasibility study is to identify the most critical risk features that could lead to diabetic retinopathy. This study investigated type II diabetic patients' socio-analytical, diabetes, behavioral, and clinical risk factors. We conducted a self-individual questionnaire session for all participants. Our questionnaire asked about the reliability of results, feeling comfortable during the screening test, willingness to participate in future screenings, overall perspective, and satisfaction with the DR screening test. We proposed a random forest model for predicting the prevalence of DR risk among diabetics. Further explanations of the model were conducted using more robust SHAP eXplainable Artificial Intelligence (XAI) tools. The SHAP method makes it possible to understand how input variables interact with their representative output records, as well as how input variables are ranked. In addition, various descriptive and inferential statistical analyses were performed on the data and evaluated the significant relationship between the factors discussed above via hypothesis testing. This feasibility study involved 172 type II diabetic patients (73 males and 99 females). Therefore, we found that 81 (47.09%) out of 172 participants had referable DR. The average age of the patients was determined as 55.08, with a standard deviation of ± 9.770 (ranging from 40 to 79). Type II patients were affected by mild, moderate, severe, and advanced proliferative diabetic retinopathy (PDR) stages with 23.83%, 13.95%, 5.81%, and 3.48%, respectively, of the total samples. The developed RF model obtained high accuracy of 94.9% using clinical dataset. 
Our results showed that the formation of tiny microminiature lesions was noticeable in type II diabetic patients with aged people, abnormal blood glucose levels, and prolonged diabetes duration.
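The abstract above relies on SHAP to rank input variables and expose how they interact. To make the underlying idea concrete, the sketch below computes exact Shapley values for a tiny hand-written risk function by averaging each feature's marginal contribution over all feature orderings. This is the textbook definition, not the SHAP library's tree-specific algorithm and not the study's random forest; the function `toy_dr_risk`, its coefficients, and the patient and baseline values are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features switch from baseline to instance."""
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for k in order:
            current[k] = instance[k]  # reveal feature k
            value = predict(current)
            phi[k] += value - previous
            previous = value
    return [p / len(orderings) for p in phi]


def toy_dr_risk(x):
    """Hypothetical risk score with a glucose-by-duration interaction."""
    age, glucose, duration = x
    return 0.02 * age + 0.5 * glucose + 0.1 * duration + 0.05 * glucose * duration


patient = [60, 2.0, 10]   # hypothetical: age, scaled glucose, years with diabetes
reference = [50, 1.0, 5]  # hypothetical baseline patient
phi = shapley_values(toy_dr_risk, patient, reference)
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is split evenly between glucose and duration. Enumerating all orderings is only feasible for a handful of features, which is why practical tools approximate this quantity.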
Affiliation(s)
- B Lalithadevi
- Department of Computational Intelligence, SRM Institute of Science and Technology, Kattankulathur, Chennai, TN, India
- S Krishnaveni
- Department of Computational Intelligence, SRM Institute of Science and Technology, Kattankulathur, Chennai, TN, India
12
Ling D, Liu A, Sun J, Wang Y, Wang L, Song X, Zhao X. Integration of IDPC Clustering Analysis and Interpretable Machine Learning for Survival Risk Prediction of Patients with ESCC. Interdiscip Sci 2023:10.1007/s12539-023-00569-9. [PMID: 37248421 DOI: 10.1007/s12539-023-00569-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2022] [Revised: 04/26/2023] [Accepted: 04/26/2023] [Indexed: 05/31/2023]
Abstract
Precise forecasting of survival risk plays a pivotal role in comprehending and predicting the prognosis of patients afflicted with esophageal squamous cell carcinoma (ESCC). The existing methods have the problems of insufficient fitting ability and poor interpretability. To address this issue, this work proposes a novel interpretable survival risk prediction method for ESCC patients based on extreme gradient boosting improved by whale optimization algorithm (WOA-XGBoost) and shapley additive explanations (SHAP). Given the imbalanced nature of the data set, the adaptive synthetic sampling (ADASYN) is first used to generate the samples with high survival risk. Then, an improved clustering by fast search and find of density peaks (IDPC) algorithm based on cosine distance and K nearest neighbors is used to cluster the patients. Next, the prediction model for each cluster is obtained by WOA-XGBoost and the constructed model is visualized with SHAP to uncover the factors hidden in the structured model and improve the interpretability of the black-box model. Finally, the effectiveness of the proposed scheme is demonstrated by analyzing the data collected from the First Affiliated Hospital of Zhengzhou University. The results of the analysis reveal that the proposed methodology exhibits superior performance, as indicated by the area under the receiver operating characteristic curve (AUROC) of 0.918 and accuracy of 0.881.
Affiliation(s)
- Dan Ling
- Henan Key Lab of Information-Based Electrical Appliances, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Anhao Liu
- Henan Key Lab of Information-Based Electrical Appliances, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Junwei Sun
- Henan Key Lab of Information-Based Electrical Appliances, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Yanfeng Wang
- Henan Key Lab of Information-Based Electrical Appliances, Zhengzhou University of Light Industry, Zhengzhou, 450002, China
- Lidong Wang
- State Key Laboratory of Esophageal Cancer Prevention and Treatment and Henan Key Laboratory for Esophageal Cancer Research of The First Affiliated Hospital, Zhengzhou University, Zhengzhou, 450052, China
- Xin Song
- State Key Laboratory of Esophageal Cancer Prevention and Treatment and Henan Key Laboratory for Esophageal Cancer Research of The First Affiliated Hospital, Zhengzhou University, Zhengzhou, 450052, China
- Xueke Zhao
- State Key Laboratory of Esophageal Cancer Prevention and Treatment and Henan Key Laboratory for Esophageal Cancer Research of The First Affiliated Hospital, Zhengzhou University, Zhengzhou, 450052, China
13
Kumar S, Das A. Peripheral Blood Mononuclear Cell derived Biomarker detection using eXplainable Artificial Intelligence (XAI) provides better diagnosis of Breast Cancer. Comput Biol Chem 2023; 104:107867. [PMID: 37030103 DOI: 10.1016/j.compbiolchem.2023.107867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Revised: 03/22/2023] [Accepted: 04/01/2023] [Indexed: 04/05/2023]
Abstract
The incidence and mortality rate of breast cancer increases yearly by an average of 1.44% and 0.23%, respectively. Till 2021, there were 7.8 million women who had been diagnosed with breast cancer within 5 years. Biopsies of tumors are often expensive and invasive and raise the risk of serious complications like infection, excessive bleeding, and puncture damage to nearby tissues and organs. Early detection biomarkers are often variably expressed in different patients and may even be below the detection level at an early stage. Hence PBMC that shows alteration in gene profile as a result of interaction with tumor antigens may serve as a better early detection biomarker. Also, such alterations in immune gene profile in PBMCs are more prone to detection despite variability in different breast cancer mutants. This study aimed to identify potential diagnostic biomarkers for breast cancer using eXplainable Artificial Intelligence (XAI) on XGBoost machine learning (ML) models trained on a binary classification dataset containing the expression data of PBMCs from 252 breast cancer patients and 194 healthy women. After effectively adding SHAP values further into the XGBoost model, ten important genes related to breast cancer development were discovered to be effective potential biomarkers. Our studies showed that SVIP, BEND3, MDGA2, LEF1-AS1, PRM1, TEX14, MZB1, TMIGD2, KIT, and FKBP7 are key genes that impact model prediction. These genes may serve as early, non-invasive diagnostic and prognostic biomarkers for breast cancer patients.
14
Sanchez K, Kamal K, Manjaly P, Ly S, Mostaghimi A. Clinical Application of Artificial Intelligence for Non-melanoma Skin Cancer. Curr Treat Options Oncol 2023; 24:373-379. [PMID: 36917395 PMCID: PMC10011774 DOI: 10.1007/s11864-023-01065-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 01/26/2023] [Indexed: 03/15/2023]
Abstract
OPINION STATEMENT The development and implementation of artificial intelligence is beginning to impact the care of dermatology patients. Although the clinical application of AI in dermatology to date has largely focused on melanoma, the prevalence of non-melanoma skin cancers, including basal cell and squamous cell cancers, is a critical application for this technology. The need for a timely diagnosis and treatment of skin cancers makes finding more time efficient diagnostic methods a top priority, and AI may help improve dermatologists' performance and facilitate care in the absence of dermatology expertise. Beyond diagnosis, for more severe cases, AI may help in predicting therapeutic response and replacing or reinforcing input from multidisciplinary teams. AI may also help in designing novel therapeutics. Despite this potential, enthusiasm in AI must be tempered by realistic expectations regarding performance. AI can only perform as well as the information that is used to train it, and development and implementation of new guidelines to improve transparency around training and performance of algorithms is key for promoting confidence in new systems. Special emphasis should be placed on the role of dermatologists in curating high-quality datasets that reflect a range of skin tones, diagnoses, and clinical scenarios. For ultimate success, dermatologists must not be wary of AI as a potential replacement for their expertise, but as a new tool to complement their diagnostic acumen and extend patient care.
Affiliation(s)
- Katherine Sanchez
- Lake Erie College of Osteopathic Medicine, Erie, PA, USA; Department of Dermatology, Brigham and Women's Hospital, 221 Longwood Ave, Boston, MA, 02115, USA
- Kanika Kamal
- Department of Dermatology, Brigham and Women's Hospital, 221 Longwood Ave, Boston, MA, 02115, USA; Harvard Medical School, Boston, MA, 02115, USA
- Priya Manjaly
- Department of Dermatology, Brigham and Women's Hospital, 221 Longwood Ave, Boston, MA, 02115, USA; Boston University School of Medicine, Boston, USA
- Sophia Ly
- Department of Dermatology, Brigham and Women's Hospital, 221 Longwood Ave, Boston, MA, 02115, USA; College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, US, USA
- Arash Mostaghimi
- Department of Dermatology, Brigham and Women's Hospital, 221 Longwood Ave, Boston, MA, 02115, USA; Harvard Medical School, Boston, MA, 02115, USA
15
Nematzadeh H, García-Nieto J, Navas-Delgado I, Aldana-Montes JF. Ensemble-based genetic algorithm explainer with automized image segmentation: A case study on melanoma detection dataset. Comput Biol Med 2023; 155:106613. [PMID: 36764157 DOI: 10.1016/j.compbiomed.2023.106613] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 01/05/2023] [Accepted: 01/28/2023] [Indexed: 02/07/2023]
Abstract
Explainable Artificial Intelligence (XAI) makes AI understandable to the human user particularly when the model is complex and opaque. Local Interpretable Model-agnostic Explanations (LIME) has an image explainer package that is used to explain deep learning models. The image explainer of LIME needs some parameters to be manually tuned by the expert in advance, including the number of top features to be seen and the number of superpixels in the segmented input image. This parameter tuning is a time-consuming task. Hence, with the aim of developing an image explainer that automizes image segmentation, this paper proposes Ensemble-based Genetic Algorithm Explainer (EGAE) for melanoma cancer detection that automatically detects and presents the informative sections of the image to the user. EGAE has three phases. First, the sparsity of chromosomes in GAs is determined heuristically. Then, multiple GAs are executed consecutively. However, the difference between these GAs are in different number of superpixels in the input image that result in different chromosome lengths. Finally, the results of GAs are ensembled using consensus and majority votings. This paper also introduces how Euclidean distance can be used to calculate the distance between the actual explanation (delineated by experts) and the calculated explanation (computed by the explainer) for accuracy measurement. Experimental results on a melanoma dataset show that EGAE automatically detects informative lesions, and it also improves the accuracy of explanation in comparison with LIME efficiently. The python codes for EGAE, the ground truths delineated by clinicians, and the melanoma detection dataset are available at https://github.com/KhaosResearch/EGAE.
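Two steps in the abstract above lend themselves to a small worked example: combining several explanations by consensus or majority voting, and scoring an explanation by its Euclidean distance from an expert delineation. The sketch below is not the EGAE code (that is in the repository linked above). It assumes, purely for illustration, that every explanation is a flat binary mask over the same pixels and that "consensus" means unanimous agreement; all function names and masks are hypothetical.

```python
import math


def majority_vote(masks):
    """Keep a pixel if more than half of the explanations mark it."""
    n = len(masks)
    return [1 if sum(column) * 2 > n else 0 for column in zip(*masks)]


def consensus_vote(masks):
    """Keep a pixel only if every explanation marks it (assumed meaning)."""
    return [1 if all(column) else 0 for column in zip(*masks)]


def explanation_distance(expert_mask, computed_mask):
    """Euclidean distance between two equally sized binary masks."""
    if len(expert_mask) != len(computed_mask):
        raise ValueError("masks must have the same length")
    return math.sqrt(sum((e - c) ** 2 for e, c in zip(expert_mask, computed_mask)))


# Hypothetical outputs of three explainer runs over a four-pixel image.
runs = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
]
expert = [1, 1, 0, 0]  # hypothetical clinician delineation

majority = majority_vote(runs)
consensus = consensus_vote(runs)
d_majority = explanation_distance(expert, majority)
d_consensus = explanation_distance(expert, consensus)
```

In this toy case the majority mask reproduces the expert mask exactly, while the stricter consensus mask drops one pixel the expert marked, so a smaller distance indicates an explanation closer to the clinician's ground truth.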
Affiliation(s)
- Hossein Nematzadeh
- ITIS Software, Universidad de Málaga, Arquitecto Francisco Peñalosa 18, Malaga, 29071, Spain; Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Malaga, Spain
- José García-Nieto
- ITIS Software, Universidad de Málaga, Arquitecto Francisco Peñalosa 18, Malaga, 29071, Spain; Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Malaga, Spain; Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Malaga, Spain
- Ismael Navas-Delgado
- ITIS Software, Universidad de Málaga, Arquitecto Francisco Peñalosa 18, Malaga, 29071, Spain; Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Malaga, Spain; Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Malaga, Spain
- José F Aldana-Montes
- ITIS Software, Universidad de Málaga, Arquitecto Francisco Peñalosa 18, Malaga, 29071, Spain; Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Malaga, Spain; Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Malaga, Spain
16
Sekaran K, Alsamman AM, George Priya Doss C, Zayed H. Bioinformatics investigation on blood-based gene expressions of Alzheimer's disease revealed ORAI2 gene biomarker susceptibility: An explainable artificial intelligence-based approach. Metab Brain Dis 2023; 38:1297-1310. [PMID: 36809524 PMCID: PMC9942063 DOI: 10.1007/s11011-023-01171-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/07/2022] [Accepted: 01/16/2023] [Indexed: 02/23/2023]
Abstract
The progressive, chronic nature of Alzheimer's disease (AD), a form of dementia, defaces the adulthood of elderly individuals. The pathogenesis of the condition is primarily unascertained, turning the treatment efficacy more arduous. Therefore, understanding the genetic etiology of AD is essential to identifying targeted therapeutics. This study aimed to use machine-learning techniques of expressed genes in patients with AD to identify potential biomarkers that can be used for future therapy. The dataset is accessed from the Gene Expression Omnibus (GEO) database (Accession Number: GSE36980). The subgroups (AD blood samples from frontal, hippocampal, and temporal regions) are individually investigated against non-AD models. Prioritized gene cluster analyses are conducted with the STRING database. The candidate gene biomarkers were trained with various supervised machine-learning (ML) classification algorithms. The interpretation of the model prediction is performed with explainable artificial intelligence (AI) techniques. This experiment revealed 34, 60, and 28 genes as target biomarkers of AD mapped from the frontal, hippocampal, and temporal regions. ORAI2 was identified as a shared biomarker in all three areas strongly associated with AD's progression. The pathway analysis showed that STIM1 and TRPC3 are strongly associated with ORAI2. We found three hub genes, TPI1, STIM1, and TRPC3, in the network of the ORAI2 gene that might be involved in the molecular pathogenesis of AD. Naive Bayes classified the samples of different groups by fivefold cross-validation with 100% accuracy. AI and ML are promising tools in identifying disease-associated genes that will advance the field of targeted therapeutics against genetic diseases.
Affiliation(s)
- Karthik Sekaran
- Laboratory of Integrative Genomics, Department of Integrative Biology, School of BioSciences and Technology, Vellore Institute of Technology (VIT), Vellore, 632014, Tamil Nadu, India
- Alsamman M Alsamman
- Department of Genome Mapping, Molecular Genetics and Genome Mapping Laboratory, Agricultural Genetic Engineering Research Institute, Giza, Egypt
- C George Priya Doss
- Laboratory of Integrative Genomics, Department of Integrative Biology, School of BioSciences and Technology, Vellore Institute of Technology (VIT), Vellore, 632014, Tamil Nadu, India
- Hatem Zayed
- Department of Biomedical Sciences College of Health Sciences, QU Health, Qatar University, Doha, Qatar
17
Dwivedi K, Rajpal A, Rajpal S, Agarwal M, Kumar V, Kumar N. An explainable AI-driven biomarker discovery framework for Non-Small Cell Lung Cancer classification. Comput Biol Med 2023; 153:106544. [PMID: 36652866 DOI: 10.1016/j.compbiomed.2023.106544] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/08/2022] [Revised: 12/17/2022] [Accepted: 01/10/2023] [Indexed: 01/13/2023]
Abstract
Non-Small Cell Lung Cancer (NSCLC) exhibits intrinsic heterogeneity at the molecular level that aids in distinguishing between its two prominent subtypes - Lung Adenocarcinoma (LUAD) and Lung Squamous Cell Carcinoma (LUSC). This paper proposes a novel explainable AI (XAI)-based deep learning framework to discover a small set of NSCLC biomarkers. The proposed framework comprises three modules - an autoencoder to shrink the input feature space, a feed-forward neural network to classify NSCLC instances into LUAD and LUSC, and a biomarker discovery module that leverages the combined network comprising the autoencoder and the feed-forward neural network. In the biomarker discovery module, XAI methods uncovered a set of 52 relevant biomarkers for NSCLC subtype classification. To evaluate the classification performance of the discovered biomarkers, multiple machine-learning models are constructed using these biomarkers. Using 10-Fold cross-validation, Multilayer Perceptron achieved an accuracy of 95.74% (±1.27) at 95% confidence interval. Further, using Drug-Gene Interaction Database, we observe that 14 of the discovered biomarkers are druggable. In addition, 28 biomarkers aid the prediction of the survivability of the patients. Out of 52 discovered biomarkers, we find that 45 biomarkers have been reported in previous studies on distinguishing between the two NSCLC subtypes. To the best of our knowledge, the remaining seven biomarkers have not yet been reported for NSCLC subtyping and could be further explored for their contribution to targeted therapy of lung cancer.
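The abstract above summarizes 10-fold cross-validation as a mean accuracy with a margin at the 95% confidence level. The abstract does not say how that margin was derived, so the sketch below shows one common convention only: a Student-t interval around the mean of the per-fold accuracies. The fold accuracies are hypothetical and are not the study's results.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 9 degrees of freedom (10 folds).
T_CRIT_95_DF9 = 2.262


def mean_with_ci(fold_scores, t_crit=T_CRIT_95_DF9):
    """Return (mean, half_width) of a t-based confidence interval
    for the mean of per-fold scores."""
    mean = statistics.mean(fold_scores)
    standard_error = statistics.stdev(fold_scores) / math.sqrt(len(fold_scores))
    return mean, t_crit * standard_error


# Hypothetical accuracies from 10 folds.
fold_accuracies = [0.94, 0.96, 0.95, 0.97, 0.93, 0.96, 0.95, 0.98, 0.94, 0.96]
mean_accuracy, half_width = mean_with_ci(fold_accuracies)
summary = f"{100 * mean_accuracy:.2f}% (±{100 * half_width:.2f})"
```

Because folds share training data, their scores are not independent, and an interval built this way is best read as a rough indication of variability rather than an exact coverage guarantee.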
Affiliation(s)
- Kountay Dwivedi
- Department of Computer Science, University of Delhi, Delhi, India
- Ankit Rajpal
- Department of Computer Science, University of Delhi, Delhi, India
- Virendra Kumar
- Department of Nuclear Magnetic Resonance Imaging, All India Institute of Medical Sciences, New Delhi, India
- Naveen Kumar
- Department of Computer Science, University of Delhi, Delhi, India
18
Joseph LP, Joseph EA, Prasad R. Explainable diabetes classification using hybrid Bayesian-optimized TabNet architecture. Comput Biol Med 2022; 151:106178. [PMID: 36306578 DOI: 10.1016/j.compbiomed.2022.106178] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/13/2022] [Revised: 09/23/2022] [Accepted: 10/01/2022] [Indexed: 12/27/2022]
Abstract
Diabetes is a deadly chronic disease that occurs when the pancreas is not able to produce ample insulin or when the body cannot use insulin effectively. If undetected, it may lead to a host of health complications. Hence, accurate and explainable early-stage detection of diabetes is essential for the proper administration of treatment options in leading a healthy and productive life. For this, we developed an interpretable TabNet model tuned via Bayesian optimization (BO). To achieve model-specific interpretability, the attention mechanism of TabNet architecture was used, which offered the local and global model explanations on the influence of the attributes on the outcomes. The model was further explained locally and globally using more robust model-agnostic LIME and SHAP eXplainable Artificial Intelligence (XAI) tools. The proposed model outperformed all benchmarked models by obtaining high accuracy of 92.2% and 99.4% using the Pima Indians diabetes dataset (PIDD) and the early-stage diabetes risk prediction dataset (ESDRPD), respectively. Based on the XAI results, it was clear that the most influential attribute for diabetes classification using PIDD and ESDRPD were Insulin and Polyuria, respectively. The feature importance values registered for insulin was 0.301 (PIDD) and for polyuria 0.206 was registered (ESDRPD). The high accuracy and ancillary interpretability of our objective model is expected to increase end-users trust and confidence in early-stage detection of diabetes.
Affiliation(s)
- Lionel P Joseph
- School of Mathematics, Physics, and Computing, University of Southern Queensland, Springfield, QLD, 4300, Australia
- Erica A Joseph
- Umanand Prasad School of Medicine and Health Sciences, The University of Fiji, Saweni, Lautoka, Fiji
- Ramendra Prasad
- Department of Science, School of Science and Technology, The University of Fiji, Saweni, Lautoka, Fiji
19
Ladbury C, Zarinshenas R, Semwal H, Tam A, Vaidehi N, Rodin AS, Liu A, Glaser S, Salgia R, Amini A. Utilization of model-agnostic explainable artificial intelligence frameworks in oncology: a narrative review. Transl Cancer Res 2022; 11:3853-3868. [PMID: 36388027 PMCID: PMC9641128 DOI: 10.21037/tcr-22-1626] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/10/2022] [Accepted: 09/07/2022] [Indexed: 11/25/2022]
Abstract
Background and Objective: Machine learning (ML) models are increasingly being utilized in oncology research for use in the clinic. However, while more complicated models may provide improvements in predictive or prognostic power, a hurdle to their adoption is limited model interpretability, wherein the inner workings can be perceived as a "black box". Explainable artificial intelligence (XAI) frameworks including Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) are novel, model-agnostic approaches that aim to provide insight into the inner workings of the "black box" by producing quantitative visualizations of how model predictions are calculated. In doing so, XAI can transform complicated ML models into easily understandable charts and interpretable sets of rules, which can give providers an intuitive understanding of the knowledge generated, thus facilitating the deployment of such models in routine clinical workflows. Methods: We performed a comprehensive, non-systematic review of the latest literature to define use cases of model-agnostic XAI frameworks in oncologic research. The examined database was PubMed/MEDLINE. The last search was run on May 1, 2022. Key Content and Findings: In this review, we identified several fields in oncology research where ML models and XAI were utilized to improve interpretability, including prognostication, diagnosis, radiomics, pathology, treatment selection, radiation treatment workflows, and epidemiology. Within these fields, XAI facilitates determination of feature importance in the overall model, visualization of relationships and/or interactions, evaluation of how individual predictions are produced, feature selection, identification of prognostic and/or predictive thresholds, and overall confidence in the models, among other benefits.
These examples provide a basis for future work to expand on, which can facilitate adoption in the clinic when the complexity of such modeling would otherwise be prohibitive. Conclusions Model-agnostic XAI frameworks offer an intuitive and effective means of describing oncology ML models, with applications including prognostication and determination of optimal treatment regimens. Using such frameworks presents an opportunity to improve understanding of ML models, which is a critical step to their adoption in the clinic.
Affiliation(s)
- Colton Ladbury
- Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, USA
- Reza Zarinshenas
- Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, USA
- Hemal Semwal
- Departments of Bioengineering and Integrated Biology and Physiology, University of California Los Angeles, Los Angeles, CA, USA
- Andrew Tam
- Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, USA
- Nagarajan Vaidehi
- Department of Computational and Quantitative Medicine, City of Hope National Medical Center, Duarte, CA, USA
- Andrei S Rodin
- Department of Computational and Quantitative Medicine, City of Hope National Medical Center, Duarte, CA, USA
- An Liu
- Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, USA
- Scott Glaser
- Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, USA
- Ravi Salgia
- Department of Medical Oncology, City of Hope National Medical Center, Duarte, CA, USA
- Arya Amini
- Department of Radiation Oncology, City of Hope National Medical Center, Duarte, CA, USA
20
Development of an Artificial Neural Network for the Detection of Supporting Hindlimb Lameness: A Pilot Study in Working Dogs. Animals (Basel) 2022; 12:ani12141755. [PMID: 35883302 PMCID: PMC9311578 DOI: 10.3390/ani12141755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Revised: 07/04/2022] [Accepted: 07/05/2022] [Indexed: 11/17/2022]
Abstract
Subjective lameness assessment has been a controversial subject given the lack of agreement between observers; this has prompted the development of kinetic and kinematic devices in order to obtain an objective evaluation of locomotor system in dogs. After proper training, neural networks are potentially capable of making a non-human diagnosis of canine lameness. The purpose of this study was to investigate whether artificial neural networks could be used to determine canine hindlimb lameness by computational means only. The outcome of this study could potentially assess the efficacy of certain treatments against diseases that cause lameness. With this aim, input data were obtained from an inertial sensor positioned on the rump. Data from dogs with unilateral hindlimb lameness and sound dogs were used to obtain differences between both groups at walk. The artificial neural network, after necessary adjustments, was integrated into a web management tool, and the preliminary results discriminating between lame and sound dogs are promising. The analysis of spatial data with artificial neural networks was summarized and developed into a web app that has proven to be a useful tool to discriminate between sound and lame dogs. Additionally, this environment allows veterinary clinicians to adequately follow the treatment of lame canine patients.