1
Fang F, Sun Y. Prediction of systemic lupus erythematosus-related genes based on graph attention network and deep neural network. Comput Biol Med 2024; 175:108371. [PMID: 38691916 DOI: 10.1016/j.compbiomed.2024.108371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/13/2024] [Accepted: 03/24/2024] [Indexed: 05/03/2024]
Abstract
Systemic lupus erythematosus (SLE) is an autoimmune disorder intricately linked to genetic factors, and numerous approaches have identified genes linked to its development, diagnosis and prognosis. Although genome-wide association analysis and gene knockout experiments have confirmed some genes associated with SLE, numerous potential genes remain to be discovered. Searching for relevant genes through biological experiments requires significant financial and human resources. With the advancement of computational technologies such as deep learning, we aim to identify SLE-related genes through deep learning methods, thereby narrowing the scope for biological experimentation. This study introduces SLEDL, a deep learning-based approach that leverages a deep neural network (DNN) and graph neural networks to identify SLE-related genes by capturing relevant features in the gene interaction network. This approach transforms the identification of SLE-related genes into a binary classification problem, ultimately solved through a fully connected layer. The results demonstrate the superiority of SLEDL, which achieves higher AUC (0.7274) and AUPR (0.7599), further validated through case studies.
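The abstract evaluates a binary gene classifier with AUC and AUPR. As a rough illustration of what those two numbers measure, the sketch below computes ROC AUC from pairwise ranking and approximates AUPR with average precision. The labels and scores are made-up toy values, and the function names are hypothetical; nothing here reproduces SLEDL or its data.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # Count pairwise wins; ties count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a common step-wise estimate of AUPR."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    if hits == 0:
        raise ValueError("need at least one positive")
    return total / hits


# Toy example (assumed values): 1 = disease-related gene, 0 = unrelated.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(f"AUC={toy_auc:.4f}  AP={toy_ap:.4f}")
```

Average precision is only one of several ways to summarize a precision-recall curve, so it may differ slightly from an AUPR obtained by trapezoidal integration.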
Affiliation(s)
- Fang Fang
  - Department of Rheumatology and Immunology, The First Hospital of China Medical University, Shenyang, Liaoning, China
- Yizhou Sun
  - Department of Ophthalmology, The First Hospital of China Medical University, Shenyang, Liaoning, China
2
Zhang M, Wang J, Wang W, Yang G, Peng J. Predicting cell-type specific disease genes of diabetes with the biological network. Comput Biol Med 2024; 169:107849. [PMID: 38101116 DOI: 10.1016/j.compbiomed.2023.107849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 11/21/2023] [Accepted: 12/11/2023] [Indexed: 12/17/2023]
Abstract
Type 2 diabetes (T2D) is a chronic condition that can lead to significant harm, such as heart disease, kidney disease, nerve damage, and blindness. Although T2D-related genes have been identified through genome-wide association studies (GWAS) and various computational methods, the biological mechanism of T2D at the cell type level remains unclear. Exploring cell type-specific genes related to T2D is essential to understanding the cellular mechanisms underlying the disease. To address this issue, we introduce DiGCellNet (predicting Disease Genes with Cell type specificity based on biological Networks), a model that integrates a graph convolutional network (GCN) and multi-task learning (MTL) to predict T2D-associated cell type-specific genes based on the biological network. Our work represents the first attempt to predict cell type-specific disease genes using GCN and MTL. We evaluate our approach by predicting genes specific to four cell types and demonstrate that the proposed DiGCellNet outperforms other models that combine node embeddings with traditional machine learning algorithms. Moreover, DiGCellNet successfully identifies CALM1 as a gene specific to the beta cell type in T2D cases, and this association is confirmed using an independent dataset. The code is available at https://github.com/23AIBox/23AIBox-DiGCellNet.
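To make the combination of a GCN with multi-task learning more concrete, here is a minimal sketch assuming the standard symmetric-normalized GCN propagation rule and one sigmoid output head per cell type on a shared node embedding. The abstract does not specify DiGCellNet's architecture, so the toy graph, the untrained weights, and the task name `cell_type_2` are all hypothetical; the authors' repository is the reference for the actual model.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]


def multitask_heads(hidden, head_weights):
    """One sigmoid score per task (cell type) for every node (gene)."""
    return {
        task: [1.0 / (1.0 + math.exp(-sum(h * w for h, w in zip(row, wvec))))
               for row in hidden]
        for task, wvec in head_weights.items()
    }


# Toy 3-gene path graph with 2 input features per gene (assumed values).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[0.5, -0.2], [0.3, 0.8]]          # untrained, illustrative only
heads = {"beta": [1.0, -1.0], "cell_type_2": [0.5, 0.5]}

adj_norm = normalize_adjacency(adj)
embedding = gcn_layer(adj_norm, feats, weight)  # shared across all tasks
scores = multitask_heads(embedding, heads)
print(scores)
```

Sharing the embedding across heads is the usual reason to use multi-task learning here: genes with few labels for one cell type can still benefit from signal learned for the others.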
Affiliation(s)
- Menghan Zhang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710072, China; The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi'an, 710072, China
- Jingru Wang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710072, China; The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi'an, 710072, China
- Wei Wang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710072, China; The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi'an, 710072, China
- Guang Yang
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710072, China; The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi'an, 710072, China
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China; Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710072, China; The National Engineering Laboratory for Integrated Aerospace-Ground-Ocean Big Data Application Technology, Xi'an, 710072, China; School of Computer Science, Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen, 518000, China
3
Zhang Z, Qiu S, Wang Z, Hu Y. Vitamin D levels and five cardiovascular diseases: A Mendelian randomization study. Heliyon 2024; 10:e23674. [PMID: 38187309 PMCID: PMC10767153 DOI: 10.1016/j.heliyon.2023.e23674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Revised: 11/24/2023] [Accepted: 12/09/2023] [Indexed: 01/09/2024]
Abstract
Cardiovascular disease is the leading cause of death worldwide, and vitamin D levels have been found to be associated with cardiovascular disease. To investigate the causal relationship between vitamin D levels and five cardiovascular diseases, a genome-wide association study (GWAS) was carried out using data on vitamin D levels (sample size = 79366), angina pectoris (18168 cases and 187840 controls), coronary heart disease (21012 cases and 197780 controls), lacunar stroke (6030 cases and 248929 controls), heart attack (10693 cases and 451187 controls), and hypertension (55917 cases and 162837 controls), followed by a Mendelian randomization (MR) analysis. Six single nucleotide polymorphisms were used as instrumental variables (IVs). In addition, sensitivity analysis was performed to verify the reliability of the MR results. The results showed a causal relationship between vitamin D levels and angina pectoris (OR = 0.51, 95% CI: 0.28-0.93, P = 0.03), coronary heart disease (OR = 0.53, 95% CI: 0.34-0.81, P = 0.004), and lacunar stroke (OR = 0.41, 95% CI: 0.20-0.86, P = 0.02), but no causal relationship with heart attack (OR = 1.00, 95% CI: 0.99-1.01, P = 0.76) or hypertension (OR = 0.99, 95% CI: 0.73-1.34, P = 0.94). Additionally, the IV data showed no heterogeneity or pleiotropy, and the results of the MR analysis were reliable. This study contributes to the prevention and treatment of these five cardiovascular diseases.
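The reported odds ratios, confidence intervals, and P values can be cross-checked against each other. The sketch below assumes the intervals are Wald-type (symmetric on the log-odds scale, built with a critical value of 1.96) and that the P values come from a two-sided normal test; the abstract states neither, so this is only a plausibility check on the three significant results, not a reconstruction of the authors' analysis. The function name is hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z, and a two-sided P value from an OR and its 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


# (OR, CI low, CI high, reported P) as given in the abstract.
reported = {
    "angina pectoris": (0.51, 0.28, 0.93, 0.03),
    "coronary heart disease": (0.53, 0.34, 0.81, 0.004),
    "lacunar stroke": (0.41, 0.20, 0.86, 0.02),
}

recomputed = {}
for outcome, (or_value, low, high, p_reported) in reported.items():
    se, z, p_value = p_from_or_ci(or_value, low, high)
    recomputed[outcome] = p_value
    print(f"{outcome}: SE={se:.3f} z={z:.2f} "
          f"P={p_value:.4f} (reported {p_reported})")
```

Because the published estimates are rounded to two decimals, small discrepancies are expected; the heart attack result (OR = 1.00) cannot be checked this way at all, since rounding removes the effect size.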
Affiliation(s)
- Zhishuai Zhang
  - Key Laboratory of Tarim Animal Husbandry Science and Technology, Xinjiang Production & Construction Group, Tarim University, Alaer, China
- Shizheng Qiu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Zhaoqing Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Yang Hu
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
4
de Souza P, Silva D, de Andrade I, Dias J, Lima JP, Teichrieb V, Quintino JP, da Silva FQB, Santos ALM. A Study on the Influence of Sensors in Frequency and Time Domains on Context Recognition. Sensors (Basel, Switzerland) 2023; 23:5756. [PMID: 37420921 DOI: 10.3390/s23125756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 06/02/2023] [Accepted: 06/09/2023] [Indexed: 07/09/2023]
Abstract
Adaptive AI for context and activity recognition remains a relatively unexplored field because of the difficulty of collecting sufficient information to develop supervised models. Additionally, building a dataset for human context activities "in the wild" demands time and human resources, which explains the lack of publicly available datasets. Some of the available datasets for activity recognition were collected using wearable sensors, since they are less invasive than images and precisely capture a user's movements in time series. However, frequency series contain more information about sensors' signals. In this paper, we investigate the use of feature engineering to improve the performance of a Deep Learning model. Thus, we propose using Fast Fourier Transform algorithms to extract features from frequency series instead of time series. We evaluated our approach on the ExtraSensory and WISDM datasets. The results show that using Fast Fourier Transform algorithms to extract features performed better than using statistical measures to extract features from time series. Additionally, we examined the impact of individual sensors on identifying specific labels and showed that incorporating more sensors enhances the model's effectiveness. On the ExtraSensory dataset, frequency features outperformed time-domain features by 8.9 p.p., 0.2 p.p., 39.5 p.p., and 0.4 p.p. in the Standing, Sitting, Lying Down, and Walking activities, respectively, and on the WISDM dataset, model performance improved by 1.7 p.p. through feature engineering alone.
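The contrast between time-domain statistics and FFT-based features can be illustrated on a single sensor window. The sketch below uses a small recursive radix-2 FFT and a synthetic sine wave; the specific features (mean, standard deviation, dominant frequency, peak magnitude), the 64 Hz sampling rate, and the function names are assumptions chosen for illustration, not the feature set used in the paper.

```python
import cmath
import math
import statistics


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def time_features(window):
    """Simple statistical features computed directly on the time series."""
    return {"mean": statistics.fmean(window),
            "std": statistics.pstdev(window)}


def frequency_features(window, sample_rate_hz):
    """Features from the magnitude spectrum, ignoring the DC bin."""
    n = len(window)
    magnitudes = [abs(c) for c in fft(window)[: n // 2]]
    peak_bin = max(range(1, n // 2), key=lambda k: magnitudes[k])
    return {"dominant_hz": peak_bin * sample_rate_hz / n,
            "peak_magnitude": magnitudes[peak_bin]}


# Synthetic accelerometer-like window: an 8 Hz sine sampled at 64 Hz.
sample_rate_hz = 64
window = [math.sin(2 * math.pi * 8 * t / sample_rate_hz) for t in range(64)]
t_feats = time_features(window)
f_feats = frequency_features(window, sample_rate_hz)
print(t_feats, f_feats)
```

The periodicity of the signal is invisible in the mean and standard deviation but appears directly as the dominant frequency, which is the intuition behind preferring frequency features for repetitive activities.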
Affiliation(s)
- Pedro de Souza
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
- Diógenes Silva
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
- Isabella de Andrade
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
- Júlia Dias
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
- João Paulo Lima
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
  - Visual Computing Lab, Departamento de Computação, Universidade Federal Rural de Pernambuco, Recife 52171-900, PE, Brazil
- Veronica Teichrieb
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
- Jonysberg P Quintino
  - Projeto CIn-UFPE Samsung, Centro de Informática, Av. Jorn. Anibal Fernandes, s/n, Recife 50740-560, PE, Brazil
- Fabio Q B da Silva
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
- Andre L M Santos
  - Centro de Informática, Universidade Federal de Pernambuco, Recife 50740-560, PE, Brazil
5
Das S, Sultana M, Bhattacharya S, Sengupta D, De D. XAI-reduct: accuracy preservation despite dimensionality reduction for heart disease classification using explainable AI. The Journal of Supercomputing 2023; 79:1-31. [PMID: 37359323 PMCID: PMC10177719 DOI: 10.1007/s11227-023-05356-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/26/2023] [Indexed: 06/28/2023]
Abstract
Machine learning (ML) has been used for classification of heart diseases for almost a decade, although understanding the internal workings of black boxes, i.e., non-interpretable models, remains a demanding problem. Another major challenge in such ML models is the curse of dimensionality, which leads to resource-intensive classification when the comprehensive feature vector (CFV) is used. This study focuses on dimensionality reduction using explainable artificial intelligence, without compromising on accuracy for heart disease classification. Four explainable ML models, using SHAP, were used for classification, reflecting the feature contributions (FC) and feature weights (FW) of each feature in the CFV in generating the final results. FC and FW were taken into account in generating the reduced-dimensional feature subset (FS). The findings of the study are as follows: (a) XGBoost classifies heart diseases best with explanations, with an increase of 2% in model accuracy over the best existing proposals, (b) explainable classification using FS exhibits better accuracy than most proposals in the literature, (c) with the increase in explainability, accuracy can be preserved using the XGBoost classifier for classifying heart diseases, and (d) the top four features responsible for the diagnosis of heart disease are presented, which occur in all the explanations produced by the five explainable techniques applied to the XGBoost classifier based on feature contributions. To the best of our knowledge, this is the first attempt to explain XGBoost classification for the diagnosis of heart diseases using five explainable techniques.
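One simple way to turn per-sample feature contributions into a reduced feature subset is to rank features by mean absolute contribution and keep the top k. The sketch below assumes the contributions (for instance SHAP values) have already been computed elsewhere; the matrix values, the generic feature names `f0` to `f4`, and both function names are hypothetical, and the paper's actual selection rule, which also uses feature weights, may differ.

```python
def rank_features(contributions, feature_names):
    """Rank features by mean absolute contribution across samples."""
    n_samples = len(contributions)
    mean_abs = {
        name: sum(abs(row[j]) for row in contributions) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)


def reduced_subset(contributions, feature_names, k):
    """Keep the names of the k highest-ranked features."""
    ranking = rank_features(contributions, feature_names)
    return [name for name, _ in ranking[:k]]


# Assumed toy contributions: one row per patient, one column per feature.
feature_names = ["f0", "f1", "f2", "f3", "f4"]
contributions = [
    [0.10, -0.40, 0.02, 0.30, -0.05],
    [-0.20, 0.50, 0.01, -0.25, 0.05],
    [0.15, -0.45, -0.03, 0.35, 0.00],
]

top_four = reduced_subset(contributions, feature_names, k=4)
print(top_four)
```

Using the absolute value matters because a feature that pushes predictions strongly in both directions would otherwise average out to a misleadingly small score.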
Affiliation(s)
- Surajit Das
  - Department of Information Technology, Meghnad Saha Institute of Technology, Kolkata, 700150 India
  - Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, West Bengal, Nadia, 741249 West Bengal India
- Mahamuda Sultana
  - Department of Computer Science and Engineering, Guru Nanak Institute of Technology, Kolkata, 700114 India
- Suman Bhattacharya
  - Department of Computer Science and Engineering, Guru Nanak Institute of Technology, Kolkata, 700114 India
- Diganta Sengupta
  - Department of Computer Science and Engineering, Meghnad Saha Institute of Technology, Kolkata, 700150 India
- Debashis De
  - Department of Computer Science and Engineering, Maulana Abul Kalam Azad University of Technology, West Bengal, Nadia, 741249 West Bengal India