1
Kartheeswaran KP, Rayan AXA, Varrieth GT. Genetically and semantically aware homogeneous network for prediction and scoring of comorbidities. Comput Biol Med 2024; 183:109252. [PMID: 39418770 DOI: 10.1016/j.compbiomed.2024.109252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 06/29/2024] [Accepted: 10/04/2024] [Indexed: 10/19/2024]
Abstract
OBJECTIVE Patients with comorbidities are at higher mortality risk than those suffering from a single disease. Therefore, quantification and prediction of disease comorbidities are necessary to stratify the mortality risk of patients, predict the probability of comorbidity occurrence, design treatment strategies, and prevent the progression of diseases. Enriching comorbidity disease relationships with rich semantics established by genetic components plays a vital role in effectively quantifying and predicting comorbidities. However, existing studies have not extensively explored the semantic richness conveyed by different types of genetic links connecting the comorbidity pairs. METHODS To address this, a novel genetic-semantic aware weighted homogeneous network-based method, GSWHomoNet, is proposed. It first constructs the gene-enriched comorbidity heterogeneous network, CoGHetNet, with encoded genetic semantic aware weighted meta-path instance disease pair embedding to obtain an enhanced disease node embedding of the network. For enhanced comorbidity prediction and scoring, both direct and indirect semantically enriched comorbidity relationships of the disease nodes are preserved while transforming the heterogeneous network into the homogeneous comorbidity network GSWHomoNet. The proposed GSWHomoNet not only helps discover comorbidity links transductively between known-known disease pairs but also improves the inductive link prediction between known-unknown disease pairs by supplying unknown disease nodes with semantically enriched heterogeneous structural knowledge. RESULTS The effectiveness of the proposed components is demonstrated by AUC scores of 0.895 and 0.860, and AUPR scores of 0.903 and 0.873, for transductive and inductive link prediction, respectively. In comorbidity scoring, GSWHomoNet outperformed other methods with a correlation of 0.848.
The effect of the improved association prediction ability of the genetic semantic aware weighted meta-path instance embedding based node embedding is proved on disease-microbe and bibliographic heterogeneous network datasets. For biological significance of GSWHomoNet-based comorbidity scoring, we compared it with gene, pathway, and protein-protein interaction (PPI) perspectives, revealing a stronger correlation with the PPI aspect. We identified a substantial number of predicted comorbidity disease pairs, with 77,456 and 48,972 pairs supported by literature evidence for transductive and inductive predictions, respectively. Additionally, we highlighted shared pathways and PPIs for these pairs, demonstrating the robustness of comorbidity predictions.
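The AUC figures reported in this abstract summarize how well predicted link scores rank true comorbidity pairs above non-pairs. As a minimal sketch of that metric only (not the authors' evaluation code), the snippet below computes ROC AUC from its pairwise-ranking definition. The labels and scores are invented toy values, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration): 1 = known comorbidity link, 0 = no link.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The quadratic pairwise loop is chosen for readability; for large edge sets a rank-based computation or an established library routine would be the usual choice.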
Affiliation(s)
- Arockia Xavier Annie Rayan
  - Department of Computer Science and Engineering, CEG Campus, Anna University, Chennai, Tamil Nadu, India.
2
Roversi C, Tavazzi E, Vettoretti M, Di Camillo B. A dynamic probabilistic model of the onset and interaction of cardio-metabolic comorbidities on an ageing adult population. Sci Rep 2024; 14:11514. [PMID: 38769364 PMCID: PMC11106085 DOI: 10.1038/s41598-024-61135-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Accepted: 05/02/2024] [Indexed: 05/22/2024] Open
Abstract
Comorbidity is widespread in the ageing population, implying multiple and complex medical needs for individuals and a public health burden. Determining risk factors and predicting comorbidity development can help identify at-risk subjects and design prevention strategies. Using socio-demographic and clinical data from approximately 11,000 subjects monitored over 11 years in the English Longitudinal Study of Ageing, we develop a dynamic Bayesian network (DBN) to model the onset and interaction of three cardio-metabolic comorbidities, namely type 2 diabetes (T2D), hypertension, and heart problems. The DBN allows us to identify risk factors for developing each morbidity, simulate ageing progression over time, and stratify the population based on the risk of outcome occurrence. By applying hierarchical agglomerative clustering to the simulated, dynamic risk of experiencing morbidities, we identified patients with similar risk patterns and the variables contributing to their discrimination. The network reveals a direct joint effect of biomarkers and lifestyle on outcomes over time, such as the impact of fasting glucose, HbA1c, and BMI on T2D development. Mediated cross-relationships between comorbidities also emerge, showcasing the interconnected nature of these health issues. The model presents good calibration and discrimination ability, particularly in predicting the onset of T2D (iAUC-ROC = 0.828, iAUC-PR = 0.294) and survival (iAUC-ROC = 0.827, iAUC-PR = 0.311). Stratification analysis unveils two distinct clusters for all comorbidities, effectively discriminated by variables like HbA1c for T2D and age at baseline for heart problems. The developed DBN constitutes an effective, highly-explainable predictive risk tool for simulating and stratifying the dynamic risk of developing cardio-metabolic comorbidities. Its use could help identify the effects of risk factors and develop health policies that prevent the occurrence of comorbidities.
Affiliation(s)
- Chiara Roversi
  - Department of Information Engineering, University of Padua, Via Giovanni Gradenigo, 6/b, 35131, Padua, Italy
- Erica Tavazzi
  - Department of Information Engineering, University of Padua, Via Giovanni Gradenigo, 6/b, 35131, Padua, Italy
- Martina Vettoretti
  - Department of Information Engineering, University of Padua, Via Giovanni Gradenigo, 6/b, 35131, Padua, Italy
- Barbara Di Camillo
  - Department of Information Engineering, University of Padua, Via Giovanni Gradenigo, 6/b, 35131, Padua, Italy.
  - Department of Comparative Biomedicine and Food Science, University of Padua, Agripolis, Viale dell'Università, 16, 35020, Legnaro (PD), Italy.
3
Zhang W, Liu L, Xiao X, Zhou H, Peng Z, Wang W, Huang L, Xie Y, Xu H, Tao L, Nie W, Yuan X, Liu F, Yuan Q. Identification of common molecular signatures of SARS-CoV-2 infection and its influence on acute kidney injury and chronic kidney disease. Front Immunol 2023; 14:961642. [PMID: 37026010 PMCID: PMC10070855 DOI: 10.3389/fimmu.2023.961642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Accepted: 03/07/2023] [Indexed: 04/08/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the main cause of COVID-19, causing hundreds of millions of confirmed cases and more than 18.2 million deaths worldwide. Acute kidney injury (AKI) is a common complication of COVID-19 that leads to an increase in mortality, especially in intensive care unit (ICU) settings, and chronic kidney disease (CKD) is a high risk factor for COVID-19 and its related mortality. However, the underlying molecular mechanisms among AKI, CKD, and COVID-19 are unclear. Therefore, transcriptome analysis was performed to examine common pathways and molecular biomarkers for AKI, CKD, and COVID-19 in an attempt to understand the association of SARS-CoV-2 infection with AKI and CKD. Three RNA-seq datasets (GSE147507, GSE1563, and GSE66494) from the GEO database were used to detect differentially expressed genes (DEGs) for COVID-19 with AKI and CKD to search for shared pathways and candidate targets. A total of 17 common DEGs were confirmed, and their biological functions and signaling pathways were characterized by enrichment analysis. MAPK signaling, the structural pathway of interleukin 1 (IL-1), and the Toll-like receptor pathway appear to be involved in the occurrence of these diseases. Hub genes identified from the protein-protein interaction (PPI) network, including DUSP6, BHLHE40, RASGRP1, and TAB2, are potential therapeutic targets in COVID-19 with AKI and CKD. Common genes and pathways may play pathogenic roles in these three diseases mainly through the activation of immune inflammation. Networks of transcription factor (TF)-gene, miRNA-gene, and gene-disease interactions from the datasets were also constructed, and key gene regulators influencing the progression of these three diseases were further identified among the DEGs. Moreover, new drug targets were predicted based on these common DEGs, and molecular docking and molecular dynamics (MD) simulations were performed. 
Finally, a diagnostic model of COVID-19 was established based on these common DEGs. Taken together, the molecular and signaling pathways identified in this study may be related to the mechanisms by which SARS-CoV-2 infection affects renal function. These findings are significant for the effective treatment of COVID-19 in patients with kidney diseases.
Affiliation(s)
- Weiwei Zhang
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
- Leping Liu
  - Department of Pediatrics, The Third Xiangya Hospital of Central South University, Changsha, China
- Xiangcheng Xiao
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
- Hongshan Zhou
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
- Zhangzhe Peng
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Wei Wang
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Ling Huang
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Yanyun Xie
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Hui Xu
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Lijian Tao
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Wannian Nie
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
- Xiangning Yuan
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
- Fang Liu
  - Health Management Center, Xiangya Hospital of Central South University, Changsha, China
  - *Correspondence: Fang Liu; Qiongjing Yuan
- Qiongjing Yuan
  - Department of Nephrology, Xiangya Hospital of Central South University, Changsha, China
  - Organ Fibrosis Key Lab of Hunan Province, Central South University, Changsha, China
  - National Clinical Medical Research Center for Geriatric Diseases, Xiangya Hospital of Central South University, Changsha, China
  - Research Center for Medical Metabolomics, Xiangya Hospital of Central South University, Changsha, China
  - *Correspondence: Fang Liu; Qiongjing Yuan
4
Li C, Zhang Y, Xiao Y, Luo Y. Identifying the Effect of COVID-19 Infection in Multiple Myeloma and Diffuse Large B-Cell Lymphoma Patients Using Bioinformatics and System Biology. Comput Math Methods Med 2022; 2022:7017317. [PMID: 36466549 PMCID: PMC9711963 DOI: 10.1155/2022/7017317] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/09/2022] [Revised: 11/05/2022] [Accepted: 11/12/2022] [Indexed: 09/29/2023]
Abstract
The severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2), also referred to as COVID-19, has spread to several countries and caused a serious threat to human health worldwide. Patients with confirmed COVID-19 infection spread the disease rapidly throughout the region. Multiple myeloma (MM) and diffuse large B-cell lymphoma (DLBCL) are risk factors for COVID-19, although the molecular mechanisms underlying the relationship among MM, DLBCL, and COVID-19 have not been elucidated so far. In this context, transcriptome analysis was performed in the present study to identify the shared pathways and molecular indicators of MM, DLBCL, and COVID-19, which benefited the overall understanding of the effect of COVID-19 in patients with MM and DLBCL. Three datasets (GSE16558, GSE56315, and GSE152418) were downloaded from the Gene Expression Omnibus (GEO) and searched for the shared differentially expressed genes (DEGs) in patients with MM and DLBCL who were infected with SARS-CoV-2. The objective was to detect similar pathways and prospective medicines. A total of 29 DEGs that were common across these three datasets were selected. A protein-protein interaction (PPI) network was constructed using data from the STRING database followed by the identification of hub genes. In addition, the association of MM and DLBCL with COVID-19 infection was analyzed through functional analysis using ontologies terms and pathway analysis. Three relationships were observed in the evaluated datasets: transcription factor-gene interactions, protein-drug interactions, and an integrated regulatory network of DEGs and miRNAs with mutual DEGs. The findings of the present study revealed potential pharmaceuticals that could be beneficial in the treatment of COVID-19.
Affiliation(s)
- Chengcheng Li
  - Department of Hematology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Institute of Life Science, Chongqing Medical University, Chongqing, China
- Ying Zhang
  - Department of Hematology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Yingying Xiao
  - Department of Hematology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Institute of Life Science, Chongqing Medical University, Chongqing, China
- Yun Luo
  - Department of Hematology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
5
Yang K, Liu J, Gong Y, Li Y, Liu Q. Bioinformatics and systems biology approaches to identify molecular targeting mechanism influenced by COVID-19 on heart failure. Front Immunol 2022; 13:1052850. [DOI: 10.3389/fimmu.2022.1052850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2022] [Accepted: 10/25/2022] [Indexed: 11/09/2022] Open
Abstract
Coronavirus disease 2019 (COVID-19) caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has emerged as a contemporary hazard to people. It has been known that COVID-19 can both induce heart failure (HF) and raise the risk of patient mortality. However, the mechanism underlying the association between COVID-19 and HF remains unclear. The common molecular pathways between COVID-19 and HF were identified using bioinformatic and systems biology techniques. Transcriptome analysis was performed to identify differentially expressed genes (DEGs). To identify gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, common DEGs were used for enrichment analysis. The results showed that COVID-19 and HF have several common immune mechanisms, including differentiation of T helper (Th) 1, Th 2, Th 17 cells; activation of lymphocytes; and binding of major histocompatibility complex class I and II protein complexes. Furthermore, a protein-protein interaction network was constructed to identify hub genes, and immune cell infiltration analysis was performed. Six hub genes (FCGR3A, CD69, IFNG, CCR7, CCL5, and CCL4) were closely associated with COVID-19 and HF. These targets were associated with immune cells (central memory CD8 T cells, T follicular helper cells, regulatory T cells, myeloid-derived suppressor cells, plasmacytoid dendritic cells, macrophages, eosinophils, and neutrophils). Additionally, transcription factors, microRNAs, drugs, and chemicals that are closely associated with COVID-19 and HF were identified through the interaction network.
6
Han X, Wang F, Yang P, Di B, Xu X, Zhang C, Yao M, Sun Y, Lin Y. A Bioinformatic Approach Based on Systems Biology to Determine the Effects of SARS-CoV-2 Infection in Patients with Hypertrophic Cardiomyopathy. Comput Math Methods Med 2022; 2022:5337380. [PMID: 36203534 PMCID: PMC9532139 DOI: 10.1155/2022/5337380] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/10/2022] [Revised: 08/26/2022] [Accepted: 09/01/2022] [Indexed: 11/18/2022]
Abstract
Recently, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), has infected millions of individuals worldwide. While COVID-19 generally affects the lungs, it also damages other organs, including those of the cardiovascular system. Hypertrophic cardiomyopathy (HCM) is a common genetic cardiovascular disorder. Studies have shown that HCM patients with COVID-19 have a higher mortality rate; however, the reason for this phenomenon is not yet elucidated. Herein, we conducted transcriptomic analyses to identify shared biomarkers between HCM and COVID-19 to bridge this knowledge gap. Differentially expressed genes (DEGs) were obtained using the Gene Expression Omnibus ribonucleic acid (RNA) sequencing datasets, GSE147507 and GSE89714, to identify shared pathways and potential drug candidates. We discovered 30 DEGs that were common between these two datasets. Using a combination of statistical and biological tools, protein-protein interactions were constructed in response to these findings to support hub genes and modules. We discovered that HCM is linked to COVID-19 progression based on a functional analysis under ontology terms. Based on the DEGs identified from the datasets, a coregulatory network of transcription factors, genes, proteins, and microRNAs was also discovered. Lastly, our research suggests that the potential drugs we identified might be helpful for COVID-19 therapy.
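Several abstracts in this list, including the one above, derive "common DEGs" by intersecting the differentially expressed gene sets of two datasets. A minimal sketch of that step follows, assuming each dataset has already been reduced to a table of log2 fold changes and adjusted p-values. The gene names, values, cutoffs, and function names are all hypothetical and are not taken from GSE147507, GSE89714, or the authors' pipeline.

```python
def select_degs(table, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return the genes whose |log2FC| and adjusted p-value pass the cutoffs.

    The default cutoffs are assumed for illustration, not the authors' settings.
    """
    return {
        gene
        for gene, (log2fc, padj) in table.items()
        if abs(log2fc) >= lfc_cutoff and padj < padj_cutoff
    }


def common_degs(table_a, table_b, **cutoffs):
    """Intersect the DEG sets of two datasets and return a sorted list."""
    return sorted(select_degs(table_a, **cutoffs) & select_degs(table_b, **cutoffs))


# Invented toy tables: gene -> (log2 fold change, adjusted p-value).
disease_a = {"GENE1": (2.1, 0.001), "GENE2": (-1.4, 0.02),
             "GENE3": (0.3, 0.001), "GENE4": (1.8, 0.2)}
disease_b = {"GENE1": (1.2, 0.01), "GENE2": (-2.0, 0.03),
             "GENE3": (1.5, 0.01), "GENE5": (3.0, 0.001)}

shared = common_degs(disease_a, disease_b)
print(shared)
```

This sketch ignores the direction of regulation; a stricter variant could require that a shared gene be up-regulated or down-regulated in both datasets.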
Affiliation(s)
- Xiao Han
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Fei Wang
  - Department of Emergency Medicine, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Ping Yang
  - Department of Pharmacy, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Bin Di
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Xiangdong Xu
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Chunya Zhang
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Man Yao
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Yaping Sun
  - Department of Cardiology, Jiading District Central Hospital Affiliated Shanghai University of Medicine & Health Sciences, Shanghai, China
- Yangyi Lin
  - Department of Pulmonary Vascular Disease, State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
7
Network-Based Data Analysis Reveals Ion Channel-Related Gene Features in COVID-19: A Bioinformatic Approach. Biochem Genet 2022; 61:471-505. [PMID: 36104591 PMCID: PMC9473477 DOI: 10.1007/s10528-022-10280-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2022] [Accepted: 09/01/2022] [Indexed: 11/02/2022]
Abstract
Coronavirus disease 2019 (COVID-19) seriously threatens human health and has been disseminated worldwide. Although there are several treatments for COVID-19, its control is currently suboptimal. Therefore, the development of novel strategies to treat COVID-19 is necessary. Ion channels are located on the membranes of all excitable cells and many intracellular organelles and are key components involved in various biological processes. They are a target of interest when searching for drug targets. This study aimed to reveal the relevant molecular features of ion channel genes in COVID-19 based on bioinformatic analyses. The RNA-sequencing data of patients with COVID-19 and healthy subjects (GSE152418 and GSE171110 datasets) were obtained from the Gene Expression Omnibus (GEO) database. Ion channel genes were selected from the Hugo Gene Nomenclature Committee (HGNC) database. The RStudio software was used to process the data based on the corresponding R language package to identify ion channel-associated differentially expressed genes (DEGs). Based on the DEGs, Gene Ontology (GO) functional and pathway enrichment analyses were performed using the Enrichr web tool. The STRING database was used to generate a protein-protein interaction (PPI) network, and the Cytoscape software was used to screen for hub genes in the PPI network based on the cytoHubba plug-in. Transcription factors (TF)-DEG, DEG-microRNA (miRNA) and DEG-disease association networks were constructed using the NetworkAnalyst web tool. Finally, the screened hub genes as drug targets were subjected to enrichment analysis based on the DSigDB using the Enrichr web tool to identify potential therapeutic agents for COVID-19. A total of 29 ion channel-associated DEGs were identified. GO functional analysis showed that the DEGs were integral components of the plasma membrane and were mainly involved in inorganic cation transmembrane transport and ion channel activity functions. 
Pathway analysis showed that the DEGs were mainly involved in nicotine addiction, calcium regulation in the cardiac cell and neuronal system pathways. The top 10 hub genes screened based on the PPI network included KCNA2, KCNJ4, CACNA1A, CACNA1E, NALCN, KCNA5, CACNA2D1, TRPC1, TRPM3 and KCNN3. The TF-DEG and DEG-miRNA networks revealed significant TFs (FOXC1, GATA2, HINFP, USF2, JUN and NFKB1) and miRNAs (hsa-mir-146a-5p, hsa-mir-27a-3p, hsa-mir-335-5p, hsa-let-7b-5p and hsa-mir-129-2-3p). Gene-disease association network analysis revealed that the DEGs were closely associated with intellectual disability and cerebellar ataxia. Drug-target enrichment analysis showed that the relevant drugs targeting the hub genes CACNA2D1, CACNA1A, CACNA1E, KCNA2 and KCNA5 were gabapentin, gabapentin enacarbil, pregabalin, guanidine hydrochloride and 4-aminopyridine. The results of this study provide a valuable basis for exploring the mechanisms of ion channel genes in COVID-19 and clues for developing therapeutic strategies for COVID-19.
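The abstract above screens hub genes from a PPI network with the cytoHubba plug-in but does not state which ranking method was used. As an illustrative sketch only, the snippet below ranks nodes by degree (number of distinct interaction partners), which is one common way to define a hub. The edge list uses placeholder node names rather than real interactions, and `degree_ranking` is a hypothetical helper.

```python
from collections import defaultdict


def degree_ranking(edges):
    """Rank nodes of an undirected network by degree, highest first.

    Duplicate edges and self-loops are ignored; ties are broken alphabetically.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a self-interaction does not add a partner
        neighbors[a].add(b)
        neighbors[b].add(a)
    return sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))


# Placeholder network (assumed for illustration, not STRING data).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("D", "E"), ("B", "A")]
ranking = degree_ranking(toy_edges)
hubs = ranking[:2]  # keep the top-k nodes as candidate hubs
print(hubs)
```

Degree is the simplest of several centrality measures; a different measure can produce a different hub list from the same network.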
8
Islam MK, Islam MR, Rahman MH, Islam MZ, Amin MA, Ahmed KR, Rahman MA, Moni MA, Kim B. Bioinformatics Strategies to Identify Shared Molecular Biomarkers That Link Ischemic Stroke and Moyamoya Disease with Glioblastoma. Pharmaceutics 2022; 14:1573. [PMID: 36015199 PMCID: PMC9413912 DOI: 10.3390/pharmaceutics14081573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Revised: 07/17/2022] [Accepted: 07/19/2022] [Indexed: 12/01/2022] Open
Abstract
Expanding data suggest that glioblastoma is accountable for the growing prevalence of various forms of stroke formation, such as ischemic stroke and moyamoya disease. However, the underlying deterministic details are still unspecified. Bioinformatics approaches are designed to investigate the relationships between two pathogens as well as fill this study void. Glioblastoma is a form of cancer that typically occurs in the brain or spinal cord and is highly destructive. A stroke occurs when a brain region starts to lose blood circulation and prevents functioning. Moyamoya disorder is a recurrent and recurring arterial disorder of the brain. To begin, adequate gene expression datasets on glioblastoma, ischemic stroke, and moyamoya disease were gathered from various repositories. Then, the association between glioblastoma, ischemic stroke, and moyamoya was established using the existing pipelines. The framework was developed as a generalized workflow to allow for the aggregation of transcriptomic gene expression across specific tissue; Gene Ontology (GO) and biological pathway, as well as the validation of such data, are carried out using enrichment studies such as protein-protein interaction and gold benchmark databases. The results contribute to a more profound knowledge of the disease mechanisms and unveil the projected correlations among the diseases.
Affiliation(s)
- Md Khairul Islam
  - Department of Information & Communication Technology, Islamic University, Kushtia 7003, Bangladesh
- Md Rakibul Islam
  - Department of Information & Communication Technology, Islamic University, Kushtia 7003, Bangladesh
- Md Habibur Rahman
  - Department of Computer Science & Engineering, Islamic University, Kushtia 7003, Bangladesh
- Md Zahidul Islam
  - Department of Information & Communication Technology, Islamic University, Kushtia 7003, Bangladesh
- Md Al Amin
  - Department of Computer Science & Engineering, Prime University, Dhaka 1216, Bangladesh
- Kazi Rejvee Ahmed
  - Department of Pathology, College of Korean Medicine, Kyung Hee University, Hoegidong Dongdaemungu, Seoul 02447, Korea
- Md Ataur Rahman
  - Department of Pathology, College of Korean Medicine, Kyung Hee University, Hoegidong Dongdaemungu, Seoul 02447, Korea
  - Korean Medicine-Based Drug Repositioning Cancer Research Center, College of Korean Medicine, Kyung Hee University, Seoul 02447, Korea
- Mohammad Ali Moni
  - School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia
- Bonglee Kim
  - Department of Pathology, College of Korean Medicine, Kyung Hee University, Hoegidong Dongdaemungu, Seoul 02447, Korea
  - Korean Medicine-Based Drug Repositioning Cancer Research Center, College of Korean Medicine, Kyung Hee University, Seoul 02447, Korea
9
Hu H, Tang N, Zhang F, Li L, Li L. Bioinformatics and System Biology Approach to Identify the Influences of COVID-19 on Rheumatoid Arthritis. Front Immunol 2022; 13:860676. [PMID: 35464423 PMCID: PMC9021444 DOI: 10.3389/fimmu.2022.860676] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/23/2022] [Accepted: 03/16/2022] [Indexed: 02/05/2023] Open
Abstract
Background Severe coronavirus disease 2019 (COVID-19) has led to a rapid increase in mortality worldwide. Rheumatoid arthritis (RA) is a high-risk factor for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, but the molecular mechanisms underlying RA and COVID-19 are not well understood. The objectives of this study were to analyze potential molecular mechanisms and identify potential drugs for the treatment of COVID-19 and RA using bioinformatics and a systems biology approach. Methods Two sets of differentially expressed genes (DEGs), extracted from the GSE171110 and GSE1775544 datasets, were intersected to generate common DEGs, which were used for functional enrichment, pathway analysis, and candidate drug analysis. Results A total of 103 common DEGs were identified in the two datasets between RA and COVID-19. A protein-protein interaction (PPI) network was constructed using various combinatorial statistical methods and bioinformatics tools. Subsequently, hub genes and essential modules were identified from the PPI network. In addition, we performed functional analysis and pathway analysis under ontological conditions and found a common association between RA and the progression of COVID-19 infection. Finally, transcription factor-gene interactions, protein-drug interactions, and DEG-miRNA coregulatory networks with common DEGs were also identified in the datasets. Conclusion We identified the top 10 hub genes that could serve as novel therapeutic targets for COVID-19 and screened out some potential drugs useful for COVID-19 patients with RA.
Affiliation(s)
- Huan Hu
  - Department of Rheumatology and Immunology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
  - Clinical Medical College, Guizhou Medical University, Guiyang, China
- Nana Tang
  - Medical Intensive Care Unit, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Facai Zhang
  - Department of Urology/Institute of Urology, West China Hospital, Sichuan University, Chengdu, China
- Li Li
  - Medical Intensive Care Unit, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
- Long Li
  - Department of Rheumatology and Immunology, The Affiliated Hospital of Guizhou Medical University, Guiyang, China
10
Machine learning models for classification and identification of significant attributes to detect type 2 diabetes. Health Inf Sci Syst 2022; 10:2. [PMID: 35178244 PMCID: PMC8828812 DOI: 10.1007/s13755-021-00168-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/27/2021] [Accepted: 10/27/2021] [Indexed: 12/15/2022] Open
Abstract
Type 2 Diabetes (T2D) is a chronic disease characterized by abnormally high blood glucose levels due to insulin resistance and reduced pancreatic insulin production. The challenge of this work is to identify T2D-associated features that can distinguish T2D sub-types for prognosis and treatment purposes. We thus employed machine learning (ML) techniques to categorize T2D patients using data from the Pima Indian Diabetes Dataset from the Kaggle ML repository. After data preprocessing, several feature selection techniques were used to extract feature subsets, and a range of classification techniques were used to analyze these. We then compared the derived classification results to identify the best classifiers by considering accuracy, kappa statistics, area under the receiver operating characteristic (AUROC), sensitivity, specificity, and logarithmic loss (logloss). To evaluate the performance of different classifiers, we investigated their outcomes using the summary statistics with a resampling distribution. Therefore, Generalized Boosted Regression modeling showed the highest accuracy (90.91%), followed by kappa statistics (78.77%) and specificity (85.19%). In addition, Sparse Distance Weighted Discrimination, Generalized Additive Model using LOESS and Boosted Generalized Additive Models also gave the maximum sensitivity (100%), highest AUROC (95.26%) and lowest logarithmic loss (30.98%) respectively. Notably, the Generalized Additive Model using LOESS was the top-ranked algorithm according to non-parametric Friedman testing. Of the features identified by these machine learning models, glucose levels, body mass index, diabetes pedigree function, and age were consistently identified as the best and most frequently accurate outcome predictors. These results indicate the utility of ML methods in constructing improved prediction models for T2D and successfully identified outcome predictors for this Pima Indian population.
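The abstract above compares classifiers by accuracy, kappa, sensitivity, and specificity. As a hedged illustration of how those four figures relate to a single binary confusion matrix, the sketch below computes them from raw counts. The counts are invented and are unrelated to the Pima Indian results, and `binary_metrics` is a hypothetical helper.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, specificity, and Cohen's kappa.

    tp/fn/fp/tn are the counts of true positives, false negatives,
    false positives, and true negatives in a binary confusion matrix.
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Invented counts for illustration only.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa discounts the agreement a classifier would reach by chance, which is why it can sit well below accuracy on the same predictions.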
|
11
|
Network based systems biology approach to identify diseasome and comorbidity associations of Systemic Sclerosis with cancers. Heliyon 2022; 8:e08892. [PMID: 35198765 PMCID: PMC8841363 DOI: 10.1016/j.heliyon.2022.e08892] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2021] [Revised: 08/04/2021] [Accepted: 01/29/2022] [Indexed: 01/11/2023] Open
Abstract
Systemic Sclerosis (SSc) is an autoimmune disease in which the immune system attacks the body, and it is associated with changes in the skin's structure. A recent meta-analysis has reported a high incidence of cancers, including lung cancer (LC), leukemia (LK), and lymphoma (LP), as comorbidities in patients with SSc, but the underlying mechanistic details are yet to be revealed. To address this research gap, bioinformatics methodologies were developed to explore the comorbidity interactions between a pair of diseases. First, appropriate gene expression datasets on SSc and its comorbidities were collected from different repositories. Then the interconnection between SSc and its cancer comorbidities was identified by applying the developed pipeline. The pipeline was designed as a generic workflow for a presumed comorbid condition: it integrates gene expression data, tissue/organ metadata, Gene Ontology (GO), molecular pathways, and other online resources, and analyzes them with Gene Set Enrichment Analysis (GSEA), pathway enrichment, and Semantic Similarity (SS). The pipeline was implemented in R and can be accessed through our Github repository: https://github.com/hiddenntreasure/comorbidity. Our results suggest that SSc and its cancer comorbidities share differentially expressed genes, functional terms (gene ontology), and pathways. The findings have led to a better understanding of disease pathways, and our developed methodologies may be applied to any set of diseases to find associations between them. This research may be used by physicians, researchers, biologists, and others.
|
12
|
Wesołowski S, Lemmon G, Hernandez EJ, Henrie A, Miller TA, Weyhrauch D, Puchalski MD, Bray BE, Shah RU, Deshmukh VG, Delaney R, Yost HJ, Eilbeck K, Tristani-Firouzi M, Yandell M. An explainable artificial intelligence approach for predicting cardiovascular outcomes using electronic health records. PLOS DIGITAL HEALTH 2022; 1:e0000004. [PMID: 35373216 PMCID: PMC8975108 DOI: 10.1371/journal.pdig.0000004] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Accepted: 11/17/2021] [Indexed: 11/19/2022]
Abstract
Understanding the conditionally-dependent clinical variables that drive cardiovascular health outcomes is a major challenge for precision medicine. Here, we deploy a recently developed massively scalable comorbidity discovery method called Poisson Binomial based Comorbidity discovery (PBC), to analyze Electronic Health Records (EHRs) from the University of Utah and Primary Children's Hospital (over 1.6 million patients and 77 million visits) for comorbid diagnoses, procedures, and medications. Using explainable Artificial Intelligence (AI) methodologies, we then tease apart the intertwined, conditionally-dependent impacts of comorbid conditions and demography upon cardiovascular health, focusing on the key areas of heart transplant, sinoatrial node dysfunction and various forms of congenital heart disease. The resulting multimorbidity networks make possible wide-ranging explorations of the comorbid and demographic landscapes surrounding these cardiovascular outcomes, and can be distributed as web-based tools for further community-based outcomes research. The ability to transform enormous collections of EHRs into compact, portable tools devoid of Protected Health Information solves many of the legal, technological, and data-scientific challenges associated with large-scale EHR analyses.
Affiliation(s)
- Sergiusz Wesołowski
- Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
| | - Gordon Lemmon
- Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
| | - Edgar J. Hernandez
- Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
| | - Alex Henrie
- Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
| | - Thomas A. Miller
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
| | - Derek Weyhrauch
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
| | - Michael D. Puchalski
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
| | - Bruce E. Bray
- Division of Cardiovascular Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- University of Utah, Biomedical Informatics, Salt Lake City, UT 84108, United States of America
| | - Rashmee U. Shah
- Division of Cardiovascular Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America
| | - Vikrant G. Deshmukh
- University of Utah Health Care CMIO Office, Salt Lake City, UT, United States of America
| | - Rebecca Delaney
- Department of Population Health Sciences, University of Utah, Salt Lake City, UT, United States of America
| | - H. Joseph Yost
- Molecular Medicine Program, University of Utah, Salt Lake City, UT, United States of America
| | - Karen Eilbeck
- Department of Population Health Sciences, University of Utah, Salt Lake City, UT, United States of America
| | - Martin Tristani-Firouzi
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, United States of America
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, United States of America
| | - Mark Yandell
- Department of Human Genetics and Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, United States of America
| |
|
13
|
Identification of molecular signatures and pathways common to blood cells and brain tissue based RNA-Seq datasets of bipolar disorder: Insights from comprehensive bioinformatics approach. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.100881] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
|
14
|
Fränti P, Sieranoja S, Wikström K, Laatikainen T. Clustering Diagnoses from 58M Patient Visits in Finland 2015–2018 (Preprint). JMIR Med Inform 2021; 10:e35422. [PMID: 35507390 PMCID: PMC9118010 DOI: 10.2196/35422] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2021] [Revised: 02/25/2022] [Accepted: 03/02/2022] [Indexed: 12/21/2022] Open
Affiliation(s)
- Pasi Fränti
- Machine Learning Group, School of Computing, University of Eastern Finland, Joensuu, Finland
| | - Sami Sieranoja
- Machine Learning Group, School of Computing, University of Eastern Finland, Joensuu, Finland
| | - Katja Wikström
- Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland
- The Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| | - Tiina Laatikainen
- Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland
- The Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
| |
|
15
|
Nashiry MA, Sumi SS, Sharif Shohan MU, Alyami SA, Azad AKM, Moni MA. Bioinformatics and system biology approaches to identify the diseasome and comorbidities complexities of SARS-CoV-2 infection with the digestive tract disorders. Brief Bioinform 2021; 22:bbab126. [PMID: 33993223 PMCID: PMC8194728 DOI: 10.1093/bib/bbab126] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/01/2021] [Revised: 03/16/2021] [Accepted: 03/16/2021] [Indexed: 01/08/2023] Open
Abstract
Although Coronavirus Disease 2019 (COVID-19) most commonly presents with respiratory symptoms, a growing body of evidence reports its correlation with the digestive tract and faeces. Interestingly, recent studies have shown the association of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection with gastrointestinal symptoms in infected patients without any sign of respiratory issues. Moreover, some studies have also shown the presence of live SARS-CoV-2 virus in the faeces of patients with COVID-19. Therefore, the pathophysiology of digestive symptoms associated with COVID-19 has raised a critical need for comprehensive investigative efforts. To address this issue, we have developed a bioinformatics pipeline involving a systems biology framework to identify the effects of SARS-CoV-2 messenger RNA expression and to decipher its association with digestive symptoms in COVID-19 positive patients. Using two RNA-seq datasets derived from COVID-19 positive patients with celiac (CEL), Crohn's (CRO) and ulcerative colitis (ULC) as digestive disorders, we have found a significant overlap between the sets of differentially expressed genes from SARS-CoV-2 exposed tissue and digestive tract disordered tissues, reporting 7, 22 and 13 such overlapping genes, respectively. Moreover, gene set enrichment analysis and comprehensive analyses of the protein-protein interaction network, gene regulatory network, and protein-chemical agent interaction network revealed some critical associations between SARS-CoV-2 infection and the presence of digestive disorders. The infectome, diseasome and comorbidity analyses also revealed the influence of the identified signature genes on other risk factors of SARS-CoV-2 infection to human health.
We hope the findings from this pathogenetic analysis may reveal important insights into the complex interplay between COVID-19 and digestive disorders and underpin its significance for therapeutic development strategies to combat the COVID-19 pandemic.
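The overlap step described in this abstract, comparing two sets of differentially expressed genes, can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the function name `overlap_summary` and the placeholder gene identifiers are hypothetical, and the Jaccard index is added here as one common way to normalize the overlap.

```python
from typing import Iterable, List, Tuple


def overlap_summary(degs_a: Iterable[str], degs_b: Iterable[str]) -> Tuple[List[str], float]:
    """Return the shared genes (sorted) and the Jaccard index of two DEG lists."""
    a, b = set(degs_a), set(degs_b)
    shared = a & b          # genes differentially expressed in both conditions
    union = a | b           # genes differentially expressed in at least one
    jaccard = len(shared) / len(union) if union else 0.0
    return sorted(shared), jaccard


if __name__ == "__main__":
    # Placeholder identifiers, not real results from any study.
    infection_degs = ["GENE_1", "GENE_2", "GENE_3"]
    disorder_degs = ["GENE_2", "GENE_3", "GENE_4"]
    shared, score = overlap_summary(infection_degs, disorder_degs)
    print(shared, score)
```

A raw overlap count such as the 7, 22 and 13 genes reported above depends on the sizes of the input lists, which is why a normalized measure is often reported alongside it.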
Affiliation(s)
- Md Asif Nashiry
- Department of Computer Science and Engineering, Jashore University of Science and Technology, Jashore, Bangladesh
| | - Shauli Sarmin Sumi
- Department of Computer Science and Engineering, Jashore University of Science and Technology, Jashore, Bangladesh
| | | | - Salem A Alyami
- Department of Mathematics and Statistics, Faculty of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 13318, Saudi Arabia
| | - A K M Azad
- iThree Institute, Faculty of Science, University Technology of Sydney, Australia
| | - Mohammad Ali Moni
- WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Australia
- Healthy Ageing Theme, The Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
| |
|
16
|
Chowdhury UN, Faruqe MO, Mehedy M, Ahmad S, Islam MB, Shoombuatong W, Azad A, Moni MA. Effects of Bacille Calmette Guerin (BCG) vaccination during COVID-19 infection. Comput Biol Med 2021; 138:104891. [PMID: 34624759 PMCID: PMC8479467 DOI: 10.1016/j.compbiomed.2021.104891] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2021] [Revised: 09/21/2021] [Accepted: 09/21/2021] [Indexed: 12/16/2022]
Abstract
The coronavirus disease 2019 (COVID-19) is caused by infection with the highly contagious severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as the novel coronavirus. In most countries, the spread of this virus has not been contained, which is driving the pandemic towards a more difficult phase. In this study, we investigated the impact of the Bacille Calmette Guerin (BCG) vaccination on the severity and mortality of COVID-19 by performing transcriptomic analyses of SARS-CoV-2 infected and BCG vaccinated samples in peripheral blood mononuclear cells (PBMC). A set of common differentially expressed genes (DEGs) was identified and subjected to functional enrichment analyses via Gene Ontology (GO)-based functional terms and pre-annotated molecular pathway databases, and to Protein-Protein Interaction (PPI) network analysis. We further analysed the regulatory elements, possible comorbidities and putative drug candidates for COVID-19 patients who have not been BCG-vaccinated. Differential expression analyses of both BCG-vaccinated and COVID-19 infected samples identified 62 shared DEGs showing discordant expression patterns in their respective conditions compared to control. Next, PPI analysis of those DEGs revealed 10 hub genes, namely ITGB2, CXCL8, CXCL1, CCR2, IFNG, CCL4, PTGS2, ADORA3, TLR5 and CD33. Functional enrichment analyses found significantly enriched pathways/GO terms including cytokine activities, lysosome, the IL-17 signalling pathway, and TNF-signalling pathways. Moreover, a set of identified TFs, miRNAs and potential drug molecules was further investigated to assess their biological involvement in COVID-19 and their therapeutic possibilities. Findings showed significant genetic interactions between BCG vaccination and SARS-CoV-2 infection, suggesting an interesting prospect for the BCG vaccine in relation to the COVID-19 pandemic.
We hope these findings will prompt further research on this critical phenomenon to combat the spread of COVID-19.
Affiliation(s)
- Utpala Nanda Chowdhury
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Md Omar Faruqe
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Md Mehedy
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Shamim Ahmad
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - M. Babul Islam
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - A.K.M. Azad
- Faculty of Science, Engineering & Technology, Swinburne University of Technology Sydney, Australia
| | - Mohammad Ali Moni
- School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, Brisbane, QLD 4072, Australia. Corresponding author
| |
|
17
|
Rahman MH, Rana HK, Peng S, Kibria MG, Islam MZ, Mahmud SMH, Moni MA. Bioinformatics and system biology approaches to identify pathophysiological impact of COVID-19 to the progression and severity of neurological diseases. Comput Biol Med 2021; 138:104859. [PMID: 34601390 PMCID: PMC8483812 DOI: 10.1016/j.compbiomed.2021.104859] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2021] [Revised: 08/21/2021] [Accepted: 09/06/2021] [Indexed: 02/06/2023]
Abstract
Coronavirus Disease 2019 (COVID-19) continues to spread across the globe. Clinical and epidemiological analyses indicate a link between COVID-19 and Neurological Diseases (NDs) that drives the progression and severity of NDs. It remains unclear why COVID-19 influences the progression of NDs in some patients, and why some patients with NDs who are diagnosed with COVID-19 become increasingly sick while others do not. In this research, we investigated how COVID-19 and NDs interact and the impact of COVID-19 on the severity of NDs by performing transcriptomic analyses of COVID-19 and ND samples using a pipeline of bioinformatics and network-based approaches. The transcriptomic study identified the contributing genes, which were then filtered with cell signaling pathway, gene ontology, protein-protein interaction, transcription factor, and microRNA analyses. Identifying hub proteins using protein-protein interactions leads to the identification of a therapeutic strategy. Additionally, the incorporation of a comorbidity interaction score enhances the identification beyond simply detecting novel biological mechanisms involved in the pathophysiology of COVID-19 and its ND comorbidities. By computing the semantic similarity between COVID-19 and each ND, we found the maximum gene-based semantic score between COVID-19 and Parkinson's disease and the minimum between COVID-19 and multiple sclerosis. Similarly, we found the maximum gene ontology-based semantic score between COVID-19 and Huntington disease and the minimum between COVID-19 and epilepsy. Finally, we validated our findings using gold-standard databases and literature searches to determine which genes and pathways had previously been associated with COVID-19 and NDs.
Affiliation(s)
- Md Habibur Rahman
- Dept. of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
| | - Humayan Kabir Rana
- Dept. of Computer Science and Engineering, Green University of Bangladesh, Dhaka, Bangladesh
| | - Silong Peng
- Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100190, China
| | - Md Golam Kibria
- Dept. of Chemical and Petroleum Engineering, Schulich School of Engineering, University of Calgary, Canada
| | - Md Zahidul Islam
- Department of Electronics, Graduate School of Engineering, Nagoya University, Japan
| | - S M Hasan Mahmud
- Dept. of Computer Science, American International University Bangladesh, Dhaka, Bangladesh
| | - Mohammad Ali Moni
- School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia.
| |
|
18
|
Lemmon G, Wesolowski S, Henrie A, Tristani-Firouzi M, Yandell M. A Poisson binomial-based statistical testing framework for comorbidity discovery across electronic health record datasets. NATURE COMPUTATIONAL SCIENCE 2021; 1:694-702. [PMID: 35252879 PMCID: PMC8896515 DOI: 10.1038/s43588-021-00141-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/13/2020] [Accepted: 09/16/2021] [Indexed: 01/28/2023]
Abstract
Discovering the concomitant occurrence of distinct medical conditions in a patient, also known as comorbidities, is a prerequisite for creating patient outcome prediction tools. Current comorbidity discovery applications are designed for small datasets and use stratification to control for confounding variables such as age, sex or ancestry. Stratification lowers false positive rates, but reduces power, as the size of the study cohort is decreased. Here we describe a Poisson binomial-based approach to comorbidity discovery (PBC) designed for big-data applications that circumvents the need for stratification. PBC adjusts for confounding demographic variables on a per-patient basis and models temporal relationships. We benchmark PBC using two datasets to compute comorbidity statistics on 4,623,841 pairs of potentially comorbid medical terms. The results of this computation are provided as a searchable web resource. Compared with current methods, the PBC approach reduces false positive associations while retaining statistical power to discover true comorbidities.
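The statistical idea named in this abstract can be illustrated with a short sketch; it is not the PBC implementation. Assume, hypothetically, that a model has already produced for each patient i a probability p_i of carrying both medical terms by chance given that patient's demographics. Under independence between patients, the number of co-occurrences follows a Poisson binomial distribution, and an upper-tail p-value compares it with the observed count. The function names below are illustrative only.

```python
from typing import List, Sequence


def poisson_binomial_pmf(probs: Sequence[float]) -> List[float]:
    """Exact PMF of the number of successes among independent, non-identical Bernoulli trials."""
    pmf = [1.0]  # with zero trials, P(X = 0) = 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails, count stays at k
            nxt[k + 1] += mass * p       # this trial succeeds, count moves to k + 1
        pmf = nxt
    return pmf


def upper_tail_p_value(probs: Sequence[float], observed: int) -> float:
    """P(X >= observed) under the Poisson binomial null defined by `probs`."""
    pmf = poisson_binomial_pmf(probs)
    return sum(pmf[observed:])


if __name__ == "__main__":
    # Hypothetical per-patient chance probabilities for one pair of terms.
    per_patient = [0.03, 0.02, 0.005, 0.01]
    print(upper_tail_p_value(per_patient, observed=2))
```

The dynamic-programming loop costs O(n^2) for n patients, so a cohort of the size described above would need an approximation or a faster exact method; the sketch only shows why per-patient probabilities remove the need to stratify the cohort.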
Affiliation(s)
- Gordon Lemmon
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Sergiusz Wesolowski
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Alex Henrie
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Martin Tristani-Firouzi
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Mark Yandell
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- Utah Center for Genetic Discovery and Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| |
|
19
|
Chen K, Xu H, Lei Y, Lio P, Li Y, Guo H, Ali Moni M. Integration and interplay of machine learning and bioinformatics approach to identify genetic interaction related to ovarian cancer chemoresistance. Brief Bioinform 2021; 22:6272796. [PMID: 33971668 DOI: 10.1093/bib/bbab100] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2021] [Revised: 03/04/2021] [Accepted: 03/06/2021] [Indexed: 11/15/2022] Open
Abstract
Although chemotherapy is the first-line treatment for ovarian cancer (OCa) patients, chemoresistance (CR) decreases their progression-free survival. This paper investigates the genetic interaction (GI) related to OCa-CR. To decrease the complexity of establishing gene networks, individual signature genes related to OCa-CR are identified using a gradient boosting decision tree algorithm. Additionally, the genetic interaction coefficient (GIC) is proposed to measure the correlation of two signature genes quantitatively and explain their joint influence on OCa-CR. A gene pair that possesses a high GIC is identified as a signature pair. A total of 24 signature gene pairs, comprising 10 individual signature genes, are selected, and the influence of signature gene pairs on OCa-CR is explored. Finally, a signature gene pair-based prediction of OCa-CR is identified. The area under the curve (AUC) is a widely used performance measure for machine learning prediction. The AUC of the signature gene pair-based prediction reaches 0.9658, whereas the AUC of the individual signature gene-based prediction is only 0.6823. The identified signature gene pairs not only build an efficient GI network of OCa-CR but also provide an interesting way for OCa-CR prediction. This improvement shows that our proposed method is a useful tool to investigate GI related to OCa-CR.
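For readers unfamiliar with the AUC figures quoted in this abstract, the measure can be computed directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below is a generic illustration with made-up labels and scores; it is unrelated to the authors' data or code.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up example: 1 = chemoresistant, 0 = chemosensitive.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for the gap between the two values reported above.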
Affiliation(s)
- Kexin Chen
- School of Electronics Engineering and Computer Science, Peking University, 100871, Beijing, China
| | - Haoming Xu
- Department of Biomedical Engineering, Duke University, 27708, Durham, United States
| | - Yiming Lei
- School of Electronics Engineering and Computer Science, Peking University, 100871, Beijing, China
| | - Pietro Lio
- Computer Laboratory, University of Cambridge, CB3-0FD, Cambridge, United Kingdom
| | - Yuan Li
- Department of Obstetrics and Gynecology, Peking University Third Hospital, 100083, Beijing, China
| | - Hongyan Guo
- Department of Obstetrics and Gynecology, Peking University Third Hospital, 100083, Beijing, China
| | - Mohammad Ali Moni
- School of Public health and Community Medicine, University of New South Wales, 2052, Sydney, Australia
| |
|
20
|
Chowdhury UN, Ahmad S, Islam MB, Alyami SA, Quinn JMW, Eapen V, Moni MA. System biology and bioinformatics pipeline to identify comorbidities risk association: Neurodegenerative disorder case study. PLoS One 2021; 16:e0250660. [PMID: 33956862 PMCID: PMC8101720 DOI: 10.1371/journal.pone.0250660] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Accepted: 04/12/2021] [Indexed: 12/17/2022] Open
Abstract
Alzheimer's disease (AD) is the commonest progressive neurodegenerative condition in humans, and is currently incurable. A wide spectrum of comorbidities, including other neurodegenerative diseases, are frequently associated with AD. How AD interacts with those comorbidities can be examined by analysing gene expression patterns in affected tissues using bioinformatics tools. We surveyed public data repositories for available gene expression data on tissue from AD subjects and from people affected by neurodegenerative diseases that are often found as comorbidities with AD. We then utilized a large set of gene expression data, cell-related data and other public resources through an analytical process to identify functional disease links. This process incorporated gene set enrichment analysis and utilized semantic similarity to give proximity measures. We identified genes with abnormal expression that were common to AD and its comorbidities, as well as shared gene ontology terms and molecular pathways. Our methodological pipeline was implemented in the R platform as an open-source package and is available at the following link: https://github.com/unchowdhury/AD_comorbidity. The pipeline was thus able to identify factors and pathways that may constitute functional links between AD and these common comorbidities, by which they affect each other's development and progression. This pipeline can also be useful to identify key pathological factors and therapeutic targets for other diseases and disease interactions.
Affiliation(s)
- Utpala Nanda Chowdhury
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Shamim Ahmad
- Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - M. Babul Islam
- Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi, Bangladesh
| | - Salem A. Alyami
- Department of Mathematics and Statistics, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia
| | - Julian M. W. Quinn
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Valsamma Eapen
- School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia
| | - Mohammad Ali Moni
- Healthy Ageing Theme, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- School of Psychiatry, Faculty of Medicine, University of New South Wales, Sydney, Australia
- WHO Collaborating Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Sydney, Australia
| |
|
21
|
Nashiry A, Sarmin Sumi S, Islam S, Quinn JMW, Moni MA. Bioinformatics and system biology approach to identify the influences of COVID-19 on cardiovascular and hypertensive comorbidities. Brief Bioinform 2021; 22:1387-1401. [PMID: 33458761 PMCID: PMC7929376 DOI: 10.1093/bib/bbaa426] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/10/2020] [Revised: 12/06/2020] [Indexed: 01/08/2023] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infected individuals that have hypertension or cardiovascular comorbidities have an elevated risk of serious coronavirus disease 2019 (COVID-19) disease and high rates of mortality but how COVID-19 and cardiovascular diseases interact are unclear. We therefore sought to identify novel mechanisms of interaction by identifying genes with altered expression in SARS-CoV-2 infection that are relevant to the pathogenesis of cardiovascular disease and hypertension. Some recent research shows the SARS-CoV-2 uses the angiotensin converting enzyme-2 (ACE-2) as a receptor to infect human susceptible cells. The ACE2 gene is expressed in many human tissues, including intestine, testis, kidneys, heart and lungs. ACE2 usually converts Angiotensin I in the renin–angiotensin-aldosterone system to Angiotensin II, which affects blood pressure levels. ACE inhibitors prescribed for cardiovascular disease and hypertension may increase the levels of ACE-2, although there are claims that such medications actually reduce lung injury caused by COVID-19. We employed bioinformatics and systematic approaches to identify such genetic links, using messenger RNA data peripheral blood cells from COVID-19 patients and compared them with blood samples from patients with either chronic heart failure disease or hypertensive diseases. We have also considered the immune response genes with elevated expression in COVID-19 to those active in cardiovascular diseases and hypertension. Differentially expressed genes (DEGs) common to COVID-19 and chronic heart failure, and common to COVID-19 and hypertension, were identified; the involvement of these common genes in the signalling pathways and ontologies studied. COVID-19 does not share a large number of differentially expressed genes with the conditions under consideration. However, those that were identified included genes playing roles in T cell functions, toll-like receptor pathways, cytokines, chemokines, cell stress, type 2 diabetes and gastric cancer. We also identified protein–protein interactions, gene regulatory networks and suggested drug and chemical compound interactions using the differentially expressed genes. The result of this study may help in identifying significant targets of treatment that can combat the ongoing pandemic due to SARS-CoV-2 infection.
Affiliation(s)
- Asif Nashiry
  - Department of Computer Science and Engineering, Jashore University of Science and Technology, Bangladesh
- Shauli Sarmin Sumi
  - Department of Computer Science and Engineering, Jashore University of Science and Technology, Bangladesh
- Salequl Islam
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka-1342, Bangladesh
- Julian M W Quinn
  - Healthy Ageing Theme, The Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- Mohammad Ali Moni
  - Healthy Ageing Theme, The Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
  - WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Australia
|
22
|
Identification and Validation of a Potential Prognostic 7-lncRNA Signature for Predicting Survival in Patients with Multiple Myeloma. BIOMED RESEARCH INTERNATIONAL 2021; 2020:3813546. [PMID: 33204693 PMCID: PMC7661128 DOI: 10.1155/2020/3813546] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/24/2020] [Revised: 07/27/2020] [Accepted: 08/25/2020] [Indexed: 12/29/2022]
Abstract
BACKGROUND An increasing number of studies have indicated that the abnormal expression of certain long noncoding RNAs (lncRNAs) is linked to the overall survival (OS) of patients with myeloma. METHODS Gene expression data of myeloma patients were downloaded from the Gene Expression Omnibus (GEO) database (GSE4581 and GSE57317). Cox regression, Kaplan-Meier, and receiver operating characteristic (ROC) analyses were performed to construct and validate the prediction model. Single sample gene set enrichment analysis (ssGSEA) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were used to predict the function of a specified lncRNA. RESULTS In this study, a seven-lncRNA signature was identified and used to construct a risk score system for myeloma prognosis. This system was used to stratify patients with different survival rates in the training set into high-risk and low-risk groups. The results were validated in the test set, the entire test set, the external validation set, and the myeloma subtypes. In addition, functional enrichment analysis indicated that the 7 prognostic lncRNAs may be involved in the tumorigenesis of myeloma through cancer-related pathways and biological processes. The results of the immune score showed that IF_I was negatively correlated with the risk score. Compared with the published gene signature, the 7-lncRNA model has a higher C-index (above 0.8). CONCLUSION In summary, our data provide evidence that seven lncRNAs could be used as independent biomarkers to predict the prognosis of myeloma, and indicate that these 7 lncRNAs may be involved in the progression of myeloma.
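The risk score system described in this abstract can be illustrated with a minimal sketch. It assumes the common convention of a linear score (Cox coefficient times expression, summed over the signature) and a median cutoff; the abstract states neither, and the lncRNA names, coefficients and expression values below are hypothetical.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def stratify(patients, coefficients):
    """Label each patient high- or low-risk relative to the cohort median score.

    The median cutoff is an assumption made for this sketch.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}

# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"lncRNA_1": 0.8, "lncRNA_2": -0.5}
patients = {
    "P1": {"lncRNA_1": 2.0, "lncRNA_2": 1.0},
    "P2": {"lncRNA_1": 0.5, "lncRNA_2": 2.0},
    "P3": {"lncRNA_1": 1.0, "lncRNA_2": 0.0},
    "P4": {"lncRNA_1": 3.0, "lncRNA_2": 0.5},
}
groups = stratify(patients, coefficients)
```

In practice the coefficients would come from a fitted Cox model on the training set, and the same cutoff would then be reused on the validation cohorts.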
|
23
|
Mahmud SMH, Al-Mustanjid M, Akter F, Rahman MS, Ahmed K, Rahman MH, Chen W, Moni MA. Bioinformatics and system biology approach to identify the influences of SARS-CoV-2 infections to idiopathic pulmonary fibrosis and chronic obstructive pulmonary disease patients. Brief Bioinform 2021; 22:6224261. [PMID: 33847347 PMCID: PMC8083324 DOI: 10.1093/bib/bbab115] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 11/16/2020] [Revised: 02/25/2021] [Accepted: 03/13/2021] [Indexed: 12/15/2022] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, has become a threat to humanity. The second wave of the SARS-CoV-2 virus has hit many countries, and confirmed COVID-19 cases are rising quickly, so the epidemic remains in a severe stage. Idiopathic pulmonary fibrosis (IPF) and chronic obstructive pulmonary disease (COPD) are risk factors for COVID-19, but the molecular mechanisms that underlie IPF, COPD, and COVID-19 are not well understood. Therefore, we implemented transcriptomic analysis to detect common pathways and molecular biomarkers in IPF, COPD, and COVID-19 that help explain the linkage of SARS-CoV-2 to IPF and COPD patients. Here, three RNA-seq datasets (GSE147507, GSE52463, and GSE57148) from Gene Expression Omnibus (GEO) were employed to detect mutual differentially expressed genes (DEGs) for IPF and COPD patients with COVID-19 infection, in order to find shared pathways and candidate drugs. A total of 65 common DEGs among these three datasets were identified. Various combinatorial statistical methods and bioinformatics tools were used to build the protein–protein interaction (PPI) network and then to identify hub genes and essential modules from it. Moreover, we performed functional analysis under ontology terms and pathway analysis and found that IPF and COPD have some shared links to the progression of COVID-19 infection. Transcription factor–gene interactions, protein–drug interactions, and a DEG–miRNA coregulatory network with the common DEGs were also identified on the datasets. We think that the candidate drugs obtained by this study might be helpful as effective therapeutics for COVID-19.
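The step of detecting DEGs common to several datasets amounts to filtering each dataset and intersecting the results. The sketch below is only an illustration: the significance thresholds, the row layout `(gene, log2 fold change, adjusted p-value)` and the gene names are assumptions, not values taken from the study.

```python
def significant_genes(rows, max_adj_p=0.05, min_abs_log2fc=1.0):
    """Return the genes passing both thresholds in one dataset.

    rows: iterable of (gene, log2_fold_change, adjusted_p_value).
    The thresholds are assumed defaults for this sketch.
    """
    return {
        gene
        for gene, log2fc, adj_p in rows
        if adj_p < max_adj_p and abs(log2fc) >= min_abs_log2fc
    }

def common_degs(*datasets):
    """Intersect the significant gene sets of all datasets (direction is ignored)."""
    gene_sets = [significant_genes(rows) for rows in datasets]
    return set.intersection(*gene_sets) if gene_sets else set()

# Hypothetical toy tables standing in for three expression datasets.
covid = [("GENE_A", 2.1, 0.001), ("GENE_B", -1.5, 0.01), ("GENE_C", 0.2, 0.001)]
ipf = [("GENE_A", 1.2, 0.02), ("GENE_B", -2.0, 0.2), ("GENE_C", 1.5, 0.01)]
copd = [("GENE_A", -1.1, 0.03), ("GENE_B", -1.3, 0.04)]

shared = common_degs(covid, ipf, copd)
```

A stricter variant would also require the same direction of regulation in every dataset; that choice depends on the question being asked.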
Affiliation(s)
- S M Hasan Mahmud
  - Computer Science and Technology from the University of Electronic Science and Technology of China, China
- Farzana Akter
  - Computer Science and Engineering from Daffodil International University, Bangladesh
- Kawsar Ahmed
  - Information and Communication Technology (ICT) at Mawlana Bhashani Science and Technology University, Tangail, Bangladesh
- Md Habibur Rahman
  - Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Wenyu Chen
  - University of Electronic Science and Technology of China, China
|
24
|
Biswas S, Mitra P, Rao KS. Relation Prediction of Co-Morbid Diseases Using Knowledge Graph Completion. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021; 18:708-717. [PMID: 31295118 DOI: 10.1109/tcbb.2019.2927310] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/09/2023]
Abstract
A co-morbid disease condition refers to the simultaneous presence of one or more diseases along with the primary disease. A patient suffering from co-morbid diseases faces a higher mortality risk than a patient with a single disease, so it is necessary to predict co-morbid disease pairs. Although researchers have proposed several methods for predicting co-morbid diseases in past years, little work has been done on prediction using knowledge graph embedding based on tensor factorization. Moreover, complex-valued vector-based tensor factorization has not been used in any knowledge graph with biological and biomedical entities. We propose a tensor factorization based approach on biological knowledge graphs. Our method introduces the concept of complex-valued embedding in knowledge graphs with biological entities. Here, we build a knowledge graph with disease-gene associations and their corresponding background information. To predict the association between prevalent diseases, we use the ComplEx embedding based tensor decomposition method. In addition, we obtain new prevalent disease pairs using the MCL algorithm in a disease-gene-gene network and check their corresponding inter-relations using an edge prediction task.
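For readers unfamiliar with complex-valued embeddings, the ComplEx model scores a triple (subject, relation, object) as the real part of a trilinear product in which the object embedding is conjugated. The sketch below shows only that scoring function with random, untrained vectors; the dimension and the variable names are illustrative assumptions and no part of the authors' pipeline is reproduced.

```python
import numpy as np

def complex_score(subj, rel, obj):
    """ComplEx triple score: Re(sum_k subj_k * rel_k * conj(obj_k))."""
    return float(np.real(np.sum(subj * rel * np.conj(obj))))

rng = np.random.default_rng(0)
dim = 4  # embedding dimension, chosen arbitrarily for the example

def random_embedding():
    """Random complex vector standing in for a trained entity embedding."""
    return rng.normal(size=dim) + 1j * rng.normal(size=dim)

disease_a, disease_b = random_embedding(), random_embedding()

# A purely real relation vector yields a symmetric score,
# a purely imaginary one yields an antisymmetric score.
rel_real = rng.normal(size=dim).astype(complex)
rel_imag = 1j * rng.normal(size=dim)

sym_ab = complex_score(disease_a, rel_real, disease_b)
sym_ba = complex_score(disease_b, rel_real, disease_a)
anti_ab = complex_score(disease_a, rel_imag, disease_b)
anti_ba = complex_score(disease_b, rel_imag, disease_a)
```

The conjugate is what lets one model represent both symmetric relations (such as co-occurrence) and directed ones with the same entity vectors.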
|
25
|
A system biological approach to investigate the genetic profiling and comorbidities of type 2 diabetes. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100830] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 01/08/2023]
|
26
|
Ewing E, Planell-Picola N, Jagodic M, Gomez-Cabrero D. GeneSetCluster: a tool for summarizing and integrating gene-set analysis results. BMC Bioinformatics 2020; 21:443. [PMID: 33028195 PMCID: PMC7542881 DOI: 10.1186/s12859-020-03784-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/01/2019] [Accepted: 09/28/2020] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Gene-set analysis tools, which make use of curated sets of molecules grouped based on their shared functions, aim to identify which gene-sets are over-represented in the set of features that have been associated with a given trait of interest. Such tools are frequently used in gene-centric approaches derived from RNA-sequencing or microarrays such as Ingenuity or GSEA, but they have also been adapted for interval-based analysis derived from DNA methylation or ChIP/ATAC-sequencing. Gene-set analysis tools return, as a result, a list of significant gene-sets. However, while these results are useful for the researcher in the identification of major biological insights, they may be complex to interpret because many gene-sets have largely overlapping gene contents. Additionally, in many cases the result of gene-set analysis consists of a large number of gene-sets making it complicated to identify the major biological insights. RESULTS We present GeneSetCluster, a novel approach which allows clustering of identified gene-sets, from one or multiple experiments and/or tools, based on shared genes. GeneSetCluster calculates a distance score based on overlapping gene content, which is then used to cluster them together and as a result, GeneSetCluster identifies groups of gene-sets with similar gene-set definitions (i.e. gene content). These groups of gene-sets can aid the researcher to focus on such groups for biological interpretations. CONCLUSIONS GeneSetCluster is a novel approach for grouping together post gene-set analysis results based on overlapping gene content. GeneSetCluster is implemented as a package in R. The package and the vignette can be downloaded at https://github.com/TranslationalBioinformaticsUnit.
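The idea of grouping gene-sets by overlapping gene content can be sketched in a few lines. This is not the GeneSetCluster implementation (which is an R package): the Jaccard distance, the single-linkage grouping, the 0.5 threshold and the toy gene-set memberships are all assumptions made for illustration.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def group_gene_sets(gene_sets, max_distance=0.5):
    """Single-linkage grouping: sets within max_distance end up in one group."""
    names = list(gene_sets)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for x, y in combinations(names, 2):
        if jaccard_distance(gene_sets[x], gene_sets[y]) <= max_distance:
            parent[find(x)] = find(y)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())

# Toy gene-sets; the memberships are invented for the example.
example = {
    "cell_cycle_A": {"CDK1", "CCNB2", "TOP2A", "UBE2C"},
    "cell_cycle_B": {"CDK1", "CCNB2", "TOP2A", "PBK"},
    "immune": {"CD44", "LYN", "RIPK2"},
}
groups = group_gene_sets(example)
```

Here the two cell-cycle sets share three of five distinct genes (distance 0.4) and are merged, while the immune set shares none and stays alone.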
Affiliation(s)
- Ewoud Ewing
  - Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institutet, 171 77, Stockholm, Sweden
- Nuria Planell-Picola
  - Translational Bioinformatics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
- Maja Jagodic
  - Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institutet, 171 77, Stockholm, Sweden
- David Gomez-Cabrero
  - Translational Bioinformatics Unit, Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
  - Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, 171 77, Stockholm, Sweden
|
27
|
Al-Mustanjid M, Mahmud SMH, Royel MRI, Rahman MH, Islam T, Rahman MR, Moni MA. Detection of molecular signatures and pathways shared in inflammatory bowel disease and colorectal cancer: A bioinformatics and systems biology approach. Genomics 2020; 112:3416-3426. [PMID: 32535071 DOI: 10.1016/j.ygeno.2020.06.001] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/21/2020] [Revised: 05/03/2020] [Accepted: 06/02/2020] [Indexed: 02/07/2023]
Abstract
Emerging evidence indicates that inflammatory bowel disease (IBD) is a risk factor for the increasing incidence of colorectal cancer (CRC). We used a systems biology approach to identify common molecular signatures and pathways that interact between IBD and CRC, along with the underlying pathological mechanisms. First, we identified 177 common differentially expressed genes (DEGs) between IBD and CRC. Gene set enrichment, protein-protein, DEG-transcription factor, DEG-microRNA, protein-drug interaction, gene-disease association, Gene Ontology, and pathway enrichment analyses were conducted on these common genes. Integrating the common DEGs with biomolecular networks disclosed hub proteins (LYN, PLCB1, NPSR1, WNT5A, CDC25B, CD44, RIPK2, ASAP1), transcription factors (SCD, SLC7A5, IKZF3, SLC16A1, SLC7A11) and miRNAs (mir-335-5p, mir-26b-5p, mir-124-3p, mir-16-5p, mir-192-5p, mir-548c-3p, mir-29b-3p, mir-155-5p, mir-21-5p, mir-15a-5p). Protein-drug interaction analysis found that ASAP1 interacts with cysteine sulfonic acid and double oxidized cysteine drug compounds. Gene-disease association analysis showed that ASAP1 is also associated with pulmonary and bladder neoplasm diseases.
Affiliation(s)
- Md Al-Mustanjid
  - Department of Software Engineering, Faculty of Science and Information Technology, Daffodil International University, Dhaka 1207, Bangladesh
- S M Hasan Mahmud
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Md Rejaul Islam Royel
  - Department of Software Engineering, Faculty of Science and Information Technology, Daffodil International University, Dhaka 1207, Bangladesh
- Md Habibur Rahman
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh
- Tania Islam
  - Department of Biotechnology and Genetic Engineering, Faculty of Biological Sciences, Islamic University, Kushtia 7003, Bangladesh
- Md Rezanur Rahman
  - Department of Biochemistry and Biotechnology, School of Biomedical Science, Khwaja Yunus Ali University, Enayetpur, Sirajganj 6751, Bangladesh
- Mohammad Ali Moni
  - WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, Faculty of Medicine, UNSW Sydney, Australia
|
28
|
Hossain MA, Asa TA, Rahman MM, Uddin S, Moustafa AA, Quinn JMW, Moni MA. Network-Based Genetic Profiling Reveals Cellular Pathway Differences Between Follicular Thyroid Carcinoma and Follicular Thyroid Adenoma. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:E1373. [PMID: 32093341 PMCID: PMC7068514 DOI: 10.3390/ijerph17041373] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/30/2019] [Revised: 02/05/2020] [Accepted: 02/12/2020] [Indexed: 12/11/2022]
Abstract
Molecular mechanisms underlying the pathogenesis and progression of malignant thyroid cancers, such as follicular thyroid carcinomas (FTCs), and how these differ from benign thyroid lesions, are poorly understood. In this study, we employed network-based integrative analyses of FTC and benign follicular thyroid adenoma (FTA) lesion transcriptomes to identify key genes and pathways that differ between them. We first analysed a microarray gene expression dataset (Gene Expression Omnibus GSE82208, n = 52) obtained from FTC and FTA tissues to identify differentially expressed genes (DEGs). Pathway analyses of these DEGs were then performed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) resources to identify potentially important pathways, and protein-protein interactions (PPIs) were examined to identify pathway hub genes. Our data analysis identified 598 DEGs, 133 genes with higher and 465 genes with lower expression in FTCs. We identified four significant pathways (one carbon pool by folate, p53 signalling, progesterone-mediated oocyte maturation signalling, and cell cycle pathways) connected to DEGs with high FTC expression; eight pathways were connected to DEGs with lower relative FTC expression. Ten GO groups were significantly connected with FTC-high expression DEGs and 80 with low-FTC expression DEGs. PPI analysis then identified 12 potential hub genes based on degree and betweenness centrality; namely, TOP2A, JUN, EGFR, CDK1, FOS, CDKN3, EZH2, TYMS, PBK, CDH1, UBE2C, and CCNB2. Moreover, transcription factors (TFs) were identified that may underlie gene expression differences observed between FTC and FTA, including FOXC1, GATA2, YY1, FOXL1, E2F1, NFIC, SRF, TFAP2A, HINFP, and CREB1. We also identified microRNA (miRNAs) that may also affect transcript levels of DEGs; these included hsa-mir-335-5p, -26b-5p, -124-3p, -16-5p, -192-5p, -1-3p, -17-5p, -92a-3p, -215-5p, and -20a-5p. 
Thus, our study identified DEGs, molecular pathways, TFs, and miRNAs that reflect molecular mechanisms that differ between FTC and benign FTA. Given the general similarities of these lesions and common tissue origin, some of these differences may reflect malignant progression potential, and include useful candidate biomarkers for FTC and identifying factors important for FTC pathogenesis.
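Selecting hub genes by degree and betweenness centrality, as this abstract describes, can be illustrated on a toy interaction network. The sketch below computes both measures for an undirected, unweighted graph with a plain-Python version of Brandes' algorithm; the node names and edges are hypothetical, and how the study combined the two rankings is not stated in the abstract.

```python
from collections import deque

def degree(graph):
    """Number of interaction partners per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}

def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes) for an undirected graph.

    graph: dict mapping each node to a set of neighbours (symmetric).
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}   # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                    # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                    # accumulate dependencies in reverse order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}

# Hypothetical star-shaped network: one hub connected to three partners.
ppi = {"HUB": {"A", "B", "C"}, "A": {"HUB"}, "B": {"HUB"}, "C": {"HUB"}}
deg = degree(ppi)
btw = betweenness(ppi)
```

In the star network every shortest path between two partners passes through the hub, so it ranks first on both measures.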
Affiliation(s)
- Md. Ali Hossain
  - Department of Computer Science & Engineering, Manarat International University, Khagan, Dhaka 1343, Bangladesh
- Tania Akter Asa
  - Electrical and Electronic Engineering, Islamic University, Kushtia 7005, Bangladesh
- Md. Mijanur Rahman
  - Computer Science & Engineering, Jatiya Kabi Kazi Nazrul Islam University, Mymensingh 2205, Bangladesh
- Shahadat Uddin
  - Complex Systems Research Group & Project Management Program, Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia
- Ahmed A. Moustafa
  - Marcs Institute for Brain and Behaviour and School of Psychology, Western Sydney University, Sydney, NSW 2751, Australia
- Julian M. W. Quinn
  - Bone Biology Divisions, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- Mohammad Ali Moni
  - Bone Biology Divisions, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
  - WHO Collaborating Centre on eHealth, School of Public Health and Community Medicine, Faculty of Medicine, The University of New South Wales, Sydney, NSW 2052, Australia
|
29
|
Rana HK, Akhtar MR, Islam MB, Ahmed MB, Lió P, Huq F, Quinn JMW, Moni MA. Machine Learning and Bioinformatics Models to Identify Pathways that Mediate Influences of Welding Fumes on Cancer Progression. Sci Rep 2020; 10:2795. [PMID: 32066756 PMCID: PMC7026442 DOI: 10.1038/s41598-020-57916-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/21/2019] [Accepted: 12/21/2019] [Indexed: 12/13/2022] Open
Abstract
Welding generates and releases fumes that are hazardous to human health. Welding fumes (WFs) are a complex mix of metallic oxides, fluorides and silicates that can cause or exacerbate health problems in exposed individuals. In particular, WF inhalation over an extended period carries an increased risk of cancer, but how WFs may influence cancer behaviour or growth is unclear. To address this issue we employed a quantitative analytical framework to identify the gene expression effects of WFs that may affect the subsequent behaviour of the cancers. We examined datasets of transcript analyses made using microarray studies of WF-exposed tissues and of cancers, including datasets from colorectal cancer (CC), prostate cancer (PC), lung cancer (LC) and gastric cancer (GC). We constructed gene-disease association networks, identified signaling and ontological pathways, clustered the protein-protein interaction network using multilayer network topology, and analyzed the survival function of the significant genes using the Cox proportional hazards (Cox PH) model and the product-limit (PL) estimator. We observed that WF exposure alters the expression of many genes (36, 13, 25 and 17, respectively) whose expression is also altered in CC, PC, LC and GC. Gene-disease association networks, signaling and ontological pathways, the protein-protein interaction network, and survival functions of the significant genes suggest ways that WFs may influence the progression of CC, PC, LC and GC. This quantitative analytical framework has identified potentially novel mechanisms by which tissue WF exposure may lead to gene expression changes that affect cancer behaviour and, thus, cancer progression, growth or establishment.
Affiliation(s)
- Humayan Kabir Rana
  - Department of Computer Science and Engineering, Green University of Bangladesh, Dhaka, Bangladesh
- Mst Rashida Akhtar
  - Department of Computer Science and Engineering, Varendra University, Rajshahi, Bangladesh
- M Babul Islam
  - Department of Electrical and Electronic Engineering, University of Rajshahi, Rajshahi, Bangladesh
- Mohammad Boshir Ahmed
  - Bio-electronics Materials Laboratory, School of Materials Science and Engineering, Gwangju Institute of Science and Technology, 261 Cheomdan-gwagiro, Buk-gu, Gwangju, 500-712, Republic of Korea
- Pietro Lió
  - Computer Laboratory, Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
- Fazlul Huq
  - Discipline of Pathology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Julian M W Quinn
  - Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Mohammad Ali Moni
  - Discipline of Pathology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
  - Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
|
30
|
Rahman MH, Peng S, Hu X, Chen C, Rahman MR, Uddin S, Quinn JM, Moni MA. A Network-Based Bioinformatics Approach to Identify Molecular Biomarkers for Type 2 Diabetes that Are Linked to the Progression of Neurological Diseases. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:ijerph17031035. [PMID: 32041280 PMCID: PMC7037290 DOI: 10.3390/ijerph17031035] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 01/22/2020] [Revised: 02/02/2020] [Accepted: 02/02/2020] [Indexed: 12/21/2022]
Abstract
Neurological diseases (NDs) are progressive disorders, the progression of which can be significantly affected by a range of common diseases that present as comorbidities. Clinical studies, including epidemiological and neuropathological analyses, indicate that patients with type 2 diabetes (T2D) have worse progression of NDs, suggesting pathogenic links between NDs and T2D. However, finding causal or predisposing factors that link T2D and NDs remains challenging. To address these problems, we developed a high-throughput network-based quantitative pipeline using agnostic approaches to identify genes expressed abnormally in both T2D and NDs, to identify some of the shared molecular pathways that may underpin T2D and ND interaction. We employed gene expression transcriptomic datasets from control and disease-affected individuals and identified differentially expressed genes (DEGs) in tissues of patients with T2D and ND when compared to unaffected control individuals. One hundred and ninety seven DEGs (99 up-regulated and 98 down-regulated in affected individuals) that were common to both the T2D and the ND datasets were identified. Functional annotation of these identified DEGs revealed the involvement of significant cell signaling associated molecular pathways. The overlapping DEGs (i.e., seen in both T2D and ND datasets) were then used to extract the most significant GO terms. We performed validation of these results with gold benchmark databases and literature searching, which identified which genes and pathways had been previously linked to NDs or T2D and which are novel. Hub proteins in the pathways were identified (including DNM2, DNM1, MYH14, PACSIN2, TFRC, PDE4D, ENTPD1, PLK4, CDC20B, and CDC14A) using protein-protein interaction analysis which have not previously been described as playing a role in these diseases. 
To reveal the transcriptional and post-transcriptional regulators of the DEGs we used transcription factor (TF) interaction analysis and DEG-microRNA (miRNA) interaction analysis, respectively. We thus identified the following TFs as important in driving expression of our T2D/ND common genes: FOXC1, GATA2, FOXL1, YY1, E2F1, NFIC, NFYA, USF2, HINFP, MEF2A, SRF, NFKB1, PDE4D, CREB1, SP1, HOXA5, SREBF1, TFAP2A, STAT3, POU2F2, TP53, PPARG, and JUN. MicroRNAs that affect expression of these genes include mir-335-5p, mir-16-5p, mir-93-5p, mir-17-5p, and mir-124-3p. Thus, our transcriptomic data analysis identifies novel potential links between ND and T2D pathologies that may underlie comorbidity interactions, links that may include potential targets for therapeutic intervention. In sum, our neighborhood-based benchmarking and multilayer network topology methods identified novel putative biomarkers that indicate how T2D and these neurological diseases interact, as well as pathways that may be targeted for treatment in the future.
Affiliation(s)
- Md Habibur Rahman
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100190, China
  - Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
- Silong Peng
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100190, China
- Xiyuan Hu
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100190, China
- Chen Chen
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100190, China
- Md Rezanur Rahman
  - Department of Biochemistry and Biotechnology, Khwaja Yunus Ali University, Enayetpur, Sirajgonj 6751, Bangladesh
- Shahadat Uddin
  - Complex Systems Research Group & Project Management Program, Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia
- Julian M.W. Quinn
  - Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- Mohammad Ali Moni
  - Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
  - School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
|
31
|
Identification of Genetic Links of Thyroid Cancer to the Neurodegenerative and Chronic Diseases Progression: Insights from Systems Biology Approach. PROCEEDINGS OF INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE 2020. [DOI: 10.1007/978-981-15-3607-6_21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
|
32
|
A systems biology approach to identifying genetic factors affected by aging, lifestyle factors, and type 2 diabetes that influences Parkinson's disease progression. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2020.100448] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022] Open
|
33
|
Network-based identification of genetic factors in ageing, lifestyle and type 2 diabetes that influence to the progression of Alzheimer's disease. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2020.100309] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/30/2022] Open
|
34
|
Ali Hossain M, Asa TA, Huq F, Quinn JMW, Moni MA. A Network-Based Approach to Identify Molecular Signatures and Comorbidities of Thyroid Cancer. PROCEEDINGS OF INTERNATIONAL JOINT CONFERENCE ON COMPUTATIONAL INTELLIGENCE 2020. [DOI: 10.1007/978-981-13-7564-4_21] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]
|
35
|
Akram P, Liao L. Prediction of comorbid diseases using weighted geometric embedding of human interactome. BMC Med Genomics 2019; 12:161. [PMID: 31888634 PMCID: PMC6936100 DOI: 10.1186/s12920-019-0605-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/01/2019] [Accepted: 10/16/2019] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Comorbidity is the phenomenon of two or more diseases occurring simultaneously, not by random chance, and presents great challenges to accurate diagnosis and treatment. As an effort toward better understanding the genetic causes of comorbidity, in this work, we have developed a computational method to predict comorbid diseases. Two diseases that share genes tend to be more comorbid. Previous work shows that after mapping the associated genes onto the human interactome the distance between the two disease modules (subgraphs) is correlated with comorbidity. METHODS To fully incorporate structural characteristics of the interactome as features into prediction of comorbidity, our method embeds the human interactome into a high dimensional geometric space with weights assigned to the network edges and uses the projection onto different dimensions to "fingerprint" disease modules. A supervised machine learning classifier is then trained to discriminate comorbid diseases versus non-comorbid diseases. RESULTS In cross-validation using a benchmark dataset of more than 10,000 disease pairs, we report that our model achieves an ROC score of 0.90 for a comorbidity threshold at relative risk RR = 0 and 0.76 for a comorbidity threshold at RR = 1, and significantly outperforms the previous method and the interactome generated by annotated data. To further incorporate prior knowledge of pathway associations with diseases, we weight the protein-protein interaction network edges according to their frequency of occurring in those pathways in such a way that edges with higher frequency are more likely to be selected in the minimum spanning tree for geometric embedding. Such weighted embedding is shown to lead to further improvement of comorbid disease prediction.
CONCLUSION The work demonstrates that embedding the two-dimensional planar graph of human interactome into a high dimensional geometric space allows for characterizing and capturing disease modules (subgraphs formed by the disease associated genes) from multiple perspectives, and hence provides enriched features for a supervised classifier to discriminate comorbid disease pairs from non-comorbid disease pairs more accurately than module separation alone.
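The pathway-based edge weighting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an edge cost of 1 / (1 + pathway frequency), so that edges seen in more pathways are cheaper and are considered first by Kruskal's minimum spanning tree algorithm. The function name, protein labels and counts are hypothetical.

```python
def weighted_mst(edges):
    """Kruskal's algorithm on (u, v, pathway_frequency) triples.

    Assumption (not from the abstract): cost = 1 / (1 + frequency),
    so edges occurring in more pathways are cheaper and tried first.
    """
    parent = {}

    def find(x):
        # Union-find root lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, freq in sorted(edges, key=lambda e: 1.0 / (1.0 + e[2])):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v, freq))
    return tree


# Hypothetical toy interactome: (protein, protein, pathway frequency)
toy_edges = [
    ("A", "B", 5),
    ("B", "C", 0),
    ("A", "C", 3),
    ("C", "D", 1),
]
mst = weighted_mst(toy_edges)
```

In this toy graph the edge that appears in no pathway (B, C) is the one left out of the tree, which is the behaviour the abstract describes for low-frequency edges.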
Affiliation(s)
- Pakeeza Akram
  - School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), H-12, Islamabad, Pakistan
  - Department of Computer Science, University of Delaware, Newark, USA
- Li Liao
  - Department of Computer Science, University of Delaware, Newark, USA
|
36
|
Hossain MA, Saiful Islam SM, Quinn JM, Huq F, Moni MA. Machine learning and bioinformatics models to identify gene expression patterns of ovarian cancer associated with disease progression and mortality. J Biomed Inform 2019; 100:103313. [DOI: 10.1016/j.jbi.2019.103313] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 02/16/2019] [Revised: 09/20/2019] [Accepted: 10/13/2019] [Indexed: 02/07/2023]
|
37
|
Chen X, Shi W, Deng L. Prediction of Disease Comorbidity Using HeteSim Scores based on Multiple Heterogeneous Networks. Curr Gene Ther 2019; 19:232-241. [DOI: 10.2174/1566523219666190917155959] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/30/2019] [Revised: 06/14/2019] [Accepted: 06/16/2019] [Indexed: 12/25/2022]
Abstract
Background: Accumulating experimental studies have indicated that disease comorbidity causes additional pain to patients and leads to the failure of standard treatments compared to patients who have a single disease. Therefore, accurate prediction of potential comorbidity is essential to design more efficient treatment strategies. However, only a few disease comorbidities have been discovered in the clinic.
Objective: In this work, we propose PCHS, an effective computational method for predicting disease comorbidity.
Materials and Methods: We utilized the HeteSim measure to calculate the relatedness score for different disease pairs in the global heterogeneous network, which integrates six networks based on biological information, including disease-disease associations, drug-drug interactions, protein-protein interactions and associations among them. We built the prediction model using the Support Vector Machine (SVM) based on the HeteSim scores.
Results and Conclusion: The results showed that PCHS performed significantly better than previous state-of-the-art approaches and achieved an AUC score of 0.90 in 10-fold cross-validation. Furthermore, some of our predictions have been verified in the literature, indicating the effectiveness of our method.
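As a rough illustration of the relatedness score mentioned in this abstract, the sketch below computes a HeteSim-style score along a single disease-gene-disease meta-path. It assumes the normalized form of the measure, in which the score for a symmetric length-2 path is the cosine between the two diseases' transition-probability vectors over genes. It does not reproduce PCHS itself: the six-network integration and the SVM step are omitted, and the function name and toy associations are hypothetical.

```python
import math


def hetesim_dgd(disease_genes, a, b):
    """HeteSim-style relatedness of diseases a and b along the
    disease-gene-disease meta-path (assumed cosine form)."""

    def transition(d):
        # Uniform probability of stepping from disease d to each linked gene.
        genes = disease_genes[d]
        return {g: 1.0 / len(genes) for g in genes} if genes else {}

    pa, pb = transition(a), transition(b)
    dot = sum(p * pb.get(g, 0.0) for g, p in pa.items())
    norm_a = math.sqrt(sum(p * p for p in pa.values()))
    norm_b = math.sqrt(sum(p * p for p in pb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


# Hypothetical disease-gene associations
toy = {
    "d1": {"g1", "g2"},
    "d2": {"g2", "g3"},
    "d3": {"g4"},
}
score_related = hetesim_dgd(toy, "d1", "d2")
score_unrelated = hetesim_dgd(toy, "d1", "d3")
```

Scores of this kind, computed over several meta-paths, would form the feature vector passed to a classifier such as the SVM the abstract mentions.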
Affiliation(s)
- Xuegong Chen
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Wanwan Shi
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
|
38
|
Gutiérrez-Sacristán A, Bravo À, Giannoula A, Mayer MA, Sanz F, Furlong LI. comoRbidity: an R package for the systematic analysis of disease comorbidities. Bioinformatics 2019; 34:3228-3230. [PMID: 29897411 PMCID: PMC6137966 DOI: 10.1093/bioinformatics/bty315] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/01/2017] [Accepted: 04/19/2018] [Indexed: 12/11/2022] Open
Abstract
Motivation The study of comorbidities is a major priority due to their impact on life expectancy, quality of life and healthcare cost. The availability of electronic health records (EHRs) for data mining offers the opportunity to discover disease associations and comorbidity patterns from the clinical history of patients gathered during routine medical care. This creates a need for analytical tools for detection of disease comorbidities, including the investigation of their underlying genetic basis. Results We present comoRbidity, an R package aimed at providing a systematic and comprehensive analysis of disease comorbidities from both the clinical and molecular perspectives. comoRbidity leverages (i) user-provided clinical data from EHR databases (the clinical comorbidity analysis) and (ii) genotype-phenotype information of the diseases under study (the molecular comorbidity analysis) for a comprehensive analysis of disease comorbidities. The clinical comorbidity analysis enables identifying significant disease comorbidities from clinical data, including sex and age stratification and temporal directionality analyses, while the molecular comorbidity analysis supports the generation of hypotheses on the underlying mechanisms of the disease comorbidities by exploring shared genes among disorders. The open-source comoRbidity package is a software tool aimed at expediting the integrative analysis of disease comorbidities by incorporating several analytical and visualization functions. Availability and implementation https://bitbucket.org/ibi_group/comorbidity Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alba Gutiérrez-Sacristán
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Àlex Bravo
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Large-Scale Text Understanding Systems Lab, TALN Research Group, Department of Information and Communication Technologies (DTIC), Universitat Pompeu Fabra, Barcelona, Spain
- Alexia Giannoula
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Miguel A Mayer
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Ferran Sanz
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Laura I Furlong
  - Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
39
|
A computational approach to identify blood cell-expressed Parkinson's disease biomarkers that are coordinately expressed in brain tissue. Comput Biol Med 2019; 113:103385. [PMID: 31437626 DOI: 10.1016/j.compbiomed.2019.103385] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 04/18/2019] [Revised: 08/06/2019] [Accepted: 08/07/2019] [Indexed: 01/09/2023]
Abstract
Identification of genes whose regulation of expression is functionally similar in both brain tissue and blood cells could in principle enable monitoring of significant neurological traits and disorders by analysis of blood samples. We thus employed transcriptional analysis of pathologically affected tissues, using agnostic approaches to identify overlapping gene functions and integrating this transcriptomic information with expression quantitative trait loci (eQTL) data. Here, we estimate the correlation of gene expression in the top-associated cis-eQTLs of brain tissue and blood cells in Parkinson's Disease (PD). We introduced quantitative frameworks to reveal the complex relationship of various biasing genetic factors in PD, a neurodegenerative disease. We examined gene expression microarray and RNA-Seq datasets from human brain and blood tissues from PD-affected and control individuals. Differentially expressed genes (DEG) were identified for both brain and blood cells to determine common DEG overlaps. Based on neighborhood-based benchmarking and multilayer network topology approaches we then developed genetic associations of factors with PD. Overlapping DEG sets underwent gene enrichment using pathway analysis and gene ontology methods, which identified candidate common genes and pathways. We identified 12 significantly dysregulated genes shared by brain and blood cells, which were validated using dbGaP (gene SNP-disease linkage) database for gold-standard benchmarking of their significance in disease processes. Ontological and pathway analyses identified significant gene ontology and molecular pathways that indicate PD progression. In sum, we found possible novel links between pathological processes in brain tissue and blood cells by examining cell pathway commonalities, corroborating these associations using well validated datasets. 
This demonstrates that for brain-related pathologies combining gene expression analysis and blood cell cis-eQTL is a potentially powerful analytical approach. Thus, our methodologies facilitate data-driven approaches that can advance knowledge of disease mechanisms and may, with clinical validation, enable prediction of neurological dysfunction using blood cell transcript profiling.
|
40
|
Rana HK, Akhtar MR, Islam MB, Ahmed MB, Liò P, Quinn JMW, Huq F, Moni MA. Genetic effects of welding fumes on the development of respiratory system diseases. Comput Biol Med 2019; 108:142-149. [PMID: 31005006 DOI: 10.1016/j.compbiomed.2019.04.004] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/13/2019] [Revised: 04/03/2019] [Accepted: 04/04/2019] [Indexed: 12/19/2022]
Abstract
BACKGROUND The welding process releases potentially hazardous gases and fumes, mainly composed of metallic oxides, fluorides and silicates. Long term welding fume (WF) inhalation is a recognized health issue that carries a risk of developing chronic health problems, particularly respiratory system diseases (RSDs). Aside from general airway irritation, WF exposure may drive direct cellular responses in the respiratory system which increase risk of RSD, but these are not well understood. METHODS We developed a quantitative framework to identify gene expression effects of WF exposure that may affect RSD development. We analyzed gene expression microarray data from WF-exposed tissues and RSD-affected tissues, including chronic bronchitis (CB), asthma (AS), pulmonary edema (PE), lung cancer (LC) datasets. We built disease-gene (diseasome) association networks and identified dysregulated signaling and ontological pathways, and protein-protein interaction sub-network using neighborhood-based benchmarking and multilayer network topology. RESULTS We observed many genes with altered expression in WF-exposed tissues were also among differentially expressed genes (DEGs) in RSD tissues; for CB, AS, PE and LC there were 34, 27, 50 and 26 genes respectively. DEG analysis, using disease association networks, pathways, ontological analysis and protein-protein interaction sub-network suggest significant links between WF exposure and the development of CB, AS, PE and LC. CONCLUSIONS Our network-based analysis and investigation of the genetic links of WFs and RSDs confirm a number of genes and gene products are plausible participants in RSD development. Our results are a significant resource to identify causal influences on the development of RSDs, particularly in the context of WF exposure.
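The overlap counts reported in this abstract come from comparing differentially expressed gene (DEG) sets. A minimal sketch of that comparison is a set intersection per disease. The gene symbols below are placeholders rather than data from the study, and the function name is hypothetical.

```python
def shared_degs(exposure_degs, disease_degs):
    """Map each disease to the sorted DEGs it shares with the exposure set."""
    return {
        disease: sorted(exposure_degs & degs)
        for disease, degs in disease_degs.items()
    }


# Placeholder gene symbols, for illustration only
wf_degs = {"G1", "G2", "G3", "G4"}
rsd_degs = {
    "CB": {"G1", "G2", "G9"},
    "AS": {"G3", "G7"},
}
overlap = shared_degs(wf_degs, rsd_degs)
counts = {disease: len(genes) for disease, genes in overlap.items()}
```

The resulting per-disease gene lists are what a downstream pathway or protein-protein interaction analysis would take as input.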
Affiliation(s)
- Humayan Kabir Rana
  - Department of Computer Science and Engineering, Green University of Bangladesh, Bangladesh
- Mst Rashida Akhtar
  - Department of Computer Science and Engineering, Varendra University, Rajshahi, Bangladesh
- M Babul Islam
  - Department of Applied Physics and Electronic Engineering, University of Rajshahi, Bangladesh
- Mohammad Boshir Ahmed
  - School of Civil and Environmental Engineering, University of Technology Sydney, NSW, 2007, Australia
- Pietro Liò
  - Computer Laboratory, The University of Cambridge, 15 JJ Thomson Avenue, Cambridge, UK
- Julian M W Quinn
  - Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Fazlul Huq
  - Discipline of Pathology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Australia
- Mohammad Ali Moni
  - Bone Biology Division, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
  - Discipline of Pathology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Australia
|
41
|
Rana HK, Akhtar MR, Ahmed MB, Liò P, Quinn JM, Huq F, Moni MA. Genetic effects of welding fumes on the progression of neurodegenerative diseases. Neurotoxicology 2019; 71:93-101. [DOI: 10.1016/j.neuro.2018.12.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 11/08/2018] [Revised: 12/03/2018] [Accepted: 12/05/2018] [Indexed: 12/14/2022]
|
42
|
Hossain MA, Asa TA, Rahman MR, Moni MA. Network-based approach to identify key candidate genes and pathways shared by thyroid cancer and chronic kidney disease. INFORMATICS IN MEDICINE UNLOCKED 2019. [DOI: 10.1016/j.imu.2019.100240] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022] Open
|
43
|
Del Prete E, Facchiano A, Liò P. Bioinformatics methodologies for coeliac disease and its comorbidities. Brief Bioinform 2018; 21:355-367. [PMID: 30452543 DOI: 10.1093/bib/bby109] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/08/2018] [Revised: 10/01/2018] [Accepted: 10/11/2018] [Indexed: 12/30/2022] Open
Abstract
Coeliac disease (CD) is a complex, multifactorial pathology caused by different factors, such as nutrition, immunological response and genetic factors. Many autoimmune diseases are comorbidities for CD, and a comprehensive and integrated analysis with bioinformatics approaches can help in evaluating the interconnections among all the selected pathologies. We first performed a detailed survey of gene expression data available in public repositories on CD and less commonly considered comorbidities. Then we developed an innovative pipeline that integrates gene expression, cell-type data and online resources (e.g. a list of comorbidities from the literature), using bioinformatics methods such as gene set enrichment analysis and semantic similarity. Our pipeline is written in R and is available at http://bioinformatica.isa.cnr.it/COELIAC_DISEASE/SCRIPTS/. We found a list of common differentially expressed genes, gene ontology terms and pathways among CD and its comorbidities, as well as the closeness among the selected pathologies by means of disease ontology terms. Physicians and other researchers, such as molecular biologists, systems biologists and pharmacologists, can use it to analyze a pathology in detail, from differentially expressed genes to ontologies, comparing it with its comorbidities or with other diseases.
Affiliation(s)
- Eugenio Del Prete
  - Department of Sciences, University of Basilicata, Via dell'Ateneo Lucano, Potenza, Italy
  - National Research Council, Institute of Food Science (CNR-ISA), Via Roma 64, Avellino, Italy
  - Computer Laboratory, University of Cambridge, JJ Thomson Ave., Cambridge, UK
- Angelo Facchiano
  - National Research Council, Institute of Food Science (CNR-ISA), Via Roma 64, Avellino, Italy
- Pietro Liò
  - Computer Laboratory, University of Cambridge, JJ Thomson Ave., Cambridge, UK
|
44
|
Scatà M, Di Stefano A, La Corte A, Liò P. Quantifying the propagation of distress and mental disorders in social networks. Sci Rep 2018; 8:5005. [PMID: 29568086 PMCID: PMC5864966 DOI: 10.1038/s41598-018-23260-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/25/2017] [Accepted: 03/07/2018] [Indexed: 01/18/2023] Open
Abstract
Heterogeneity among human beings leads them to think and react differently to social phenomena. Awareness and homophily drive people to weigh interactions in social multiplex networks, influencing a potential contagion effect. To quantify the impact of heterogeneity on spreading dynamics, we propose a model of coevolution of social contagion and awareness, through the introduction of statistical estimators, in a weighted multiplex network. Multiplexity of networked individuals may trigger propagation sufficient to produce effects among vulnerable subjects experiencing distress and mental disorders, which represent some of the strongest predictors of suicidal behaviours. Exposure to suicide is emotionally harmful, since talking about it may give support or inadvertently promote it. To disclose the complex effect of overlapping awareness on the spread of suicidal ideation among people with mental disorders, we also introduce a data-driven approach by integrating different types of data. Our modelling approach unveils the relationship between the propagation of distress and mental disorders and the spread of suicidal ideation, shedding light on the role of awareness in a social network for suicide prevention. The proposed model is able to quantify the impact of overlapping awareness on suicidal ideation spreading and our findings demonstrate that it plays a dual role in contagion, either reinforcing or delaying the contagion outbreak.
Affiliation(s)
- Marialisa Scatà
  - University of Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Catania, CNIT 95125, Italy
- Alessandro Di Stefano
  - University of Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Catania, CNIT 95125, Italy
- Aurelio La Corte
  - University of Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Catania, CNIT 95125, Italy
- Pietro Liò
  - University of Cambridge, Computer Laboratory, Cambridge, CB3 0FD, UK
|
45
|
He F, Zhu G, Wang YY, Zhao XM, Huang DS. PCID: A Novel Approach for Predicting Disease Comorbidity by Integrating Multi-Scale Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:678-686. [PMID: 27076462 DOI: 10.1109/tcbb.2016.2550443] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 06/05/2023]
Abstract
Disease comorbidity is the presence of one or more diseases along with a primary disorder, which causes additional pain to patients and leads to the failure of standard treatments compared with single diseases. Therefore, the identification of potential comorbidity can help prevent those comorbid diseases when treating a primary disease. Unfortunately, most currently known disease comorbidities were discovered by chance in the clinic, and our knowledge about comorbidity is far from complete. Although many efforts have been made to predict disease comorbidity, the prediction accuracy of existing computational approaches needs to be improved. By investigating the factors underlying disease comorbidity, e.g., mutated genes and rewired protein-protein interactions (PPIs), we here present a novel algorithm to predict disease comorbidity by integrating multi-scale data ranging from genes to phenotypes. Benchmark results on real data show that our approach outperforms existing algorithms, and some of our novel predictions are validated against those reported in the literature, indicating the effectiveness and predictive power of our approach. In addition, we identify some pathway and PPI patterns that underlie the co-occurrence between a primary disease and certain disease classes, which can help explain how the comorbidity is initiated from a molecular perspective.
|
46
|
Gomez-Cabrero D, Menche J, Vargas C, Cano I, Maier D, Barabási AL, Tegnér J, Roca J. From comorbidities of chronic obstructive pulmonary disease to identification of shared molecular mechanisms by data integration. BMC Bioinformatics 2016; 17:441. [PMID: 28185567 PMCID: PMC5133493 DOI: 10.1186/s12859-016-1291-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022] Open
Abstract
Background Deep mining of healthcare data has provided maps of comorbidity relationships between diseases. In parallel, integrative multi-omics investigations have generated high-resolution molecular maps of putative relevance for understanding disease initiation and progression. Yet, it is unclear how to advance an observation of comorbidity relations (one disease to others) to a molecular understanding of the driver processes and associated biomarkers. Results Since Chronic Obstructive Pulmonary disease (COPD) has emerged as a central hub in temporal comorbidity networks, we developed a systematic integrative data-driven framework to identify shared disease-associated genes and pathways, as a proxy for the underlying generative mechanisms inducing comorbidity. We integrated records from approximately 13 M patients from the Medicare database with disease-gene maps that we derived from several resources including a semantic-derived knowledge-base. Using rank-based statistics we not only recovered known comorbidities but also discovered a novel association between COPD and digestive diseases. Furthermore, our analysis provides the first set of COPD co-morbidity candidate biomarkers, including IL15, TNF and JUP, and characterizes their association to aging and life-style conditions, such as smoking and physical activity. Conclusions The developed framework provides novel insights in COPD and especially COPD co-morbidity associated mechanisms. The methodology could be used to discover and decipher the molecular underpinning of other comorbidity relationships and furthermore, allow the identification of candidate co-morbidity biomarkers. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1291-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- David Gomez-Cabrero
  - Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Stockholm, 171 77, Sweden
  - Karolinska Institutet, Center for Molecular Medicine, Stockholm, 171 77, Sweden
  - Department of Medicine, Unit of Clinical Epidemiology, Karolinska University Hospital, Solna, L8, 17176, Sweden
  - Science for Life Laboratory, Solna, 17121, Sweden
  - Mucosal and Salivary Biology Division, King's College London Dental Institute, London, UK
- Jörg Menche
  - Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, MA, USA
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Center for Network Science, Central European University, Budapest, Hungary
- Claudia Vargas
  - Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain
  - Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain
- Isaac Cano
  - Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain
  - Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain
- Albert-László Barabási
  - Center for Complex Networks Research and Department of Physics, Northeastern University, Boston, MA, USA
  - Center for Cancer Systems Biology (CCSB) and Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
  - Center for Network Science, Central European University, Budapest, Hungary
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Jesper Tegnér
  - Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Stockholm, 171 77, Sweden
  - Karolinska Institutet, Center for Molecular Medicine, Stockholm, 171 77, Sweden
  - Department of Medicine, Unit of Clinical Epidemiology, Karolinska University Hospital, Solna, L8, 17176, Sweden
  - Science for Life Laboratory, Solna, 17121, Sweden
- Josep Roca
  - Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Hospital Clinic de Barcelona, Universitat de Barcelona, Barcelona, Spain
  - Center for Biomedical Network Research in Respiratory Diseases (CIBERES), Madrid, Spain
|
47
|
Scatà M, Di Stefano A, Liò P, La Corte A. The Impact of Heterogeneity and Awareness in Modeling Epidemic Spreading on Multiplex Networks. Sci Rep 2016; 6:37105. [PMID: 27848978 PMCID: PMC5111071 DOI: 10.1038/srep37105] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 08/02/2016] [Accepted: 10/25/2016] [Indexed: 12/18/2022] Open
Abstract
In the real world, dynamic processes involving human beings are not disjoint. To capture the real complexity of such dynamics, we propose a novel model of the coevolution of epidemic and awareness spreading processes on a multiplex network, also introducing a preventive isolation strategy. Our aim is to evaluate and quantify the joint impact of heterogeneity and awareness, under different socioeconomic conditions. Considering, as a case study, an emerging public health threat, Zika virus, we introduce a data-driven analysis by exploiting multiple sources and different types of data, ranging from Big Five personality traits to Google Trends, related to different countries where there is an ongoing epidemic outbreak. Our findings demonstrate how the proposed model allows delaying the epidemic outbreak and increasing the resilience of nodes, especially under critical economic conditions. Simulation results, using a data-driven approach on Zika virus, a topic of growing scientific research interest, are consistent with the proposed analytic model.
Affiliation(s)
- Marialisa Scatà
  - University of Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Catania, 95125, Italy
- Alessandro Di Stefano
  - University of Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Catania, 95125, Italy
- Pietro Liò
  - University of Cambridge, Computer Laboratory, Cambridge, CB3 0FD, UK
- Aurelio La Corte
  - University of Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica, Catania, 95125, Italy
|
48
|
Network regularised Cox regression and multiplex network models to predict disease comorbidities and survival of cancer. Comput Biol Chem 2015; 59 Pt B:15-31. [DOI: 10.1016/j.compbiolchem.2015.08.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 06/13/2015] [Revised: 08/21/2015] [Accepted: 08/25/2015] [Indexed: 12/17/2022]
|
49
|
Garcia-Albornoz M, Nielsen J. Finding directionality and gene-disease predictions in disease associations. BMC SYSTEMS BIOLOGY 2015; 9:35. [PMID: 26168918 PMCID: PMC4501277 DOI: 10.1186/s12918-015-0184-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/25/2014] [Accepted: 06/30/2015] [Indexed: 01/04/2023]
Abstract
BACKGROUND Understanding the underlying molecular mechanisms in human diseases is important for diagnosis and treatment of complex conditions and has traditionally been done by establishing associations between disorder-genes and their associated diseases. This kind of network analysis usually includes only the interaction of molecular components and shared genes. The present study offers a network and association analysis within a bioinformatics framework involving the integration of HUGO Gene Nomenclature Committee approved gene symbols, KEGG metabolic pathways and ICD-10-CM codes for the analysis of human diseases based on the level of inclusion and hypergeometric enrichment between genes and metabolic pathways shared by the different human disorders. METHODS The present study offers the integration of HGNC approved gene symbols, KEGG metabolic pathways and ICD-10-CM codes for the analysis of associations based on the level of inclusion and hypergeometric enrichment between genes and metabolic pathways shared by different diseases. RESULTS 880 unique ICD-10-CM codes were mapped to the 4315 OMIM phenotypes and 3083 genes with phenotype-causing mutations. From this, a total of 705 ICD-10-CM codes were linked to 1587 genes with phenotype-causing mutations and 801 KEGG pathways, creating a tripartite network composed of 15,455 code-gene-pathway interactions. These associations were further used for an inclusion analysis between diseases along with gene-disease predictions based on a hypergeometric enrichment methodology. CONCLUSIONS The results demonstrate that even though a large number of genes and metabolic pathways are shared between diseases of the same categories, inclusion levels between these genes and pathways are directional and independent of the disease classification. However, the gene-disease-pathway associations can be used for prediction of new gene-disease interactions that will be useful in drug discovery and therapeutic applications.
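The hypergeometric enrichment mentioned in this abstract can be sketched as a one-sided tail probability: the chance of seeing at least k pathway genes among a disease's genes if genes were drawn at random. This is the usual form of such a test, and the abstract does not state the authors' exact formulation. The function name, parameter names and numbers below are illustrative assumptions.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N items without replacement from M items,
    n of which are 'successes' (here: genes annotated to a pathway)."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical numbers: 20 genes in the universe, 5 in the pathway,
# 4 linked to the disease, 3 of those 4 in the pathway.
p_value = hypergeom_sf(k=3, M=20, n=5, N=4)
```

A small tail probability suggests the disease's genes are over-represented in the pathway relative to chance. In practice the values would be corrected for multiple testing across pathways.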
Affiliation(s)
- Manuel Garcia-Albornoz
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Göteborg, Sweden
- Jens Nielsen
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Göteborg, Sweden
|
50
|
Moni MA, Liò P. How to build personalized multi-omics comorbidity profiles. Front Cell Dev Biol 2015; 3:28. [PMID: 26157799 PMCID: PMC4478898 DOI: 10.3389/fcell.2015.00028] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 09/30/2014] [Accepted: 04/27/2015] [Indexed: 12/19/2022] Open
Abstract
Multiple diseases (acute or chronic events) often occur together in a patient, a phenomenon referred to as disease comorbidity, because of the multi-way associations among diseases. Due to shared genetic, molecular, environmental, and lifestyle-based risk factors, many diseases are comorbid in the same patient. Methods for integrating multiple types of omics data play an important role in identifying integrative biomarkers for stratification of patients into groups with different clinical outcomes. Moreover, integrated omics and clinical information may potentially improve prediction accuracy of disease comorbidities. However, there is a lack of effective and efficient bioinformatics and statistical software for true integrative data analysis. With the widespread availability of large-scale omics, phenotype and ontology information, it is becoming more and more practical to help doctors in clinical diagnostics and comorbidity prediction by providing appropriate software tools. We developed an R software package, POGO, to compute novel estimators of disease comorbidity risks and patient stratification. Starting from the initial diagnosis and the omics and clinical data of a patient, the software identifies the association risk of disease comorbidities. The input of this software is the initial diagnosis of a patient and the output provides evidence of disease comorbidities. The functions of POGO offer flexibility for diagnostic applications to predict disease comorbidities, and can be easily integrated into high-throughput and clinical data analysis pipelines. POGO is compliant with the Bioconductor standard and it is freely available at www.cl.cam.ac.uk/~mam211/POGO/.
Affiliation(s)
- Mohammad Ali Moni
  - Computer Laboratory, University of Cambridge, Cambridge, UK
  - Department of Computer Science and Engineering, Pabna University of Science and Technology, Pabna, Bangladesh
  - Bone Biology, Garvan Institute of Medical Research, The University of New South Wales, Sydney, NSW, Australia
- Pietro Liò
  - Computer Laboratory, University of Cambridge, Cambridge, UK
|