1
Saranya KR, Vimina ER, Pinto FR. TransNeT-CGP: A cluster-based comorbid gene prioritization by integrating transcriptomics and network-topological features. Comput Biol Chem 2024; 110:108038. [PMID: 38461796 DOI: 10.1016/j.compbiolchem.2024.108038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 01/11/2024] [Accepted: 02/25/2024] [Indexed: 03/12/2024]
Abstract
The local disruptions caused by the genes of one disease can influence the pathways associated with other diseases, resulting in comorbidity. For gene therapies, it is necessary to prioritize the key genes that regulate common biological mechanisms in order to tackle the issues caused by overlapping diseases. This work proposes a clustering-based computational approach for prioritizing comorbid genes within overlapping disease modules by analyzing Protein-Protein Interaction networks. For this, a sub-network with the gene interactions of the disease pair was extracted from the interactome. The edge weights are assigned by combining the pairwise gene expression correlation and betweenness centrality scores. Further, a weighted graph clustering algorithm is applied, and the dominant nodes of high-density clusters are ranked based on clustering coefficients and neighborhood connectivity. Case studies based on neurodegenerative diseases, such as the Amyotrophic Lateral Sclerosis-Spinal Muscular Atrophy (ALS-SMA) pair, and cancers, such as the Ovarian Carcinoma-Invasive Ductal Breast Carcinoma (OC-IDBC) pair, were conducted to examine the efficacy of the proposed method. To identify the mechanistic role of top-ranked genes, we used functional and pathway enrichment analysis, connectivity analysis with the leave-one-out (LOO) method, analysis of associated disease-related protein complexes, and prioritization tools such as TOPPGENE and Heml2.0. From pathway analysis, it was observed that the top 10 genes obtained using the proposed method were associated with 10 pathways in ALS-SMA comorbidity and 15 in the case of OC-IDBC, whereas the corresponding counts for the similar methods SAPDSB and S2B were 4 and 6, respectively, for ALS-SMA and 9 and 10, respectively, for OC-IDBC. In both case studies, 70% of the disease-specific benchmark protein complexes were linked to top-ranked genes of the proposed method, compared with 55% and 60% for SAPDSB and S2B, respectively.
Additionally, it was found that removing the top 10 genes disconnects the network into 14 distinct components in the case of ALS-SMA and 9 in the case of OC-IDBC. The experimental results show that the proposed method can be effectively used for identifying key genes in comorbidity and can offer insights into the intricate molecular relationships driving comorbid diseases.
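The connectivity check mentioned in this abstract (deleting the top-ranked genes and counting how many components the network splits into) can be sketched in a few lines. This is a minimal plain-Python illustration under stated assumptions, not the authors' implementation: the function name `count_components_after_removal`, the toy edge list, and the node labels are all hypothetical.

```python
from collections import defaultdict, deque


def count_components_after_removal(edges, removed):
    """Count connected components left after deleting `removed` nodes.

    `edges` is an iterable of (u, v) pairs describing an undirected graph;
    `removed` is a collection of nodes to delete (e.g. top-ranked genes).
    """
    removed = set(removed)
    adjacency = defaultdict(set)
    nodes = set()
    for u, v in edges:
        nodes.update((u, v))
        if u in removed or v in removed:
            continue  # drop every edge that touches a removed node
        adjacency[u].add(v)
        adjacency[v].add(u)

    remaining = nodes - removed
    seen = set()
    components = 0
    for start in remaining:
        if start in seen:
            continue
        # Breadth-first search marks everything reachable from `start`.
        components += 1
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
    return components


# Toy network (hypothetical labels): hub "H" links three otherwise separate pairs.
toy_edges = [("H", "A"), ("A", "B"), ("H", "C"), ("C", "D"), ("H", "E"), ("E", "F")]
print(count_components_after_removal(toy_edges, removed=[]))     # 1
print(count_components_after_removal(toy_edges, removed=["H"]))  # 3
```

In the toy graph, removing the single hub splits one component into three; the abstract reports the analogous counts (14 and 9) for the real disease sub-networks.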
Affiliation(s)
- K R Saranya
  - Department of Computer Science & IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
- E R Vimina
  - Department of Computer Science & IT, School of Computing, Amrita Vishwa Vidyapeetham, Kochi Campus, India.
- F R Pinto
  - Chemistry and Biochemistry Department, Faculty of Sciences, University of Lisbon, Portugal.

2
Shang J, Zhu X, Sun Y, Li F, Kong X, Liu JX. DM-MOGA: a multi-objective optimization genetic algorithm for identifying disease modules of non-small cell lung cancer. BMC Bioinformatics 2023; 24:13. [PMID: 36624376 PMCID: PMC9830734 DOI: 10.1186/s12859-023-05136-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Accepted: 01/04/2023] [Indexed: 01/11/2023] Open
Abstract
BACKGROUND Constructing molecular interaction networks from microarray data and then identifying disease module biomarkers can provide insight into the underlying pathogenic mechanisms of non-small cell lung cancer. A promising approach for identifying disease modules in the network is community detection. RESULTS In order to identify disease modules from gene co-expression networks, a community detection method is proposed based on multi-objective optimization genetic algorithm with decomposition. The method is named DM-MOGA and possesses two highlights. First, the boundary correction strategy is designed for the modules obtained in the process of local module detection and pre-simplification. Second, during the evolution, we introduce Davies-Bouldin index and clustering coefficient as fitness functions which are improved and migrated to weighted networks. In order to identify modules that are more relevant to diseases, the above strategies are designed to consider the network topology of genes and the strength of connections with other genes at the same time. Experimental results of different gene expression datasets of non-small cell lung cancer demonstrate that the core modules obtained by DM-MOGA are more effective than those obtained by several other advanced module identification methods. CONCLUSIONS The proposed method identifies disease-relevant modules by optimizing two novel fitness functions to simultaneously consider the local topology of each gene and its connection strength with other genes. The association of the identified core modules with lung cancer has been confirmed by pathway and gene ontology enrichment analysis.
Affiliation(s)
- Junliang Shang
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Xuhui Zhu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Yan Sun
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Feng Li
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Xiangzhen Kong
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826 China

3
Baruah B, Dutta MP, Bhattacharyya DK. Identification of ESCC potential biomarkers using biclustering algorithms. Gene Reports 2022. [DOI: 10.1016/j.genrep.2022.101563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]

4
Taylor LW, French JE, Robbins ZG, Nylander-French LA. Epigenetic Markers Are Associated With Differences in Isocyanate Biomarker Levels in Exposed Spray-Painters. Front Genet 2021; 12:700636. [PMID: 34335698 PMCID: PMC8318037 DOI: 10.3389/fgene.2021.700636] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/26/2021] [Accepted: 06/21/2021] [Indexed: 12/30/2022] Open
Abstract
Isocyanates are respiratory and skin sensitizers that are one of the main causes of occupational asthma globally. Genetic and epigenetic markers are associated with isocyanate-induced asthma and, before asthma develops, we have shown that genetic polymorphisms are associated with variation in plasma and urine biomarker levels in exposed workers. Inter-individual epigenetic variance may also have a significant role in the observed biomarker variability following isocyanate exposure. Therefore, we determined the percent methylation for CpG islands from DNA extracted from mononuclear blood cells of 24 male spray-painters exposed to 1,6-hexamethylene diisocyanate (HDI) monomer and HDI isocyanurate. Spray-painters’ personal inhalation and skin exposure to these compounds and the respective biomarker levels of 1,6-diaminohexane (HDA) and trisaminohexyl isocyanurate (TAHI) in their plasma and urine were measured during three repeated industrial hygiene monitoring visits. We controlled for inhalation exposure, skin exposure, age, smoking status, and ethnicity as covariates and performed an epigenome-wide association study (EWAS) using likelihood-ratio statistical modeling. We identified 38 CpG markers associated with differences in isocyanate biomarker levels (Bonferroni < 0.05). Annotations for these markers included 18 genes: ALG1, ANKRD11, C16orf89, CHD7, COL27A, FUZ, FZD9, HMGN1, KRT6A, LEPR, MAPK10, MED25, NOSIP, PKD1, SNX19, UNC13A, UROS, and ZFHX3. We explored the functions of the genes that have been published in the literature and used GeneMANIA to investigate gene ontologies and predicted protein-interaction networks. The protein functions of the predicted networks include keratinocyte migration, cell–cell adhesions, calcium transport, neurotransmitter release, nitric oxide production, and apoptosis regulation. 
Many of the protein pathway functions overlap with previous findings on genetic markers associated with variability both in isocyanate biomarker levels and asthma susceptibility, which suggests there are overlapping protein pathways that contribute to both isocyanate toxicokinetics and toxicodynamics. These predicted protein networks can inform future research on the mechanism of allergic airway sensitization by isocyanates and aid in the development of mitigation strategies to better protect worker health.
Affiliation(s)
- Laura W Taylor
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- John E French
  - Nutrition Research Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Zachary G Robbins
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
- Leena A Nylander-French
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States

5
Mahapatra S, Bhuyan R, Das J, Swarnkar T. Integrated multiplex network based approach for hub gene identification in oral cancer. Heliyon 2021; 7:e07418. [PMID: 34258466 PMCID: PMC8258848 DOI: 10.1016/j.heliyon.2021.e07418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2020] [Revised: 01/27/2021] [Accepted: 06/23/2021] [Indexed: 02/01/2023] Open
Abstract
Background: The incidence of Oral Cancer (OC) is high in Asian countries, and the disease often goes undetected at its early stage. The study of genetics, especially genetic networks, holds great promise in this endeavor. Hub genes in a genetic network are prominent in regulating the whole network structure of genes. Thus, identification of such genes related to specific cancer types can help in reducing the gap in OC prognosis. Methods: Traditional study of network biology is unable to decipher the inter-dependencies within and across diverse biological networks. A multiplex network provides a powerful representation of such systems and encodes much richer information than isolated networks. In this work, we focused on the entire multiplex structure of the genetic network, integrating the gene expression profile and DNA methylation profile for OC. Further, hub genes were identified by considering their connectivity in the multiplex structure and in the respective protein-protein interaction (PPI) network as well. Results: 46 hub genes were inferred in our approach with a high prediction accuracy (96%), an outstanding Matthews correlation coefficient value (93%) and significant biological implications. Among them, the genes PIK3CG, PIK3R5, MYH7, CDC20 and CCL4 were differentially expressed and predominantly enriched in molecular cascades specific to OC. Conclusions: The hub genes identified in this work carry ontological signatures specific to cancer, which may further facilitate improved understanding of the tumorigenesis process and the underlying molecular events. The results indicate the effectiveness of our integrated multiplex network approach for hub gene identification. This work opens an innovative research route for multi-omics biological data analysis.
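For readers unfamiliar with the Matthews correlation coefficient cited in this abstract, it is computed from the four cells of a binary confusion matrix. The sketch below is a plain-Python illustration only; the function name `matthews_corrcoef_from_counts` and the example counts are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for illustration only.
print(matthews_corrcoef_from_counts(tp=50, tn=50, fp=0, fn=0))    # 1.0
print(matthews_corrcoef_from_counts(tp=25, tn=25, fp=25, fn=25))  # 0.0
```

The coefficient ranges from -1 to 1; the abstract quotes it as a percentage, which presumably corresponds to a value of about 0.93 on this scale.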
Affiliation(s)
- S. Mahapatra
  - Department of Computer Application, Siksha O Anusandhan Deemed to be University, Bhubaneswar, India
- R. Bhuyan
  - Department of Oral Pathology & Microbiology, Siksha O Anusandhan Deemed to be University, Bhubaneswar, India
- J. Das
  - Centre for Genomics & Biomedical Informatics, Siksha O Anusandhan Deemed to be University, Bhubaneswar, India
- T. Swarnkar
  - Department of Computer Application, Siksha O Anusandhan Deemed to be University, Bhubaneswar, India

6
Wu J, Xia X, Hu Y, Fang X, Orsulic S. Identification of Infertility-Associated Topologically Important Genes Using Weighted Co-expression Network Analysis. Front Genet 2021; 12:580190. [PMID: 33613630 PMCID: PMC7887323 DOI: 10.3389/fgene.2021.580190] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/06/2020] [Accepted: 01/04/2021] [Indexed: 12/19/2022] Open
Abstract
Endometriosis has been associated with a high risk of infertility. However, the underlying molecular mechanism of infertility in endometriosis remains poorly understood. In our study, we aimed to discover topologically important genes related to infertility in endometriosis, based on network structure mining. We used microarray data from the Gene Expression Omnibus (GEO) database to construct a weighted gene co-expression network for fertile and infertile women with endometriosis and to identify gene modules highly correlated with clinical features of infertility in endometriosis. Additionally, protein–protein interaction network analysis was used to identify 20 potential hub messenger RNAs (mRNAs), while network topological analysis was used to identify nine candidate long non-coding RNAs (lncRNAs). Functional annotations of clinically significant modules and lncRNAs revealed that hub genes might be involved in infertility in endometriosis by regulating G protein-coupled receptor (GPCR) signaling activity. Gene Set Enrichment Analysis showed that the phospholipase C-activating GPCR signaling pathway is correlated with infertility in patients with endometriosis. Taken together, our analysis has identified 29 hub genes which might lead to infertility in endometriosis through the regulation of the GPCR network.
Affiliation(s)
- Jingni Wu
  - Department of Obstetrics and Gynecology, The Second Xiangya Hospital, Central South University, Changsha, China
  - Department of Obstetrics and Gynecology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
- Xiaomeng Xia
  - Department of Obstetrics and Gynecology, The Second Xiangya Hospital, Central South University, Changsha, China
- Ye Hu
  - Department of Obstetrics and Gynecology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States
- Xiaoling Fang
  - Department of Obstetrics and Gynecology, The Second Xiangya Hospital, Central South University, Changsha, China
- Sandra Orsulic
  - Department of Obstetrics and Gynecology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, United States

7
Perscheid C. Integrative biomarker detection on high-dimensional gene expression data sets: a survey on prior knowledge approaches. Brief Bioinform 2020; 22:5881664. [PMID: 32761115 DOI: 10.1093/bib/bbaa151] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/17/2020] [Revised: 06/15/2020] [Accepted: 06/16/2020] [Indexed: 02/06/2023] Open
Abstract
Gene expression data provide the expression levels of tens of thousands of genes from several hundred samples. These data are analyzed to detect biomarkers that can be of prognostic or diagnostic use. Traditionally, biomarker detection for gene expression data is the task of gene selection. The vast number of genes is reduced to a few relevant ones that achieve the best performance for the respective use case. Traditional approaches select genes based on their statistical significance in the data set. This results in issues of robustness, redundancy and true biological relevance of the selected genes. Integrative analyses typically address these shortcomings by integrating multiple data artifacts from the same objects, e.g. gene expression and methylation data. When only gene expression data are available, integrative analyses instead use curated information on biological processes from public knowledge bases. With knowledge bases providing an ever-increasing amount of curated biological knowledge, such prior knowledge approaches become more powerful. This paper provides a thorough overview on the status quo of biomarker detection on gene expression data with prior biological knowledge. We discuss current shortcomings of traditional approaches, review recent external knowledge bases, provide a classification and qualitative comparison of existing prior knowledge approaches and discuss open challenges for this kind of gene selection.
Affiliation(s)
- Cindy Perscheid
  - Hasso Plattner Institute, University of Potsdam, Potsdam, 14482, Germany

8
Singh P, Rai A, Dohare R, Arora S, Ali S, Parveen S, Syed MA. Network-based identification of signature genes KLF6 and SPOCK1 associated with oral submucous fibrosis. Mol Clin Oncol 2020; 12:299-310. [PMID: 32190310 PMCID: PMC7058035 DOI: 10.3892/mco.2020.1991] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/22/2019] [Accepted: 10/08/2019] [Indexed: 12/16/2022] Open
Abstract
The molecular mechanism of oral submucous fibrosis (OSF) is yet to be fully elucidated. The identification of reliable signature genes to screen patients with a high risk of OSF and to provide oral cancer surveillance is therefore required. The present study produced a filtering criterion based on network characteristics and principal component analysis, and identified the genes that were involved in OSF prognosis. Two gene expression datasets were analyzed using meta-analysis, the results of which revealed 1,176 biologically significant genes. A co-expression network was subsequently constructed and weighted gene modules were detected. The pathway and functional enrichment analyses of the present study allowed for the identification of modules 1 and 2, and their respective genes, SPARC (osteonectin), cwcv and kazal like domain proteoglycan 1 (SPOCK1) and kruppel like factor 6 (KLF6), which were involved in the occurrence of OSF. The results revealed that both genes had a prominent role in epithelial to mesenchymal transition during OSF progression. The genes identified in the present study require further exploration and validation within clinical settings to determine their roles in OSF.
Affiliation(s)
- Prithvi Singh
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India
- Arpita Rai
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India
  - Department of Oral Medicine and Radiology, Faculty of Dentistry, Jamia Millia Islamia, New Delhi 110025, India
- Ravins Dohare
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India
- Shweta Arora
  - Translational Research Lab, Department of Biotechnology, Faculty of Natural Sciences, Jamia Millia Islamia, New Delhi 110025, India
- Sher Ali
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India
- Shama Parveen
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, New Delhi 110025, India
- Mansoor Ali Syed
  - Translational Research Lab, Department of Biotechnology, Faculty of Natural Sciences, Jamia Millia Islamia, New Delhi 110025, India

9
Shi J, Zhang P, Liu L, Min X, Xiao Y. Weighted gene coexpression network analysis identifies a new biomarker of CENPF for prediction disease prognosis and progression in nonmuscle invasive bladder cancer. Mol Genet Genomic Med 2019; 7:e982. [PMID: 31566930 PMCID: PMC6825849 DOI: 10.1002/mgg3.982] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/01/2019] [Revised: 07/23/2019] [Accepted: 08/29/2019] [Indexed: 11/08/2022] Open
Abstract
BACKGROUND The dreadful prognosis of nonmuscle invasive bladder cancer (NMIBC) mainly results from the delay in recognizing individuals with a high risk of progression. Thus, the emphasis of this work lies in developing valuable biomarkers that are conducive to accurately predicting the progression of NMIBC. METHODS Microarray data from GSE32894, including 209 NMIBC samples, were analyzed by weighted gene coexpression network analysis (WGCNA), which can find modules of highly correlated genes and relate modules to external sample traits. In addition, we constructed a protein-protein interaction network to facilitate screening of the hub gene. Finally, we used RNA-seq and microarray data and clinical information from ArrayExpress (E-MTAB-4321) and GSE13507 to select and validate the candidate gene. RESULTS In the current paper, the blue module among the 13 gene coexpression clusters we identified was selected as the key module. Seven genes, namely CDCA8, CENPF, MCM6, MELK, PRC1, STIL, and TPX2, were identified as candidate genes. Notably, among them, only elevated CENPF in NMIBC tissue was closely associated with low progression-free survival (PFS) and overall survival (OS) rates in three datasets and had a large area under the receiver operating characteristic (ROC) curve. Finally, CENPF was identified as an effective biomarker in NMIBC. CONCLUSION Therefore, our findings present a new progression and prognostic molecular marker and therapeutic target for NMIBC. Moreover, these genes, which deserve further research, may improve the understanding of the occurrence and development of superficial bladder cancer.
Affiliation(s)
- Jiawei Shi
  - Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Pu Zhang
  - Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Lilong Liu
  - Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xiaobo Min
  - Department of Hepatology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Yajun Xiao
  - Department of Urology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China