1
Hanna EM, El Hasbani G, Azar D. Ant colony optimization for the identification of dysregulated gene subnetworks from expression data. BMC Bioinformatics 2024; 25:254. [PMID: 39090538 PMCID: PMC11295523 DOI: 10.1186/s12859-024-05871-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 07/12/2024] [Indexed: 08/04/2024] Open
Abstract
BACKGROUND High-throughput experimental technologies can provide deeper insights into pathway perturbations in biomedical studies. Accordingly, their usage is central to the identification of molecular targets and the subsequent development of suitable treatments for various diseases. Classical interpretations of generated data, such as differential gene expression and pathway analyses, disregard interconnections between studied genes when looking for gene-disease associations. Given that these interconnections are central to cellular processes, there has been a recent interest in incorporating them in such studies. The latter allows the detection of gene modules that underlie complex phenotypes in gene interaction networks. Existing methods either impose radius-based restrictions or freely grow modules at the expense of a statistical bias towards large modules. We propose a heuristic method, inspired by Ant Colony Optimization, to apply gene-level scoring and module identification with distance-based search constraints and penalties, rather than radius-based constraints. RESULTS We test and compare our results to other approaches using three datasets of different neurodegenerative diseases, namely Alzheimer's, Parkinson's, and Huntington's, over three independent experiments. We report the outcomes of enrichment analyses and concordance of gene-level scores for each disease. Results indicate that the proposed approach generally shows superior stability in comparison to existing methods. It produces stable and meaningful enrichment results in all three datasets which have different case to control proportions and sample sizes. CONCLUSION The presented network-based gene expression analysis approach successfully identifies dysregulated gene modules associated with a certain disease. Using a heuristic based on Ant Colony Optimization, we perform a distance-based search with no radius constraints. 
Experimental results support the effectiveness and stability of our method in prioritizing modules of high relevance. Our tool is publicly available at github.com/GhadiElHasbani/ACOxGS.git.
Affiliation(s)
- Eileen Marie Hanna
  - Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
- Ghadi El Hasbani
  - Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
- Danielle Azar
  - Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
2
He A, Yip KC, Lu D, Liu J, Zhang Z, Wang X, Liu Y, Wei Y, Zhang Q, Yan R, Gao F, Li R. Construction of a pathway-level model for preeclampsia based on gene expression data. Hypertens Res 2024:10.1038/s41440-024-01753-0. [PMID: 38914704 DOI: 10.1038/s41440-024-01753-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 05/17/2024] [Accepted: 05/28/2024] [Indexed: 06/26/2024]
Abstract
Preeclampsia (PE) is a heterogeneous disease that seriously affects the health of mothers and fetuses. Because detection assays are lacking, its diagnosis and intervention are often delayed when the clinical symptoms are atypical. Using personalized pathway-based analysis and machine learning algorithms, we built a PE diagnosis model consisting of nine core pathways using multiple cohorts from the Gene Expression Omnibus database. The model showed an area under the receiver operating characteristic curve (AUROC) of 0.959 with the data from the placental tissue samples in the development cohort. In the two validation cohorts, the AUROCs were 0.898 and 0.876, respectively. The model also performed well with the maternal plasma data in another validation cohort (AUROC: 0.815). Moreover, we identified tyrosine-protein kinase Lck (LCK) as the hub gene in this model and found that LCK and pLCK proteins were downregulated in placentas from PE patients. The pathway-level model for PE can provide a novel direction for developing molecular diagnostic assays and investigating potential mechanisms of PE in future studies.
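The AUROC values quoted in this abstract can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. A minimal sketch of that rank-based definition follows; the function name `auroc` and the scores and labels are made-up illustration data, not values or code from the study.

```python
def auroc(scores, labels):
    """Rank-based AUROC: fraction of (case, control) pairs ranked correctly.

    scores: model outputs (higher = more likely a case).
    labels: 1 for case, 0 for control. Ties count as half a correct pair.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for two cases (label 1) and two controls (label 0).
print(auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))  # 3 of 4 pairs correct
```

The pairwise form is quadratic in sample size, which is fine for illustration; sorting-based implementations compute the same quantity faster.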
Affiliation(s)
- Andong He
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
- Ka Cheuk Yip
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
- Daiqiang Lu
  - Institute of Molecular and Medical Virology, School of Medicine, Jinan University, Guangzhou, 510632, China
- Jia Liu
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
- Zunhao Zhang
  - Department of Pathology, The First Affiliated Hospital of Jinan University, Guangzhou, 510630, China
- Xiufang Wang
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
- Yifeng Liu
  - Institute of Molecular and Medical Virology, School of Medicine, Jinan University, Guangzhou, 510632, China
- Yiling Wei
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
- Qiao Zhang
  - Institute of Molecular and Medical Virology, School of Medicine, Jinan University, Guangzhou, 510632, China
- Ruiling Yan
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
- Feng Gao
  - Institute of Molecular and Medical Virology, School of Medicine, Jinan University, Guangzhou, 510632, China
- Ruiman Li
  - Department of Obstetrics and Gynecology, Jinan University First Affiliated Hospital, Guangzhou, 510630, China
3
Wei L, Xin Y, Pu M, Zhang Y. Patient-specific analysis of co-expression to measure biological network rewiring in individuals. Life Sci Alliance 2024; 7:e202302253. [PMID: 37977656 PMCID: PMC10656351 DOI: 10.26508/lsa.202302253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 11/04/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023] Open
Abstract
To effectively understand the underlying mechanisms of disease and inform the development of personalized therapies, it is critical to harness the power of differential co-expression (DCE) network analysis. Despite the promise of DCE network analysis in precision medicine, current approaches have a major limitation: they measure an average differential network across multiple samples, which means the specific etiology of individual patients is often overlooked. To address this, we present Cosinet, a DCE-based single-sample network rewiring degree quantification tool. By analyzing two breast cancer datasets, we demonstrate that Cosinet can identify important differences in gene co-expression patterns between individual patients and generate scores for each individual that are significantly associated with overall survival, recurrence-free interval, and other clinical outcomes, even after adjusting for risk factors such as age, tumor size, HER2 status, and PAM50 subtypes. Cosinet represents a remarkable development toward unlocking the potential of DCE analysis in the context of precision medicine.
Affiliation(s)
- Lanying Wei
  - Beijing StoneWise Technology Co Ltd, Danling SOHO, Beijing, China
- Yucui Xin
  - Beijing StoneWise Technology Co Ltd, Danling SOHO, Beijing, China
- Mengchen Pu
  - Beijing StoneWise Technology Co Ltd, Danling SOHO, Beijing, China
- Yingsheng Zhang
  - Beijing StoneWise Technology Co Ltd, Danling SOHO, Beijing, China
4
Forghani-Ramandi MM, Mostafavi B, Bahavar A, Dehghankar M, Siami Z, Mozhgani SH. Illuminating (HTLV-1)-induced adult T-cell leukemia/lymphoma transcriptomic signature: A systems virology approach. Virus Res 2023; 338:199237. [PMID: 37832654 PMCID: PMC10618755 DOI: 10.1016/j.virusres.2023.199237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 10/01/2023] [Accepted: 10/04/2023] [Indexed: 10/15/2023]
Abstract
BACKGROUND Adult T-cell leukemia/lymphoma (ATLL) is a poor prognosis malignancy of peripheral T-cells caused by human T-cell leukemia virus type 1 (HTLV-1). The low survival rates observed in the patients are the result of the lack of sufficient knowledge about the disease pathogenesis. METHODS In the present study, we first identified differentially expressed genes in ATLL patients and the cellular signaling pathways affected by them. Then, genes of these pathways were subjected to more comprehensive evaluations, including WGCNA and module validation studies on five external datasets. Finally, potential biomarkers were selected for qRT-PCR validation. RESULTS Thirteen signaling pathways, including Apoptosis, Human T-cell leukemia virus 1 infection, IL-17 signaling pathway, pathways in cancer, T cell receptor signaling pathway, Th1 and Th2 cell differentiation, and seven others were selected for deeper investigations. Results of our in-depth bioinformatics evaluations, highlighted pathways related to regulation of immune responses, T-cell receptor and activation, regulation of cell signaling receptors and messengers, Wnt signaling pathway, and apoptosis as key players in ATLL pathogenesis. MAPK3, PIK3CD, KRAS, NFKB1, TNF, PLCB3, PLCB2, PLCB1, MAPK11, JUN, ITPR1, ADCY1, GNAQ, ADCY3, ADCY4, CHEK1, CCND1, SOS2, BAX, FOS and GNA12 were identified as possible biomarkers. Upregulation of ADCY1 and ADCY3 genes was confirmed via qRT-PCR. CONCLUSIONS In this study, we performed a deep bioinformatic examination on a limited set of genes with high probabilities of involvement in the pathogenesis of ATLL. Our results highlighted signaling pathways and genes with potential key roles in disease formation and resistance against current treatment strategies. Further studies are required to test the possible benefits of highlighted genes as biomarkers and targets of treatment.
Affiliation(s)
- Behnam Mostafavi
  - Biomedical Engineering Department, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Tehran, Iran; Department of Microbiology, School of Medicine, Alborz University of Medical Sciences, Karaj, Iran
- Atefeh Bahavar
  - Department of Microbiology, School of Medicine, Golestan University of Medical Sciences, Gorgan, Iran
- Maryam Dehghankar
  - Student Research Committee, School of Medicine, Alborz University of Medical Sciences, Karaj, Iran
- Zeinab Siami
  - Department Infectious Disease and Tropical Medicine, Ziaeian Hospital, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Department of Infectious Diseases, School of Medicine, Alborz University of Medical Sciences, Karaj, Iran
- Sayed-Hamidreza Mozhgani
  - Department of Microbiology, School of Medicine, Alborz University of Medical Sciences, Karaj, Iran; Non-Communicable Disease Research Center, Alborz University of Medical Sciences, Karaj, Iran
5
Song H, Wu MC. Limitation of permutation-based differential correlation analysis. Genet Epidemiol 2023; 47:637-641. [PMID: 37947279 PMCID: PMC10833089 DOI: 10.1002/gepi.22540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 09/22/2023] [Accepted: 10/26/2023] [Indexed: 11/12/2023]
Abstract
The comparison of biological systems, through the analysis of molecular changes under different conditions, has played a crucial role in the progress of modern biological science. Specifically, differential correlation analysis (DCA) has been employed to determine whether relationships between genomic features differ across conditions or outcomes. Because ascertaining the null distribution of test statistics to capture variations in correlation is challenging, several DCA methods utilize permutation which can loosen parametric (e.g., normality) assumptions. However, permutation is often problematic for DCA due to violating the assumption that samples are exchangeable under the null. Here, we examine the limitations of permutation-based DCA and investigate instances where the permutation-based DCA exhibits poor performance. Experimental results show that the permutation-based DCA often fails to control the type I error under the null hypothesis of equal correlation structures.
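For readers unfamiliar with the procedure this abstract critiques, a generic permutation test for differential correlation between two conditions can be sketched as below. This is an illustrative sketch, not the authors' code: the function names, the absolute-difference statistic, and the toy data are assumptions. The shuffling step is exactly where the exchangeability assumption enters: pooling and relabeling samples is only valid if, under the null, the two groups share the same joint distribution, not merely the same correlation.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corr_difference(group_a, group_b):
    """Absolute difference in correlation between two groups of (x, y) pairs."""
    xa, ya = zip(*group_a)
    xb, yb = zip(*group_b)
    return abs(pearson(xa, ya) - pearson(xb, yb))


def permutation_dca(group_a, group_b, n_perm=1000, seed=0):
    """Return (observed statistic, permutation p-value).

    Assumes samples are exchangeable across groups under the null, which is
    the assumption the abstract reports as often violated.
    """
    rng = random.Random(seed)
    observed = corr_difference(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel samples at random
        if corr_difference(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): correlation +1 in one condition, -1 in the other.
cond1 = [(i, i) for i in range(20)]
cond2 = [(i, -i) for i in range(20)]
stat, p_value = permutation_dca(cond1, cond2, n_perm=200)
print(stat, p_value)
```

If the two conditions differ in marginal variance but not in correlation, the same code can still return small p-values, which is one way the type I error inflation described above can arise.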
Affiliation(s)
- Hoseung Song
  - Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Michael C. Wu
  - Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
6
Zheng C, Wang M, Yamada R, Okada D. Delving into gene-set multiplex networks facilitated by a k-nearest neighbor-based measure of similarity. Comput Struct Biotechnol J 2023; 21:4988-5002. [PMID: 37867964 PMCID: PMC10589751 DOI: 10.1016/j.csbj.2023.09.042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 09/22/2023] [Accepted: 09/28/2023] [Indexed: 10/24/2023] Open
Abstract
Gene sets are functional units for living cells. Previously, limited studies investigated the complex relations among gene sets, but documents about their altering patterns across biological conditions still need to be prepared. In this study, we adopted and modified a classical k-nearest neighbor-based association function to detect inter-gene-set similarities. Based on this method, we built multiplex networks of gene sets for the first time; these networks contain layers of gene sets corresponding to different populations of cells. The context-based multiplex networks can capture meaningful biological variation and have considerable differences from knowledge-based networks of gene sets built on Jaccard similarity, as demonstrated in this study. Furthermore, at the scale of individual gene sets, the structural coefficients of gene sets (multiplex PageRank centrality, clustering coefficient, and participation coefficient) disclose the diversity of gene sets from the perspective of structural properties and make it easier to identify unique gene sets. In gene set enrichment analysis (GSEA), each gene set is treated independently, and its contextual and relational attributes are ignored. The structural coefficients of gene sets can supplement GSEA with information about the overall picture of gene sets, promoting the constructive reorganization of the enriched terms and helping researchers better prioritize and select gene sets.
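The knowledge-based baseline mentioned in this abstract, a gene-set network built on Jaccard similarity, depends only on shared membership and not on expression data. A minimal sketch of that baseline follows; the gene-set names, member genes, and the 0.2 threshold are hypothetical, and this is not the authors' k-nearest neighbor-based measure.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def jaccard_network(gene_sets, threshold=0.2):
    """Return {(set_u, set_v): similarity} for pairs at or above the threshold."""
    names = sorted(gene_sets)
    edges = {}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            s = jaccard(gene_sets[u], gene_sets[v])
            if s >= threshold:
                edges[(u, v)] = s
    return edges


# Hypothetical gene sets for illustration only.
sets = {
    "SET_A": {"TP53", "MDM2", "CDKN1A"},
    "SET_B": {"MDM2", "CDKN1A", "BAX"},
    "SET_C": {"ACTB", "GAPDH"},
}
print(jaccard_network(sets))  # only SET_A and SET_B share genes
```

Because the edges are fixed by annotation overlap, such a network is identical for every cell population, which is the contrast the abstract draws with context-based multiplex layers.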
Affiliation(s)
- Cheng Zheng
  - Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, South Research Bldg. No.1(5F), 53 Shogoinkawahara-cho, Sakyo-ku, Kyoto, 6068507, Kyoto, Japan
- Man Wang
  - Department of Signal Transduction, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, 5650871, Osaka, Japan
- Ryo Yamada
  - Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, South Research Bldg. No.1(5F), 53 Shogoinkawahara-cho, Sakyo-ku, Kyoto, 6068507, Kyoto, Japan
- Daigo Okada
  - Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, South Research Bldg. No.1(5F), 53 Shogoinkawahara-cho, Sakyo-ku, Kyoto, 6068507, Kyoto, Japan
7
Odaka M, Magnin M, Inoue K. Gene network inference from single-cell omics data and domain knowledge for constructing COVID-19-specific ICAM1-associated pathways. Front Genet 2023; 14:1250545. [PMID: 37719701 PMCID: PMC10501835 DOI: 10.3389/fgene.2023.1250545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 08/16/2023] [Indexed: 09/19/2023] Open
Abstract
Introduction: Intercellular adhesion molecule 1 (ICAM-1) is a critical molecule responsible for interactions between cells. Previous studies have suggested that ICAM-1 triggers cell-to-cell transmission of HIV-1 or HTLV-1, that SARS-CoV-2 shares several features with these viruses via interactions between cells, and that SARS-CoV-2 cell-to-cell transmission is associated with COVID-19 severity. From these previous arguments, it is assumed that ICAM-1 can be related to SARS-CoV-2 cell-to-cell transmission in COVID-19 patients. Indeed, the time-dependent change of the ICAM-1 expression level has been detected in COVID-19 patients. However, signaling pathways that consist of ICAM-1 and other molecules interacting with ICAM-1 are not identified in COVID-19. For example, the current COVID-19 Disease Map has no entry for those pathways. Therefore, discovering unknown ICAM1-associated pathways will be indispensable for clarifying the mechanism of COVID-19. Materials and methods: This study builds ICAM1-associated pathways by gene network inference from single-cell omics data and multiple knowledge bases. First, single-cell omics data analysis extracts coexpressed genes with significant differences in expression levels with spurious correlations removed. Second, knowledge bases validate the models. Finally, mapping the models onto existing pathways identifies new ICAM1-associated pathways. Results: Comparison of the obtained pathways between different cell types and time points reproduces the known pathways and indicates the following two unknown pathways: (1) upstream pathway that includes proteins in the non-canonical NF-κB pathway and (2) downstream pathway that contains integrins and cytoskeleton or motor proteins for cell transformation. Discussion: In this way, data-driven and knowledge-based approaches are integrated into gene network inference for ICAM1-associated pathway construction. 
The results can contribute to repairing and completing the COVID-19 Disease Map, thereby improving our understanding of the mechanism of COVID-19.
Affiliation(s)
- Mitsuhiro Odaka
  - The Graduate University for Advanced Studies, SOKENDAI, Tokyo, Japan
  - Principles of Informatics Research Division, National Institute of Informatics, Tokyo, Japan
  - Laboratoire des Sciences du Numérique de Nantes, École Centrale de Nantes, Nantes Université, UMR 6004, Nantes, France
  - Japan Society for the Promotion of Science, Tokyo, Japan
- Morgan Magnin
  - Principles of Informatics Research Division, National Institute of Informatics, Tokyo, Japan
  - Laboratoire des Sciences du Numérique de Nantes, École Centrale de Nantes, Nantes Université, UMR 6004, Nantes, France
- Katsumi Inoue
  - The Graduate University for Advanced Studies, SOKENDAI, Tokyo, Japan
  - Principles of Informatics Research Division, National Institute of Informatics, Tokyo, Japan
  - Laboratoire des Sciences du Numérique de Nantes, École Centrale de Nantes, Nantes Université, UMR 6004, Nantes, France
8
Liu Y, Darville T, Zheng X, Li Q. Decomposition of variation of mixed variables by a latent mixed Gaussian copula model. Biometrics 2023; 79:1187-1200. [PMID: 35304917 PMCID: PMC10019899 DOI: 10.1111/biom.13660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2021] [Accepted: 03/03/2022] [Indexed: 11/27/2022]
Abstract
Many biomedical studies collect data of mixed types of variables from multiple groups of subjects. Some of these studies aim to find the group-specific and the common variation among all these variables. Even though similar problems have been studied by some previous works, their methods mainly rely on the Pearson correlation, which cannot handle mixed data. To address this issue, we propose a latent mixed Gaussian copula (LMGC) model that can quantify the correlations among binary, ordinal, continuous, and truncated variables in a unified framework. We also provide a tool to decompose the variation into the group-specific and the common variation over multiple groups via solving a regularized M-estimation problem. We conduct extensive simulation studies to show the advantage of our proposed method over the Pearson correlation-based methods. We also demonstrate that by jointly solving the M-estimation problem over multiple groups, our method is better than decomposing the variation group by group. We also apply our method to a Chlamydia trachomatis genital tract infection study to demonstrate how it can be used to discover informative biomarkers that differentiate patients.
Affiliation(s)
- Yutong Liu
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Toni Darville
  - Department of Pediatrics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Xiaojing Zheng
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Pediatrics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Quefeng Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
9
Differential expression and cross-correlation between global regulator and pho regulon genes involved in decision-making under phosphate stress. J Appl Genet 2023; 64:173-183. [PMID: 36346581 DOI: 10.1007/s13353-022-00735-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2022] [Revised: 10/10/2022] [Accepted: 10/26/2022] [Indexed: 11/11/2022]
Abstract
The differential gene expression under phosphate stress conditions leads to cross-talk between the global regulator, pho regulon, and metabolic genes. Promoter activity analysis of the selected 23 genes reveals the dynamic nature of real-time gene expression under different phosphate conditions. The expression profiles of the global regulator (rpoD, soxR, soxS, arcB, and fur), pho regulon (phoH, phoR, phoB, and ugpB), and metabolic genes (sdh, pfkA, ldh) varied significantly on phosphate level variation. Under stress conditions, soxR switches expression partners and co-expresses with rpoS instead of soxS. The partner-switching behavior of the genes under a challenging environment represents the intelligence of functional execution and ensures cell survival. The dynamic expression profile of the selected genes applies a time-lagged correlation to provide insight into the differential gene interaction between time-shifted expression profiles. Under different phosphate conditions, the minimum spanning tree graph revealed a different clustering pattern of selected genes depending on the computed distance and its proximity to other promoters.
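The time-lagged correlation mentioned in this abstract compares one expression profile with a time-shifted copy of another. A minimal sketch follows, assuming two equally sampled profiles of the same length; the function names and the toy series are hypothetical and do not come from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def lagged_correlation(x, y, lag):
    """Correlation between x[t] and y[t + lag].

    A positive lag means y follows x; a negative lag means y leads x.
    Only the overlapping part of the two series is used.
    """
    if lag >= 0:
        xs, ys = x[:len(x) - lag], y[lag:]
    else:
        xs, ys = x[-lag:], y[:len(y) + lag]
    return pearson(xs, ys)


def best_lag(x, y, max_lag):
    """Lag in [-max_lag, max_lag] with the largest absolute correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: abs(lagged_correlation(x, y, k)))


# Toy profiles (hypothetical): y repeats x two time steps later.
x = [1, 3, 2, 5, 4, 8, 6, 9, 7, 12, 10, 11]
y = [0, 0] + x[:-2]
print(best_lag(x, y, max_lag=3))
```

A distance such as one minus the absolute lagged correlation could then feed a minimum spanning tree, though the abstract does not specify which distance was used.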
10
Jain N, Corken A, Arthur JM, Ware J, Arulprakash N, Dai J, Phadnis MA, Davis O, Rahmatallah Y, Mehta JL, Hedayati SS, Smyth S. Ticagrelor inhibits platelet aggregation and reduces inflammatory burden more than clopidogrel in patients with stages 4 or 5 chronic kidney disease. Vascul Pharmacol 2023; 148:107143. [PMID: 36682595 PMCID: PMC9998358 DOI: 10.1016/j.vph.2023.107143] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/27/2022] [Revised: 12/27/2022] [Accepted: 01/17/2023] [Indexed: 01/21/2023]
Abstract
BACKGROUND No study has compared pharmacologic properties of ticagrelor and clopidogrel in non-dialysis patients with stage 4-5 chronic kidney disease (CKD). METHODS We conducted a double-blind RCT to compare effects of ticagrelor and clopidogrel in 48 patients with CKD, with the primary outcome of ADP-induced platelet aggregation (WBPA) after 2 weeks of DAPT. In a parallel arm, we compared effects of 2 weeks of ticagrelor plus aspirin on mean changes in WBPA and markers of thromboinflammation among non-CKD controls (n = 26) with those of CKD patients in the ticagrelor arm. RESULTS Average age of the CKD patients was 53.7 years, with 62% women, 54% African American, and 42% with stage 5 CKD. Ticagrelor generated statistically lower WBPA values post treatment [median 0 Ω (IQR 0, 2)] vs. clopidogrel [median 0 Ω (IQR 0, 5)] (P = 0.002); percent inhibition of WBPA was greater (87 ± 22% vs. 63 ± 50%; P = 0.04); and plasma IL-6 levels were much lower (8.42 ± 1.73 pg/ml vs. 18.48 ± 26.56 pg/ml; P = 0.04). No differences in mean changes in WBPA between CKD-ticagrelor and control groups were observed. Ticagrelor-DAPT reduced levels of IL-1α and IL-1β in CKD-ticagrelor and control groups, attenuated lowering of TNFα and TRAIL levels in CKD-ticagrelor (vs controls), and produced global changes in correlation between various cytokines in a subgroup of CKD-ticagrelor subjects not on statins (n = 10). Peak/trough levels of ticagrelor/metabolite were not different between CKD-ticagrelor and control groups. CONCLUSIONS We report significant differences in platelet aggregation and anti-inflammatory properties between ticagrelor- and clopidogrel-based DAPT in non-dialysis people with stage 4-5 CKD. These notable inflammatory responses suggest ticagrelor-based DAPT might lower inflammatory burden of asymptomatic patients with stage 4 or 5 CKD. (clinicaltrials.gov # NCT03649711).
Affiliation(s)
- Nishank Jain
  - Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; Central Arkansas Veterans Health Care System, Little Rock, AR, United States of America
- Adam Corken
  - Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- John M Arthur
  - Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; Central Arkansas Veterans Health Care System, Little Rock, AR, United States of America
- Jerry Ware
  - Department of Physiology and Cell Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Narenraj Arulprakash
  - Department of Neurology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Junqiang Dai
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States of America
- Milind A Phadnis
  - Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, United States of America
- Otis Davis
  - Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Yasir Rahmatallah
  - Department of Bioinformatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- J L Mehta
  - Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; Central Arkansas Veterans Health Care System, Little Rock, AR, United States of America
- S Susan Hedayati
  - Department of Medicine, University of Texas Southwestern Medical Center, Dallas, TX, United States of America
- Susan Smyth
  - Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; Central Arkansas Veterans Health Care System, Little Rock, AR, United States of America
11
Cai M, Vesely A, Chen X, Li L, Goeman JJ. NetTDP: permutation-based true discovery proportions for differential co-expression network analysis. Brief Bioinform 2022; 23:6754043. [PMID: 36209415 DOI: 10.1093/bib/bbac417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 08/23/2022] [Accepted: 08/28/2022] [Indexed: 12/14/2022] Open
Abstract
Existing methods for differential network analysis could only infer whether two networks of interest have differences between two groups of samples, but could not quantify and localize network differences. In this work, a novel method, permutation-based Network True Discovery Proportions (NetTDP), is proposed to quantify the number of edges (correlations) or nodes (genes) for which the co-expression networks are different. In the NetTDP method, we propose an edge-level statistic and a node-level statistic, and detect true discoveries of edges and nodes in the sense of differential co-expression network, respectively, by the permutation-based sumSome method. Furthermore, the NetTDP method could further localize the differences by inferring the TDPs for edge or gene subsets of interest, which can be selected post hoc. Our NetTDP method allows inference on data-driven modules or biology-driven gene sets, and remains valid even when these sub-networks are optimized using the same data. Experimental results on both simulation data sets and five real data sets show the effectiveness of the proposed method in inferring the quantification and localization of differential co-expression networks. The R code is available at https://github.com/LiminLi-xjtu/NetTDP.
Affiliation(s)
- Menglan Cai
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xianning West, 710049, Shaanxi, China
- Anna Vesely
  - Department of Statistical Sciences, University of Padova, Italy
- Xu Chen
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands
- Limin Li
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xianning West, 710049, Shaanxi, China
- Jelle J Goeman
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Postbus 9600, 2300 RC Leiden, The Netherlands
12
Ni Y, He J, Chalise P. Integration of differential expression and network structure for 'omics data analysis. Comput Biol Med 2022; 150:106133. [PMID: 36179515 DOI: 10.1016/j.compbiomed.2022.106133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Revised: 08/23/2022] [Accepted: 09/18/2022] [Indexed: 11/25/2022]
Abstract
Differential expression (DE) analysis has been routinely used to identify molecular features that are statistically significantly different between distinct biological groups. In recent years, differential network (DN) analysis has emerged as a powerful approach to uncover molecular network structure changes from one biological condition to the other where the molecular features with larger topological changes are selected as biomarkers. Although a large number of DE and a few DN-based methods are available, they have been usually implemented independently. DE analysis ignores the relationship among molecular features while DN analysis does not account for the expression changes at individual level. Therefore, an integrative analysis approach that accounts for both DE and DN is required to identify disease associated key features. Although, a handful of methods have been proposed, there is no method that optimizes the combination of DE and DN. We propose a novel integrative analysis method, DNrank, to identify disease-associated molecular features that leverages the strengths of both DE and DN by calculating a weight using resampling based cross validation scheme within the algorithm. First, differential expression analysis of individual molecular features is carried out. Second, a differential network structure is constructed using the differential partial correlation analysis. Third, the molecular features are ranked in the order of their significances by integrating their DE measures and DN structure using the modified Google's PageRank algorithm. In the algorithm, the optimum combination of DE and DN analyses is achieved by evaluating the prediction performance of top-ranked features utilizing support vector machine classifier with Monte Carlo cross validation. The proposed method is illustrated using both simulated data and three real data sets. 
The results show that the proposed method has a better performance in identifying important molecular features with respect to predictive discrimination. Also, as compared to existing feature selection methods, the top-ranked features selected by our method had a higher stability in selection. DNrank allows the researchers to identify the disease-associated features by utilizing both expression and network topology changes between two groups.
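Illustrative sketch (not taken from the paper): the ranking step described in this abstract can be pictured as a personalized PageRank. The sketch assumes the differential network is supplied as a non-negative weighted adjacency matrix and that DE evidence enters only through the teleport vector. The function name `de_weighted_pagerank`, the fixed `damping` value, and the toy data are hypothetical; the cross-validated weighting that DNrank uses to balance DE and DN is not reproduced here.

```python
import numpy as np

def de_weighted_pagerank(adj, de_scores, damping=0.85, tol=1e-10, max_iter=1000):
    """Personalized PageRank: network structure drives the walk,
    DE scores define where the walker restarts (hypothetical helper)."""
    adj = np.asarray(adj, dtype=float)
    de = np.asarray(de_scores, dtype=float)
    n = adj.shape[0]
    teleport = de / de.sum()                      # DE scores as restart distribution
    col_sums = adj.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    # Column-stochastic transition matrix; nodes without edges restart via teleport.
    trans = np.where(col_sums > 0, adj / safe_sums, teleport[:, None])
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * trans @ rank + (1.0 - damping) * teleport
        if np.abs(new_rank - rank).sum() < tol:   # L1 convergence check
            break
        rank = new_rank
    return new_rank

# Toy example: three fully connected features, the first with the strongest DE signal.
toy_adj = np.array([[0, 1, 1],
                    [1, 0, 1],
                    [1, 1, 0]])
toy_de = np.array([3.0, 1.0, 1.0])
ranks = de_weighted_pagerank(toy_adj, toy_de)
print(np.argsort(-ranks))  # feature indices ordered from highest to lowest rank
```

With a symmetric network, as in the toy data, differences in rank come entirely from the DE scores; with a real differential network, topology and DE evidence both shape the ordering.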
Affiliation(s)
- Yonghui Ni
- Department of Biostatistics and Data Science, University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS, 66160, USA
| | - Jianghua He
- Department of Biostatistics and Data Science, University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS, 66160, USA
| | - Prabhakar Chalise
- Department of Biostatistics and Data Science, University of Kansas Medical Center, 3901 Rainbow Blvd, Kansas City, KS, 66160, USA.
| |
|
13
|
Corken A, Ware J, Dai J, Arthur JM, Smyth S, Davis CL, Liu J, Harville TO, Phadnis MA, Mehta JL, Rahmatallah Y, Jain N. Platelet-Dependent Inflammatory Dysregulation in Patients with Stages 4 or 5 Chronic Kidney Disease: A Mechanistic Clinical Study. KIDNEY360 2022; 3:2036-2047. [PMID: 36591354 PMCID: PMC9802560 DOI: 10.34067/kid.0005532022] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/22/2022] [Accepted: 10/04/2022] [Indexed: 11/05/2022]
Abstract
Background Chronic kidney disease (CKD) is characterized by dysregulated inflammation that worsens with CKD severity. The role of platelets in modulating inflammation in stage 4 or 5 CKD remains unexplored. We investigated whether there are changes in platelet-derived thromboinflammatory markers in CKD with dual antiplatelet therapy (DAPT; aspirin 81 mg/d plus P2Y12 inhibitor). Methods In a mechanistic clinical trial, we compared platelet activation markers (aggregation and surface receptor expression), circulating platelet-leukocyte aggregates, leukocyte composition (monocyte subtypes and CD11b surface expression), and plasma cytokine profile (45 analytes) of non-CKD controls (n=26) and CKD outpatients (n=48) with a glomerular filtration rate (GFR) <30 ml/min per 1.73 m² on 2 weeks of DAPT. Results Patients with CKD demonstrated a reduced mean platelet count, elevated mean platelet volume, reduced platelet-leukocyte aggregates, reduced platelet-bound monocytes, higher total non-classic monocytes in the circulation, and higher levels of IL-1RA, VEGF, and fractalkine (all P<0.05). There were no differences in platelet activation markers between CKD and controls. Although DAPT reduced platelet aggregation in both groups, it had multifaceted effects on thromboinflammatory markers in CKD, including a reduction in PDGF levels in all CKD individuals, reductions in IL-1β and TNF-α levels in select CKD individuals, and no change in a number of other cytokines. Significant positive correlations existed for baseline IL-1β, PDGF, and TNF-α levels with older age, and for baseline TNF-α levels with presence of diabetes mellitus and worse albuminuria. Mean change in IL-1β and PDGF levels on DAPT positively correlated with younger age, mean change in TNF-α levels with higher GFR, and mean changes in PDGF and TRAIL levels correlated with worse albuminuria. 
Minimum spanning trees plot of cytokines showed platelet-derived CD40L had a large reduction in weight factor after DAPT in CKD. Additionally, platelet-derived IL-1β and PDGF were tightly correlated with other cytokines, with IL-1β as the hub cytokine. Conclusions Attenuated interactions between platelets and leukocytes in the CKD state coincided with no change in platelet activation status, an altered differentiation state of monocytes, and heightened inflammatory markers. Platelet-derived cytokines were one of the central cytokines in patients with CKD that were tightly correlated with others. DAPT had multifaceted effects on thromboinflammation, suggesting that there is platelet-dependent and -independent inflammation in stage 4 or 5 CKD.
Affiliation(s)
- Adam Corken
- Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Jerry Ware
- Department of Physiology and Cell Biology, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Junqiang Dai
- Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, Kansas
| | - John M. Arthur
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Central Arkansas Veterans Health Care System, Little Rock, Arkansas
| | - Susan Smyth
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Clayton L. Davis
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Juan Liu
- Department of Pathology, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Terry O. Harville
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Department of Pathology, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Milind A. Phadnis
- Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, Kansas
| | - Jawahar L. Mehta
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Central Arkansas Veterans Health Care System, Little Rock, Arkansas
| | - Yasir Rahmatallah
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, Arkansas
| | - Nishank Jain
- Department of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas; Central Arkansas Veterans Health Care System, Little Rock, Arkansas
| |
|
14
|
Saikia M, Bhattacharyya DK, Kalita JK. CBDCEM: An effective centrality based differential co-expression method for critical gene finding. GENE REPORTS 2022. [DOI: 10.1016/j.genrep.2022.101688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/09/2022]
|
15
|
Pu J, Yu H, Guo Y. A Novel Strategy to Identify Prognosis-Relevant Gene Sets in Cancers. Genes (Basel) 2022; 13:862. [PMID: 35627247 PMCID: PMC9141699 DOI: 10.3390/genes13050862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2022] [Revised: 05/06/2022] [Accepted: 05/09/2022] [Indexed: 11/16/2022] Open
Abstract
Molecular prognosis markers hold promise for improved prediction of patient survival, and a pathway or gene set may add mechanistic interpretation to their prognostic prediction power. In this study, we demonstrated a novel strategy to identify prognosis-relevant gene sets in cancers. Our study consists of a first round of gene-level analyses and a second round of gene-set-level analyses, in which the Composite Gene Expression Score critically summarizes a surrogate expression value at gene set level and a permutation procedure is exerted to assess prognostic significance of gene sets. An optional differential coexpression module is appended to the two phases of survival analyses to corroborate and refine prognostic gene sets. Our strategy was demonstrated in 33 cancer types across 32,234 gene sets. We found oncogenic gene sets accounted for an increased proportion among the final gene sets, and genes involved in DNA replication and DNA repair have ubiquitous prognostic value for multiple cancer types. In summary, we carried out the largest gene set based prognosis study to date. Compared to previous similar studies, our approach offered multiple improvements in design and methodology implementation. Functionally relevant gene sets of ubiquitous prognostic significance in multiple cancer types were identified.
Affiliation(s)
- Junyi Pu
- School of Life Sciences, Northwest University, Xi’an 710069, China;
| | - Hui Yu
- Comprehensive Cancer Center, New Mexico University, Albuquerque, NM 87131, USA;
| | - Yan Guo
- Comprehensive Cancer Center, New Mexico University, Albuquerque, NM 87131, USA;
| |
|
16
|
Panditrao G, Bhowmick R, Meena C, Sarkar RR. Emerging landscape of molecular interaction networks: Opportunities, challenges and prospects. J Biosci 2022. [PMID: 36210749 PMCID: PMC9018971 DOI: 10.1007/s12038-022-00253-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Network biology finds application in interpreting molecular interaction networks and providing insightful inferences using graph theoretical analysis of biological systems. The integration of computational bio-modelling approaches with different hybrid network-based techniques provides additional information about the behaviour of complex systems. With increasing advances in high-throughput technologies in biological research, attempts have been made to incorporate this information into network structures, which has led to a continuous update of network biology approaches over time. The newly minted centrality measures accommodate the details of omics data and regulatory network structure information. The unification of graph network properties with classical mathematical and computational modelling approaches and technologically advanced approaches like machine-learning- and artificial intelligence-based algorithms leverages the potential application of these techniques. These computational advances prove beneficial and serve various applications such as essential gene prediction, identification of drug–disease interaction and gene prioritization. Hence, in this review, we have provided a comprehensive overview of the emerging landscape of molecular interaction networks using graph theoretical approaches. With the aim to provide information on the wide range of applications of network biology approaches in understanding the interaction and regulation of genes, proteins, enzymes and metabolites at different molecular levels, we have reviewed the methods that utilize network topological properties, emerging hybrid network-based approaches and applications that integrate machine learning techniques to analyse molecular interaction networks. Further, we have discussed the applications of these approaches in biomedical research with a note on future prospects.
Affiliation(s)
- Gauri Panditrao
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
| | - Rupa Bhowmick
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
| | - Chandrakala Meena
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
| | - Ram Rup Sarkar
- Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Pune, 411008 India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002 India
| |
|
17
|
Matsui Y, Abe Y, Uno K, Miyano S. RoDiCE: robust differential protein co-expression analysis for cancer complexome. Bioinformatics 2022; 38:1269-1276. [PMID: 34529752 DOI: 10.1093/bioinformatics/btab612] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2020] [Revised: 08/09/2021] [Accepted: 08/23/2021] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION The full spectrum of abnormalities in cancer-associated protein complexes remains largely unknown. Comparing the co-expression structure of each protein complex between tumor and healthy cells may provide insights regarding cancer-specific protein dysfunction. However, the technical limitations of mass spectrometry-based proteomics, including contamination with biological protein variants, cause noise that leads to non-negligible over- (or under-) estimation of co-expression. RESULTS We propose a robust algorithm for identifying protein complex aberrations in cancer based on differential protein co-expression testing. Our method based on a copula is sufficient for improving identification accuracy with noisy data compared to conventional linear correlation-based approaches. As an application, we use large-scale proteomic data from renal cancer to show that important protein complexes, regulatory signaling pathways and drug targets can be identified. The proposed approach surpasses traditional linear correlations to provide insights into higher-order differential co-expression structures. AVAILABILITY AND IMPLEMENTATION https://github.com/ymatts/RoDiCE. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yusuke Matsui
- Biomedical and Health Informatics Unit, Department of Integrated Health Science, Nagoya University Graduate School of Medicine, 461-8673 Nagoya, Aichi, Japan; Institute for Glyco-core Research (iGCORE), Nagoya University, 461-8673 Nagoya, Aichi, Japan
| | - Yuichi Abe
- Division of Molecular Diagnostics, Aichi Cancer Center Research Institute, 464-0021 Nagoya, Aichi, Japan
| | - Kohei Uno
- Biomedical and Health Informatics Unit, Department of Integrated Health Science, Nagoya University Graduate School of Medicine, 461-8673 Nagoya, Aichi, Japan
| | - Satoru Miyano
- Department of Integrated Data Science, M&D Data Science Center, Tokyo Medical and Dental University, 113-8510 Tokyo, Japan
| |
|
18
|
Abstract
DNA microarrays are widely used to investigate gene expression. Even though the classical analysis of microarray data is based on the study of differentially expressed genes, it is well known that genes do not act individually. Network analysis can be applied to study association patterns of the genes in a biological system. Moreover, it finds wide application in differential coexpression analysis between different systems. Network-based coexpression studies have, for example, been used in (complex) disease gene prioritization, disease subtyping, and patient stratification. In this chapter we provide an overview of the methods and tools used to create networks from microarray data and describe multiple methods for analyzing a single network or a group of networks. The described methods range from topological metrics and functional group identification to data integration strategies, topological pathway analysis, and graphical models.
Affiliation(s)
- Alisa Pavel
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Luca Cattelani
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Antonio Federico
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- BioMediTech Institute, Tampere University, Tampere, Finland
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland.
- BioMediTech Institute, Tampere University, Tampere, Finland.
- Finnish Hub for Development and Validation of Integrated Approaches (FHAIVE), Tampere University, Tampere, Finland.
- Institute of Biotechnology, University of Helsinki, Helsinki, Finland.
| |
|
19
|
Grimes T, Datta S. A novel probabilistic generator for large-scale gene association networks. PLoS One 2021; 16:e0259193. [PMID: 34767561 PMCID: PMC8589155 DOI: 10.1371/journal.pone.0259193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2021] [Accepted: 10/14/2021] [Indexed: 11/18/2022] Open
Abstract
MOTIVATION Gene expression data provide an opportunity for reverse-engineering gene-gene associations using network inference methods. However, it is difficult to assess the performance of these methods because the true underlying network is unknown in real data. Current benchmarks address this problem by subsampling a known regulatory network to conduct simulations. But the topology of regulatory networks can vary greatly across organisms or tissues, and reference-based generators, such as GeneNetWeaver, are not designed to capture this heterogeneity. This means, for example, benchmark results from the E. coli regulatory network will not carry over to other organisms or tissues. In contrast, probabilistic generators do not require a reference network, and they have the potential to capture a rich distribution of topologies. This makes probabilistic generators an ideal approach for obtaining a robust benchmarking of network inference methods. RESULTS We propose a novel probabilistic network generator that (1) provides an alternative to address the inherent limitation of reference-based generators, (2) is able to create realistic gene association networks, and (3) captures the heterogeneity found across gold-standard networks better than existing generators used in practice. Eight organism-specific and 12 human tissue-specific gold-standard association networks are considered. Several measures of global topology are used to determine the similarity of generated networks to the gold-standards. Along with demonstrating the variability of network structure across organisms and tissues, we show that the commonly used "scale-free" model is insufficient for replicating these structures. AVAILABILITY This generator is implemented in the R package "SeqNet" and is available on CRAN (https://cran.r-project.org/web/packages/SeqNet/index.html).
Affiliation(s)
- Tyler Grimes
- Department of Biostatistics, University of Florida, Gainesville, Florida, United States of America
| | - Somnath Datta
- Department of Biostatistics, University of Florida, Gainesville, Florida, United States of America
| |
|
20
|
Wang L, Xie W, Li K, Wang Z, Li X, Feng W, Li J. DysPIA: A Novel Dysregulated Pathway Identification Analysis Method. Front Genet 2021; 12:647653. [PMID: 34290733 PMCID: PMC8287415 DOI: 10.3389/fgene.2021.647653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2021] [Accepted: 04/20/2021] [Indexed: 11/13/2022] Open
Abstract
Differential co-expression-based pathway analysis is still limited and not widely used. In most current methods, the pathways were considered as gene sets, but the gene regulation relationships were not considered, and the computational speed was slow. In this article, we proposed a novel Dysregulated Pathway Identification Analysis (DysPIA) method to overcome these shortcomings. We adopted the idea of Correlation by Individual Level Product into analysis and performed a fast enrichment analysis. We constructed a combined gene-pair background which was much more sufficient than the background used in Edge Set Enrichment Analysis. In a simulation study, DysPIA was able to identify the causal pathways with high AUC (0.9584 to 0.9896). In p53 mutation data, DysPIA obtained better performance than other methods. It obtained more potential dysregulated pathways that could be literature verified, and it ran much faster (∼1,700-8,000 times faster than other methods with 10,000 permutations). DysPIA was also applied to a breast cancer relapse dataset and a breast cancer subtype dataset. The results show that DysPIA is effective and has a great biological significance. R packages "DysPIA" and "DysPIAData" are constructed and freely available on R CRAN (https://cran.r-project.org/web/packages/DysPIA/index.html and https://cran.r-project.org/web/packages/DysPIAData/index.html), and on GitHub (https://github.com/lemonwang2020).
Affiliation(s)
- Limei Wang
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China; Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Weixin Xie
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
| | - Kongning Li
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China
| | - Zhenzhen Wang
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China
| | - Xia Li
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Weixing Feng
- College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, China
| | - Jin Li
- Key Laboratory of Tropical Translational Medicine, Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| |
|
21
|
Grimes T, Datta S. SeqNet: An R Package for Generating Gene-Gene Networks and Simulating RNA-Seq Data. J Stat Softw 2021; 98:10.18637/jss.v098.i12. [PMID: 34321962 PMCID: PMC8315007 DOI: 10.18637/jss.v098.i12] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Gene expression data provide an abundant resource for inferring connections in gene regulatory networks. While methodologies developed for this task have shown success, a challenge remains in comparing the performance among methods. Gold-standard datasets are scarce and limited in use. And while tools for simulating expression data are available, they are not designed to resemble the data obtained from RNA-seq experiments. SeqNet is an R package that provides tools for generating a rich variety of gene network structures and simulating RNA-seq data from them. This produces in silico RNA-seq data for benchmarking and assessing gene network inference methods. The package is available on CRAN and on GitHub at https://github.com/tgrimes/SeqNet.
Affiliation(s)
- Tyler Grimes
- University of Florida, Department of Biostatistics
| | | |
|
22
|
Arbet J, Zhuang Y, Litkowski E, Saba L, Kechris K. Comparing Statistical Tests for Differential Network Analysis of Gene Modules. Front Genet 2021; 12:630215. [PMID: 34093641 PMCID: PMC8170128 DOI: 10.3389/fgene.2021.630215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2020] [Accepted: 04/19/2021] [Indexed: 11/13/2022] Open
Abstract
Genes often work together to perform complex biological processes, and "networks" provide a versatile framework for representing the interactions between multiple genes. Differential network analysis (DiNA) quantifies how this network structure differs between two or more groups/phenotypes (e.g., disease subjects and healthy controls), with the goal of determining whether differences in network structure can help explain differences between phenotypes. In this paper, we focus on gene co-expression networks, although in principle, the methods studied can be used for DiNA for other types of features (e.g., metabolome, epigenome, microbiome, proteome, etc.). Three common applications of DiNA involve (1) testing whether the connections to a single gene differ between groups, (2) testing whether the connection between a pair of genes differs between groups, or (3) testing whether the connections within a "module" (a subset of 3 or more genes) differ between groups. This article focuses on the latter, as there is a lack of studies comparing statistical methods for identifying differentially co-expressed modules (DCMs). Through extensive simulations, we compare several previously proposed test statistics and a new p-norm difference test (PND). We demonstrate that the true positive rate of the proposed PND test is competitive with and often higher than the other methods, while controlling the false positive rate. The R package discoMod (differentially co-expressed modules) implements the proposed method and provides a full pipeline for identifying DCMs: clustering tools to derive gene modules, tests to identify DCMs, and methods for visualizing the results.
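Illustrative sketch (not taken from the paper or from discoMod): a module-level test of this kind can be pictured as follows, assuming the statistic is the p-norm of the element-wise difference between the two groups' correlation matrices over all gene pairs in the module, with significance assessed by permuting group labels. The function names, the default exponent `p=4`, and the simulated data are hypothetical, and the exact definition used by the authors may differ.

```python
import numpy as np

def pnd_statistic(x1, x2, p=4):
    """p-norm of the correlation differences over gene pairs (samples in rows)."""
    c1 = np.corrcoef(x1, rowvar=False)
    c2 = np.corrcoef(x2, rowvar=False)
    iu = np.triu_indices_from(c1, k=1)          # each gene pair counted once
    return float((np.abs(c1[iu] - c2[iu]) ** p).sum() ** (1.0 / p))

def pnd_permutation_test(x1, x2, p=4, n_perm=500, seed=0):
    """Permutation p-value: shuffle group labels and recompute the statistic."""
    rng = np.random.default_rng(seed)
    observed = pnd_statistic(x1, x2, p)
    pooled = np.vstack([x1, x2])
    n1 = x1.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = pnd_statistic(pooled[idx[:n1]], pooled[idx[n1:]], p)
        exceed += int(stat >= observed)
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated module of 4 genes: co-expressed in group 1, independent in group 2.
rng = np.random.default_rng(1)
shared = rng.normal(size=(40, 1))
group1 = shared + 0.3 * rng.normal(size=(40, 4))
group2 = rng.normal(size=(40, 4))
stat, pval = pnd_permutation_test(group1, group2)
print(stat, pval)
```

Larger exponents put more weight on the few gene pairs whose correlation changes most, while `p=1` treats all pairwise changes equally.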
Affiliation(s)
- Jaron Arbet
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | - Yaxu Zhuang
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | - Elizabeth Litkowski
- Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| | - Laura Saba
- Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora CO, United States
| | - Katerina Kechris
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
| |
|
23
|
Oh M, Kim K, Sun H. Covariance thresholding to detect differentially co-expressed genes from microarray gene expression data. J Bioinform Comput Biol 2021; 18:2050002. [PMID: 32336254 DOI: 10.1142/s021972002050002x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Gene set analysis aims to identify differentially expressed or co-expressed genes within a biological pathway between two experimental conditions, so that it can eventually reveal biological processes and pathways involved in disease development. In the last few decades, various statistical and computational methods have been proposed to improve statistical power of gene set analysis. In recent years, much attention has been paid to differentially co-expressed genes since they can be potentially disease-related genes without significant difference in average expression levels between two conditions. In this paper, we propose a new statistical method to identify differentially co-expressed genes from microarray gene expression data. The proposed method first estimates co-expression levels of paired genes using covariance regularization by thresholding, and then significance of difference in covariance estimation between two conditions is evaluated. We demonstrated that the proposed method is more powerful than the existing mainstream methods to detect co-expressed genes through extensive simulation studies. Also, we applied it to various microarray gene expression datasets related to mutant p53 transcriptional activity, and epithelium and stroma breast cancer.
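Illustrative sketch (not taken from the paper): the first step described in this abstract, covariance regularization by thresholding, can be pictured with simple hard thresholding of the sample covariance matrix. The function names, the threshold `lam`, and the toy matrix are hypothetical; how the authors choose the threshold and assess significance is not specified in the abstract and is not reproduced here.

```python
import numpy as np

def thresholded_cov(x, lam):
    """Hard-threshold the sample covariance: off-diagonal entries with
    absolute value below lam are set to zero, variances are kept."""
    s = np.cov(x, rowvar=False)                  # samples in rows, genes in columns
    out = np.where(np.abs(s) >= lam, s, 0.0)
    np.fill_diagonal(out, np.diag(s))
    return out

def coexpression_difference(x1, x2, lam):
    """Sum of absolute differences between thresholded covariances over gene pairs."""
    t1 = thresholded_cov(x1, lam)
    t2 = thresholded_cov(x2, lam)
    iu = np.triu_indices_from(t1, k=1)
    return float(np.abs(t1[iu] - t2[iu]).sum())

# Toy expression matrix: 4 samples, 3 genes.
toy = np.array([[1.0, 2.0, 0.0],
                [2.0, 4.0, 1.0],
                [3.0, 6.0, 0.0],
                [4.0, 8.0, 1.0]])
print(thresholded_cov(toy, lam=0.5))
```

Thresholding removes weak, likely noisy covariances before the two conditions are compared, so that the comparison focuses on gene pairs with appreciable co-expression in at least one condition.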
Affiliation(s)
- Mingyu Oh
- Department of Statistics, Pusan National University, Busan, 46241, Korea
| | - Kipoong Kim
- Department of Statistics, Pusan National University, Busan, 46241, Korea
| | - Hokeun Sun
- Department of Statistics, Pusan National University, Busan, 46241, Korea
| |
|
24
|
Guo Y, Yu H, Song H, He J, Oyebamiji O, Kang H, Ping J, Ness S, Shyr Y, Ye F. MetaGSCA: A tool for meta-analysis of gene set differential coexpression. PLoS Comput Biol 2021; 17:e1008976. [PMID: 33945541 PMCID: PMC8121311 DOI: 10.1371/journal.pcbi.1008976] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Revised: 05/14/2021] [Accepted: 04/18/2021] [Indexed: 01/24/2023] Open
Abstract
Analyses of gene set differential coexpression may shed light on molecular mechanisms underlying phenotypes and diseases. However, differential coexpression analyses of conceptually similar individual studies are often inconsistent and underpowered to provide definitive results. Researchers can greatly benefit from an open-source application facilitating the aggregation of evidence of differential coexpression across studies and the estimation of more robust common effects. We developed Meta Gene Set Coexpression Analysis (MetaGSCA), an analytical tool to systematically assess differential coexpression of an a priori defined gene set by aggregating evidence across studies to provide a definitive result. In the kernel, a nonparametric approach that accounts for the gene-gene correlation structure is used to test whether the gene set is differentially coexpressed between two comparative conditions, from which a permutation test p-statistic is computed for each individual study. A meta-analysis is then performed to combine individual study results with one of two options: a random-intercept logistic regression model or the inverse variance method. We demonstrated MetaGSCA in case studies investigating two human diseases and identified pathways highly relevant to each disease across studies. We further applied MetaGSCA in a pan-cancer analysis with hundreds of major cellular pathways in 11 cancer types. The results indicated that a majority of the pathways identified were dysregulated in the pan-cancer scenario, many of which have been previously reported in the cancer literature. Our analysis with randomly generated gene sets showed excellent specificity, indicating that the significant pathways/gene sets identified by MetaGSCA are unlikely false positives. MetaGSCA is a user-friendly tool implemented in both forms of a Web-based application and an R package “MetaGSCA”. 
It enables comprehensive meta-analyses of gene set differential coexpression data, with an optional module of post hoc pathway crosstalk network analysis to identify and visualize pathways having similar coexpression profiles. Analyses of gene set differential coexpression often shed light on molecular mechanisms underlying phenotypes and diseases. However, results from conceptually similar individual studies are often inconsistent and underpowered to reach definitive conclusions. We provide an open-source application facilitating the aggregation of evidence of differential coexpression across studies and the estimation of more robust common effects, with an optional module of post hoc pathway crosstalk network analysis to identify and visualize pathways having similar coexpression profiles. We established the usefulness of MetaGSCA via case studies of chronic kidney disease and non-small cell lung cancer, and applied it to a pan-cancer analysis of 11 cancer types. We further demonstrated the tool with 100 randomly generated gene sets and showed excellent specificity, indicating low false positive rates.
Affiliation(s)
- Yan Guo
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Hui Yu
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Haocan Song
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Jiapeng He
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Olufunmilola Oyebamiji
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Huining Kang
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Jie Ping
- Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Scott Ness
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, New Mexico, United States of America
| | - Yu Shyr
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Vanderbilt Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Fei Ye
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Vanderbilt Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| |
|
25
|
Yu H, Guo Y, Chen J, Chen X, Jia P, Zhao Z. Rewired Pathways and Disrupted Pathway Crosstalk in Schizophrenia Transcriptomes by Multiple Differential Coexpression Methods. Genes (Basel) 2021; 12:665. [PMID: 33946654 PMCID: PMC8146818 DOI: 10.3390/genes12050665] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2021] [Revised: 04/16/2021] [Accepted: 04/27/2021] [Indexed: 02/03/2023] Open
Abstract
Transcriptomic studies of mental disorders using human brain tissues have been limited, and gene expression signatures in schizophrenia (SCZ) remain elusive. In this study, we applied three differential co-expression methods to analyze five transcriptomic datasets (three RNA-Seq and two microarray datasets) derived from SCZ and matched normal postmortem brain samples. We aimed to uncover biological pathways where internal correlation structure was rewired or inter-coordination was disrupted in SCZ. In total, we identified 60 rewired pathways, many of which were related to neurotransmitters, synapses, immunity, and cell adhesion. We found that the hub genes, which were at the center of rewired pathways, were highly mutually consistent among the five datasets. The combinatory list of 92 hub genes was generally multi-functional, suggesting their complex and dynamic roles in SCZ pathophysiology. In our constructed pathway crosstalk network, we found that "Clostridium neurotoxicity" and "signaling events mediated by focal adhesion kinase" had the highest interactions. We further identified disconnected gene links underlying the disrupted pathway crosstalk. Among them, four gene pairs (PAK1:SYT1, PAK1:RFC5, DCTN1:STX1A, and GRIA1:MAP2K4) were normally correlated in universal contexts. In summary, we systematically identified rewired pathways, disrupted pathway crosstalk circuits, and critical genes and gene links in schizophrenia transcriptomes.
Affiliation(s)
- Hui Yu
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131, USA; (H.Y.); (Y.G.)
| | - Yan Guo
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131, USA; (H.Y.); (Y.G.)
| | - Jingchun Chen
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, Las Vegas, NV 89154, USA; (J.C.); (X.C.)
| | - Xiangning Chen
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, Las Vegas, NV 89154, USA; (J.C.); (X.C.)
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA;
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA;
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| |
|
26
|
Ebrahimpoor M, Spitali P, Goeman JJ, Tsonaka R. Pathway testing for longitudinal metabolomics. Stat Med 2021; 40:3053-3065. [PMID: 33768548 PMCID: PMC8252476 DOI: 10.1002/sim.8957] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2019] [Revised: 02/19/2021] [Accepted: 03/04/2021] [Indexed: 01/12/2023]
Abstract
We propose a top‐down approach for pathway analysis of longitudinal metabolite data. We apply a score test based on a shared latent process mixed model which can identify pathways with differentially progressing metabolites. The strength of our approach is that it can handle unbalanced designs, deals with potential missing values in the longitudinal markers, and gives valid results even with small sample sizes. Contrary to bottom‐up approaches, correlations between metabolites are explicitly modeled, yielding power gains. For large pathway sizes, a computationally efficient solution is proposed based on pseudo‐likelihood methodology. We demonstrate the advantages of the proposed method in identification of differentially expressed pathways through simulation studies. Finally, longitudinal metabolite data from a mouse experiment are analyzed to demonstrate our methodology.
Affiliation(s)
- Mitra Ebrahimpoor
- Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Pietro Spitali
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Jelle J Goeman
- Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| | - Roula Tsonaka
- Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
| |
|
27
|
Abstract
Background:
Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets with prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on the identification of differentially expressed gene sets in a given phenotype.
Objective:
In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways.
Methods:
We apply co-inertia analysis to the comparisons of cross-gene sets in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. Co-inertia analysis (CIA) is a multivariate method to identify trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods.
Result and Conclusion:
We also combine between-gene set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
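The CIA objective stated in the Methods (maximizing the squared covariance between projections of two gene sets measured on the same samples) can be illustrated with a singular value decomposition of the cross-covariance matrix. The sketch below assumes uniform sample weights and plain column centering, which is a simplification of general co-inertia analysis; the function name `co_inertia` is hypothetical and this is not the GSCoA code.

```python
import numpy as np

def co_inertia(x, y):
    """Co-inertia axes for two gene sets measured on the same samples.

    x: (samples, genes_in_set_1), y: (samples, genes_in_set_2).
    Assumption: uniform sample weights and column centering only.
    Returns (u, s, v, rv): axes for x, singular values, axes for y,
    and the RV coefficient summarising overall co-structure.
    """
    x_c = x - x.mean(axis=0)
    y_c = y - y.mean(axis=0)
    n = x.shape[0]
    cross_cov = x_c.T @ y_c / n
    # Singular vectors give the successive pairs of axes; the k-th
    # singular value is the covariance between the k-th projections.
    u, s, vt = np.linalg.svd(cross_cov, full_matrices=False)
    cov_x = x_c.T @ x_c / n
    cov_y = y_c.T @ y_c / n
    rv = np.sum(s ** 2) / np.sqrt(np.sum(cov_x ** 2) * np.sum(cov_y ** 2))
    return u, s, vt.T, float(rv)
```

The squared singular values sum to the total co-inertia, so the leading pair of axes captures the largest share of shared structure between the two gene sets.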
Affiliation(s)
- Chen-An Tsai
- Department of Agronomy, National Taiwan University, Taipei,Taiwan
| | - James J. Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR 72079,United States
| |
|
28
|
Inference of Networks from Large Datasets. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11345-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
|
29
|
Zhang D, Guo Y, Xie N. Prognostic value and co-expression patterns of metabolic pathways in cancers. BMC Genomics 2020; 21:860. [PMID: 33372594 PMCID: PMC7771089 DOI: 10.1186/s12864-020-07251-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2020] [Accepted: 11/18/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Abnormal metabolic pathways have been considered as one of the hallmarks of cancer. While numerous metabolic pathways have been studied in various cancers, the direct link between metabolic pathway gene expression and cancer prognosis has not been established. RESULTS Using two recently developed bioinformatics analysis methods, we evaluated the prognosis potential of metabolic pathway expression and tumor-vs-normal dysregulations for up to 29 metabolic pathways in 33 cancer types. Results show that increased metabolic gene expression within tumors corresponds to poor cancer prognosis. Meta differential co-expression analysis identified four metabolic pathways with significant global co-expression network disturbance between tumor and normal samples. Differential expression analysis of metabolic pathways also demonstrated strong gene expression disturbance between paired tumor and normal samples. CONCLUSION Taken together, these results strongly suggested that metabolic pathway gene expressions are disturbed after tumorigenesis. Within tumors, many metabolic pathways are upregulated for tumor cells to activate corresponding metabolisms to sustain the required energy for cell division.
Affiliation(s)
- Dan Zhang
- Biobank, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen, 518035, China
| | - Yan Guo
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, 87131, USA
| | - Ni Xie
- Biobank, Shenzhen Second People's Hospital, The First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen, 518035, China.
| |
|
30
|
TCox: Correlation-Based Regularization Applied to Colorectal Cancer Survival Data. Biomedicines 2020; 8:biomedicines8110488. [PMID: 33182598 PMCID: PMC7696515 DOI: 10.3390/biomedicines8110488] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2020] [Revised: 10/26/2020] [Accepted: 11/06/2020] [Indexed: 01/29/2023] Open
Abstract
Colorectal cancer (CRC) is one of the leading causes of mortality and morbidity in the world. Because CRC is a heterogeneous disease, its therapy and prognosis represent a significant challenge to medical care. Molecular information improves the accuracy with which patients are classified and treated since similar pathologies may show different clinical outcomes and other responses to treatment. However, the high dimensionality of gene expression data makes the selection of novel genes a problematic task. We propose TCox, a novel penalization function for Cox models, which promotes the selection of genes that have distinct correlation patterns in normal vs. tumor tissues. We compare TCox to other regularized survival models, Elastic Net, HubCox, and OrphanCox. Gene expression and clinical data of CRC and normal (TCGA) patients are used for model evaluation. Each model is tested 100 times. Within a specific run, eighteen of the features selected by TCox are also selected by the other survival regression models tested, therefore undoubtedly being crucial players in the survival of colorectal cancer patients. Moreover, the TCox model exclusively selects genes able to categorize patients into significant risk groups. Our work demonstrates the ability of the proposed weighted regularizer TCox to disclose novel molecular drivers in CRC survival by accounting for correlation-based network information from both tumor and normal tissue. The results presented support the relevance of network information for biomarker identification in high-dimensional gene expression data and foster new directions for the development of network-based feature selection methods in precision oncology.
|
31
|
Bloch NI, Corral‐López A, Buechel SD, Kotrschal A, Kolm N, Mank JE. Different mating contexts lead to extensive rewiring of female brain coexpression networks in the guppy. GENES BRAIN AND BEHAVIOR 2020; 20:e12697. [DOI: 10.1111/gbb.12697] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/06/2020] [Revised: 08/10/2020] [Accepted: 08/29/2020] [Indexed: 12/19/2022]
Affiliation(s)
- Natasha I. Bloch
- Department of Biomedical Engineering Universidad de Los Andes Bogotá D.C. Colombia
| | - Alberto Corral‐López
- Department of Zoology/Ethology Stockholm University Stockholm Sweden
- Department of Genetics, Evolution and Environment University College London UK
| | | | - Alexander Kotrschal
- Department of Zoology/Ethology Stockholm University Stockholm Sweden
- Wageningen University Behavioral Ecology Group Wageningen Netherlands
| | - Niclas Kolm
- Department of Zoology/Ethology Stockholm University Stockholm Sweden
| | - Judith E. Mank
- University of British Columbia Department of Zoology and Biodiversity Research Centre Vancouver Canada
- Department of Genetics, Evolution and Environment University College London UK
| |
|
32
|
Yu H, Chen D, Oyebamiji O, Zhao YY, Guo Y. Expression correlation attenuates within and between key signaling pathways in chronic kidney disease. BMC Med Genomics 2020; 13:134. [PMID: 32957963 PMCID: PMC7504859 DOI: 10.1186/s12920-020-00772-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Compared to the conventional differential expression approach, differential coexpression analysis represents a different yet complementary perspective on diseased transcriptomes. In particular, global loss of transcriptome correlation was previously observed in aging mice, and a recent study found that genetic and environmental perturbations on human subjects tended to cause universal attenuation of transcriptome coherence. While methodological progress surrounding differential coexpression has helped with research on several human diseases, there has not yet been an investigation of coexpression disruptions in chronic kidney disease (CKD). METHODS RNA-seq was performed on total RNAs of kidney tissue samples from 140 CKD patients. A combination of differential coexpression methods was employed to analyze the transcriptome transition in CKD from the early, mild phase to the late, severe kidney damage phase. RESULTS We discovered a global expression correlation attenuation in CKD progression, with the pathway Regulation of nuclear SMAD2/3 signaling demonstrating the most remarkable intra-pathway correlation rewiring. Moreover, the pathway Signaling events mediated by focal adhesion kinase displayed significantly weakened crosstalk with seven pathways, including Regulation of nuclear SMAD2/3 signaling. Well-known relevant genes, such as ACTN4, were characterized by widespread correlation disassociation with partners from a wide array of signaling pathways. CONCLUSIONS Altogether, our analysis reported a global expression correlation attenuation within and between key signaling pathways in chronic kidney disease, and presented a list of vanishing hub genes and disrupted correlations within and between key signaling pathways, illuminating the pathophysiological mechanisms of CKD progression.
Affiliation(s)
- Hui Yu
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131 USA
| | - Danqian Chen
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi’an, 710069 Shaanxi China
| | | | - Ying-Yong Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi’an, 710069 Shaanxi China
| | - Yan Guo
- Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131 USA
| |
|
33
|
Zhang L, Peng TL, Wang L, Meng XH, Zhu W, Zeng Y, Zhu JQ, Zhou Y, Xiao HM, Deng HW. Network-based Transcriptome-wide Expression Study for Postmenopausal Osteoporosis. J Clin Endocrinol Metab 2020; 105:5850085. [PMID: 32483604 PMCID: PMC7320836 DOI: 10.1210/clinem/dgaa319] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/18/2020] [Accepted: 05/27/2020] [Indexed: 01/08/2023]
Abstract
PURPOSE Menopause is a crucial physiological transition during a woman's life, and it occurs with growing risks of health issues like osteoporosis. To identify postmenopausal osteoporosis-related genes, we performed transcriptome-wide expression analyses for human peripheral blood monocytes (PBMs) using Affymetrix 1.0 ST arrays in 40 Caucasian postmenopausal women with discordant bone mineral density (BMD) levels. METHODS We performed multiscale embedded gene coexpression network analysis (MEGENA) to study functionally orchestrating clusters of differentially expressed genes in the form of functional networks. Gene sets net correlations analysis (GSNCA) was applied to assess how the coexpression structure of a predefined gene set differs in high and low BMD groups. Bayesian network (BN) analysis was used to identify important regulation patterns between potential risk genes for osteoporosis. A small interfering ribonucleic acid (siRNA)-based gene silencing in vitro experiment was performed to validate the findings from BN analysis. RESULT MEGENA showed that the "T cell receptor signaling pathway" and the "osteoclast differentiation pathway" were significantly enriched in the identified compact network, which is significantly correlated with BMD variation. GSNCA revealed that the coexpression structure of the "Signaling by TGF-beta receptor complex pathway" is significantly different between the 2 BMD discordant groups; the hub genes in the postmenopausal low and high BMD group are FURIN and SMAD3 respectively. With siRNA in vitro experiments, we confirmed the regulation relationship of TGFBR2-SMAD7 and TGFBR1-SMURF2. MAIN CONCLUSION The present study suggests that biological signals involved in monocyte recruitment, monocyte/macrophage lineage development, osteoclast formation, and osteoclast differentiation might function together in PBMs that contribute to the pathogenesis of postmenopausal osteoporosis.
Affiliation(s)
- Lan Zhang
- Center for Biomedical informatics and Genomics, Department of Medicine, Tulane University, New Orleans, Louisiana
| | - Tian-Liu Peng
- Institute of Reproduction and Stem Cell Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
| | - Le Wang
- Institute of Reproduction and Stem Cell Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
| | - Xiang-He Meng
- Laboratory of Molecular and Statistical Genetics, College of Life Sciences, Hunan Normal University, Changsha, Hunan, China
| | - Wei Zhu
- Center for Biomedical informatics and Genomics, Department of Medicine, Tulane University, New Orleans, Louisiana
| | - Yong Zeng
- Center for Biomedical informatics and Genomics, Department of Medicine, Tulane University, New Orleans, Louisiana
| | - Jia-Qiang Zhu
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
| | - Yu Zhou
- Center for Biomedical informatics and Genomics, Department of Medicine, Tulane University, New Orleans, Louisiana
| | - Hong-Mei Xiao
- Institute of Reproduction and Stem Cell Engineering, School of Basic Medical Science, Central South University, Changsha, Hunan, China
| | - Hong-Wen Deng
- Center for Biomedical informatics and Genomics, Department of Medicine, Tulane University, New Orleans, Louisiana
- Correspondence and Reprint Requests: Hong-Wen Deng, Center for Biomedical Informatics and Genomics, Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA. E-mail:
| |
|
34
|
Chowdhury HA, Bhattacharyya DK, Kalita JK. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1154-1173. [PMID: 30668502 DOI: 10.1109/tcbb.2019.2893170] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Analysis of gene expression data is widely used in transcriptomic studies to understand functions of molecules inside a cell and interactions among molecules. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review the best practices in gene expression data analysis in terms of analysis of (differential) co-expression, co-expression network, differential networking, and differential connectivity considering both microarray and RNA-seq data along with comparisons. We highlight hurdles in RNA-seq data analysis using methods developed for microarrays. We include discussion of necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by including preprocessing and scRNA-seq in co-expression analysis along with useful tools specific to scRNA-seq. To get insights, biological interpretation and functional profiling is included. Finally, we provide guidelines for the analyst, along with research issues and challenges which should be addressed.
|
35
|
Fifteen Years of Gene Set Analysis for High-Throughput Genomic Data: A Review of Statistical Approaches and Future Challenges. ENTROPY 2020; 22:e22040427. [PMID: 33286201 PMCID: PMC7516904 DOI: 10.3390/e22040427] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/24/2020] [Revised: 03/18/2020] [Accepted: 04/03/2020] [Indexed: 12/22/2022]
Abstract
Over the last decade, gene set analysis has become the first choice for gaining insights into underlying complex biology of diseases through gene expression and gene association studies. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Although gene set analysis approaches are extensively used in gene expression and genome wide association data analysis, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. In this article, we provide a comprehensive overview, statistical structure and steps of gene set analysis approaches used for microarrays, RNA-sequencing and genome wide association data analysis. Further, we also classify the gene set analysis approaches and tools by the type of genomic study, null hypothesis, sampling model and nature of the test statistic, etc. Rather than reviewing the gene set analysis approaches individually, we provide the generation-wise evolution of such approaches for microarrays, RNA-sequencing and genome wide association studies and discuss their relative merits and limitations. Here, we identify the key biological and statistical challenges in current gene set analysis, which will be addressed by statisticians and biologists collectively in order to develop the next generation of gene set analysis approaches. Further, this study will serve as a catalog and provide guidelines to genome researchers and experimental biologists for choosing the proper gene set analysis approach based on several factors.
|
36
|
Zhao Y, Piekos S, Hoang TH, Shin DG. A framework using topological pathways for deeper analysis of transcriptome data. BMC Genomics 2020; 21:834. [PMID: 32138666 PMCID: PMC7057456 DOI: 10.1186/s12864-019-6155-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2019] [Accepted: 09/30/2019] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Pathway analysis is one of the later-stage data analysis steps essential in interpreting high-throughput gene expression data. We propose a set of algorithms which, given gene expression data, can recognize which portions of sub-pathways are actively utilized in the biological system being studied. The degree of activation is measured by the conditional probability of the input expression data based on the Bayesian Network model constructed from the topological pathway. RESULTS We demonstrate the effectiveness of our pathway analysis method by conducting two case studies. The first one applies our method to a well-studied temporal microarray data set for the cell cycle using the KEGG Cell Cycle pathway. Our method closely reproduces the biological claims associated with the data sets, but unlike the original work, ours can show how pathway routes interact with each other, above and beyond merely identifying which pathway routes are involved in the process. The second study applies the method to the p53 mutation microarray data to perform a comparative study. CONCLUSIONS We show that our method achieves comparable performance against all other pathway analysis systems included in this study in identifying p53 altered pathways. Our method could pave a new way of carrying out next generation pathway analysis.
Affiliation(s)
- Yue Zhao
- Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, 06269 USA
| | - Stephanie Piekos
- Department of Pharmaceutical Sciences, University of Connecticut, 69 North Eagleville Road, Unit 3092, Storrs, USA
| | - Tham H. Hoang
- Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, 06269 USA
| | - Dong-Guk Shin
- Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, 06269 USA
| |
|
37
|
X-Module: A novel fusion measure to associate co-expressed gene modules from condition-specific expression profiles. J Biosci 2020. [DOI: 10.1007/s12038-020-0007-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
38
|
Lan X, Lin W, Xu Y, Xu Y, Lv Z, Chen W. The detection and analysis of differential regulatory communities in lung cancer. Genomics 2020; 112:2535-2540. [PMID: 32045668 DOI: 10.1016/j.ygeno.2020.02.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2019] [Revised: 01/06/2020] [Accepted: 02/07/2020] [Indexed: 02/07/2023]
Abstract
The tumorigenesis process of lung cancer involves the regulatory dysfunctions of multiple pathways. Although many signaling pathways have been identified to be associated with lung cancer, there are few quantitative models of how interactions between genes change during the process from normal to cancer. These changes belong to different dynamic co-expression patterns. We quantitatively analyzed differential co-expression of gene pairs in four datasets. Each dataset included a large number of lung cancer and normal samples. By overlapping their results, we obtained 14 highly confident gene pairs with consistent co-expression change patterns. Some of them, such as ARHGAP30 and GIMAP4, had been recorded in the STRING network database, while some of them were novel discoveries, such as C9orf135 and MORN5; TEKT1 and TSPAN1 were positively correlated in both normal and cancer samples, but more strongly in normal than in cancer. These gene pairs revealed the underlying mechanisms of lung cancer occurrence.
Affiliation(s)
- Xiu Lan
- Department of Respiratory Medicine, Lishui Central Hospital, Lishui, China
| | - Weilong Lin
- Department of Orthopedics, Lishui Traditional Chinese Medicine Hospital, Lishui, China
| | - Yufen Xu
- Department of Oncology, The First Hospital of Jiaxing, The First Affiliated Hospital of Jiaxing University, Jiaxing, China; Department of Respiratory Medicine, Lishui Central Hospital, Lishui, China
| | - Yanyan Xu
- Department of Pharmacy, Lishui Central Hospital, Lishui, China
| | - Zhuqing Lv
- Department of Respiratory Medicine, Lishui Central Hospital, Lishui, China
| | - Wenyu Chen
- Department of Respiration, The First Hospital of Jiaxing, The First Affiliated Hospital of Jiaxing University, Jiaxing, China.
| |
|
39
|
Kakati T, Bhattacharyya DK, Kalita JK. X-Module: A novel fusion measure to associate co-expressed gene modules from condition-specific expression profiles. J Biosci 2020; 45:33. [PMID: 32098912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
A gene co-expression network (CEN) is of biological interest, since co-expressed genes share common functions and biological processes or pathways. Finding relationships among modules can reveal inter-modular preservation, and similarity in transcriptome, functional, and biological behaviors among modules of the same or two different datasets. There is no method which explores the one-to-one relationships and one-to-many relationships among modules extracted from control and disease samples based on both topological and semantic similarity using both microarray and RNA-seq data. In this work, we propose a novel fusion measure to detect mapping between modules from two sets of co-expressed modules extracted from control and disease stages of Alzheimer's disease (AD) and Parkinson's disease (PD) datasets. Our measure considers both topological and biological information of a module and is an estimation of four parameters, namely, semantic similarity, eigengene correlation, degree difference, and the number of common genes. We analyze the consensus modules shared between both control and disease stages in terms of their association with diseases. We also validate the close associations between human and chimpanzee modules and compare with the state-of-the-art method. Additionally, we propose two novel observations on the relationships between modules for further analysis.
Affiliation(s)
- Tulika Kakati
- Department of Computer Science and Engineering, Tezpur University, Tezpur, Assam, India
|
40
|
Bhuva DD, Cursons J, Smyth GK, Davis MJ. Differential co-expression-based detection of conditional relationships in transcriptional data: comparative analysis and application to breast cancer. Genome Biol 2019; 20:236. [PMID: 31727119 PMCID: PMC6857226 DOI: 10.1186/s13059-019-1851-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 06/04/2019] [Accepted: 10/02/2019] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Elucidation of regulatory networks, including identification of regulatory mechanisms specific to a given biological context, is a key aim in systems biology. This has motivated the move from co-expression to differential co-expression analysis and numerous methods have been developed subsequently to address this task; however, evaluation of methods and interpretation of the resulting networks has been hindered by the lack of known context-specific regulatory interactions. RESULTS In this study, we develop a simulator based on dynamical systems modelling capable of simulating differential co-expression patterns. With the simulator and an evaluation framework, we benchmark and characterise the performance of inference methods. Defining three different levels of "true" networks for each simulation, we show that accurate inference of causation is difficult for all methods, compared to inference of associations. We show that a z-score-based method has the best general performance. Further, analysis of simulation parameters reveals five network and simulation properties that explained the performance of methods. The evaluation framework and inference methods used in this study are available in the dcanr R/Bioconductor package. CONCLUSIONS Our analysis of networks inferred from simulated data shows that hub nodes are more likely to be differentially regulated targets than transcription factors. Based on this observation, we propose an interpretation of the inferred differential network that can reconstruct a putative causal network.
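The abstract above singles out a z-score-based method as the best general performer. As a minimal sketch of that general idea (not the dcanr implementation), the snippet below compares the correlation of one gene pair across two conditions using the Fisher transformation. The function names `pearson`, `diff_coexpr_z`, and `two_sided_p`, and the choice of Pearson correlation, are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def diff_coexpr_z(x1, y1, x2, y2):
    """z-score for the change in correlation of a gene pair between
    condition 1 (x1, y1) and condition 2 (x2, y2).
    Hypothetical helper; each condition needs more than 3 samples."""
    eps = 1e-12  # keep atanh finite when |r| is exactly 1
    r1 = max(-1 + eps, min(1 - eps, pearson(x1, y1)))
    r2 = max(-1 + eps, min(1 - eps, pearson(x2, y2)))
    # Fisher transformation: atanh(r) is approximately normal
    # with variance 1 / (n - 3).
    se = math.sqrt(1 / (len(x1) - 3) + 1 / (len(x2) - 3))
    return (math.atanh(r1) - math.atanh(r2)) / se


def two_sided_p(z):
    """Two-sided p-value of z under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))
```

A large positive score indicates that the pair is more strongly positively correlated in condition 1 than in condition 2. In practice the score would be computed for every gene pair and corrected for multiple testing.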
Affiliation(s)
- Dharmesh D Bhuva
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, VIC, 3010, Australia
- Joseph Cursons
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
- Gordon K Smyth
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, VIC, 3010, Australia
- Melissa J Davis
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia; Department of Clinical Pathology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
|
41
|
Zhang J, Zhu W, Wang Q, Gu J, Huang LF, Sun X. Differential regulatory network-based quantification and prioritization of key genes underlying cancer drug resistance based on time-course RNA-seq data. PLoS Comput Biol 2019; 15:e1007435. [PMID: 31682596 PMCID: PMC6827891 DOI: 10.1371/journal.pcbi.1007435] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/13/2019] [Accepted: 09/24/2019] [Indexed: 12/22/2022] Open
Abstract
Drug resistance is a major cause for the failure of cancer chemotherapy or targeted therapy. However, the molecular regulatory mechanisms controlling the dynamic evolvement of drug resistance remain poorly understood. Thus, it is important to develop methods for identifying key gene regulatory mechanisms of the resistance to specific drugs. In this study, we developed a data-driven computational framework, DryNetMC, using a differential regulatory network-based modeling and characterization strategy to quantify and prioritize key genes underlying cancer drug resistance. The DryNetMC does not only infer gene regulatory networks (GRNs) via an integrated approach, but also characterizes and quantifies dynamical network properties for measuring node importance. We used time-course RNA-seq data from glioma cells treated with dbcAMP (a cAMP activator) as a realistic case to reconstruct the GRNs for sensitive and resistant cells. Based on a novel node importance index that comprehensively quantifies network topology, network entropy and expression dynamics, the top ranked genes were verified to be predictive of the drug sensitivities of different glioma cell lines, in comparison with other existing methods. The proposed method provides a quantitative approach to gain insights into the dynamic adaptation and regulatory mechanisms of cancer drug resistance and sheds light on the design of novel biomarkers or targets for predicting or overcoming drug resistance.
Affiliation(s)
- Jiajun Zhang
- School of Mathematics, Sun Yat-Sen University, Guangzhou, China
- Wenbo Zhu
- Department of Pharmacology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Qianliang Wang
- School of Mathematics, Sun Yat-Sen University, Guangzhou, China
- Jiayu Gu
- Department of Pharmacology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- L. Frank Huang
- Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States of America
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States of America
- Xiaoqiang Sun
- Department of Medical Informatics, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China; Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Chinese Ministry of Education, Guangzhou, Guangdong, China
|
42
|
Majumder A, Sarkar M, Sharma P. A Composite Mode Differential Gene Regulatory Architecture based on Temporal Expression Profiles. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1785-1793. [PMID: 29993888 DOI: 10.1109/tcbb.2018.2828418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Exploring the complex interactive mechanism in a Gene Regulatory Network (GRN) developed using transcriptome data obtained from standard microarray and/or RNA-seq experiments helps us to understand the triggering factors in cancer research. The Transcription Factor (TF) genes generate protein complexes which affect the transcription of various target genes. However, considering the mode of regulation in a time frame such transcriptional activities are dependent on some specific activation time points only. It is also crucial to check whether the regulating capabilities are uniform across varied stages, especially when periodicity is a big issue. In this context, we propose an algorithm called RIFT which helps to monitor the temporal differential regulatory pattern of a Differentially Expressed (DE) target gene either by a TF gene or a group of TF genes from a large time series (TS) data. We have tested our algorithm on HeLa cell cycle data and compared the result with its most advanced state of the art counterpart proposed so far. As our algorithm yields up stringent mode and target specific significant valid TF genes for a DE gene, we can expect to have new forms of genetic interactions.
|
43
|
Glazko G, Zybailov B, Emmert-Streib F, Baranova A, Rahmatallah Y. Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study. PLoS One 2019; 14:e0221444. [PMID: 31437237 PMCID: PMC6705791 DOI: 10.1371/journal.pone.0221444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/01/2019] [Accepted: 08/06/2019] [Indexed: 01/10/2023] Open
Abstract
Gene set analysis (GSA) has become the common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained proteome level GSA of four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable difference in structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of immune system as well as the pathways related to mismatch repair, DNA replication and functioning of proteasome, while CMS4 tumors upregulate complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein level GSA yielded a set of novel observations visible at the proteome, but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of Integrin-linked kinase (ILK) in CMS3. Overall, this study proves utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and dissecting protein-level molecular portraits of human tumors in particular.
Affiliation(s)
- Galina Glazko
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Boris Zybailov
- Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Frank Emmert-Streib
- Computational Medicine and Statistical Learning Laboratory, Tampere University of Technology, Korkeakoulunkatu, Tampere, Finland
- Ancha Baranova
- School of Systems Biology, George Mason University, Manassas, VA, United States of America
- Research Center for Medical Genetics, Moscow, Russia
- Yasir Rahmatallah
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
|
44
|
Understanding Statistical Hypothesis Testing: The Logic of Statistical Inference. Machine Learning and Knowledge Extraction 2019. [DOI: 10.3390/make1030054] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/17/2022]
Abstract
Statistical hypothesis testing is among the most misunderstood quantitative analysis methods from data science. Despite its seeming simplicity, it has complex interdependencies between its procedural components. In this paper, we discuss the underlying logic behind statistical hypothesis testing, the formal meaning of its components and their connections. Our presentation is applicable to all statistical hypothesis tests as generic backbone and, hence, useful across all application domains in data science and artificial intelligence.
|
45
|
Jardim VC, Santos SDS, Fujita A, Buckeridge MS. BioNetStat: A Tool for Biological Networks Differential Analysis. Front Genet 2019; 10:594. [PMID: 31293621 PMCID: PMC6598498 DOI: 10.3389/fgene.2019.00594] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/21/2019] [Accepted: 06/05/2019] [Indexed: 01/25/2023] Open
Abstract
The study of interactions among biological components can be carried out by using methods grounded on network theory. Most of these methods focus on the comparison of two biological networks (e.g., control vs. disease). However, biological systems often present more than two biological states (e.g., tumor grades). To compare two or more networks simultaneously, we developed BioNetStat, a Bioconductor package with a user-friendly graphical interface. BioNetStat compares correlation networks based on the probability distribution of a feature of the graph (e.g., centrality measures). The analysis of the structural alterations on the network reveals significant modifications in the system. For example, the analysis of centrality measures provides information about how the relevance of the nodes changes among the biological states. We evaluated the performance of BioNetStat in both toy models and two case studies, the latter related to gene expression of tumor cells and plant metabolism. Results based on simulated scenarios suggest that the statistical power of BioNetStat is less sensitive to the increase of the number of networks than Gene Set Coexpression Analysis (GSCA). Also, besides being able to identify nodes with modified centralities, BioNetStat identified altered networks associated with signaling pathways that were not identified by other methods.
Affiliation(s)
- Vinícius Carvalho Jardim
- Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil
- Department of Botany, Institute of Biosciences, University of São Paulo, São Paulo, Brazil
- Suzana de Siqueira Santos
- Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil
- Andre Fujita
- Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil
|
46
|
Grimes T, Potter SS, Datta S. Integrating gene regulatory pathways into differential network analysis of gene expression data. Sci Rep 2019; 9:5479. [PMID: 30940863 PMCID: PMC6445151 DOI: 10.1038/s41598-019-41918-3] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 12/19/2018] [Accepted: 03/12/2019] [Indexed: 12/22/2022] Open
Abstract
The advent of next-generation sequencing has introduced new opportunities in analyzing gene expression data. Research in systems biology has taken advantage of these opportunities by gleaning insights into gene regulatory networks through the analysis of gene association networks. Contrasting networks from different populations can reveal the many different roles genes fill, which can lead to new discoveries in gene function. Pathologies can also arise from aberrations in these gene-gene interactions. Exposing these network irregularities provides a new avenue for understanding and treating diseases. A general framework for integrating known gene regulatory pathways into a differential network analysis between two populations is proposed. The framework importantly allows for any gene-gene association measure to be used, and inference is carried out through permutation testing. A simulation study investigates the performance in identifying differentially connected genes when incorporating known pathways, even if the pathway knowledge is partially inaccurate. Another simulation study compares the general framework with four state-of-the-art methods. Two RNA-seq datasets are analyzed to illustrate the use of this framework in practice. In both examples, the analysis reveals genes and pathways that are known to be biologically significant along with potentially novel findings that may be used to motivate future research.
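The abstract above describes inference by permutation testing with any gene-gene association measure. A minimal sketch of that pattern follows, assuming Pearson correlation as the association measure and the summed absolute change in a gene's correlations as the test statistic. Both choices, and the names `connectivity_change` and `permutation_pvalue`, are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def connectivity_change(group_a, group_b, gene):
    """Sum over all other genes of |r_A - r_B| for the given gene.
    Each group is a list of samples; each sample is a list of
    expression values, one per gene."""
    def column(group, j):
        return [sample[j] for sample in group]

    total = 0.0
    for j in range(len(group_a[0])):
        if j == gene:
            continue
        r_a = pearson(column(group_a, gene), column(group_a, j))
        r_b = pearson(column(group_b, gene), column(group_b, j))
        total += abs(r_a - r_b)
    return total


def permutation_pvalue(group_a, group_b, gene, n_perm=200, seed=0):
    """Permutation p-value: shuffle sample labels between the two
    groups and count how often the statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = connectivity_change(group_a, group_b, gene)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        if connectivity_change(shuffled[:n_a], shuffled[n_a:], gene) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because only the sample labels are shuffled, the same loop works unchanged if `pearson` is swapped for another association measure.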
Affiliation(s)
- Tyler Grimes
- University of Florida, Department of Biostatistics, Gainesville, 32611, USA
- S Steven Potter
- University of Cincinnati, Department of Pediatrics, Cincinnati, 45229, USA
- Somnath Datta
- University of Florida, Department of Biostatistics, Gainesville, 32611, USA
|
47
|
Abstract
Gene expression profiling by microarray has been used to uncover molecular variations in many areas. The traditional analysis method to gene expression profiling just focuses on the individual genes, and the interactions among genes are ignored, while genes play their roles not by isolations but by interactions with each other. Consequently, gene-to-gene coexpression analysis emerged as a powerful approach to solve the above problems. Then complementary to the conventional differential expression analysis, the differential coexpression analysis can identify gene markers from the systematic level. There are three aspects for differential coexpression network analysis including the network global topological comparison, differential coexpression module identification, and differential coexpression genes and gene pairs identification. To date, the coexpression network and differential coexpression analysis are widely used in a variety of areas in response to environmental stresses, genetic differences, or disease changes. In this chapter, we reviewed the existing methods for differential coexpression network analysis and discussed the applications to cancer research.
Affiliation(s)
- Bao-Hong Liu
- State Key Laboratory of Veterinary Etiological Biology; Key Laboratory of Veterinary Parasitology of Gansu Province; Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu Province, People's Republic of China; Jiangsu Co-Innovation Center for Prevention and Control of Animal Infectious Diseases and Zoonoses, Yangzhou, People's Republic of China
|
48
|
Chen P, Long B, Xu Y, Wu W, Zhang S. Identification of Crucial Genes and Pathways in Human Arrhythmogenic Right Ventricular Cardiomyopathy by Coexpression Analysis. Front Physiol 2018; 9:1778. [PMID: 30574098 PMCID: PMC6291487 DOI: 10.3389/fphys.2018.01778] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/11/2018] [Accepted: 11/23/2018] [Indexed: 12/19/2022] Open
Abstract
Arrhythmogenic right ventricular cardiomyopathy (ARVC), a common cause of sudden cardiac death in young people, is a disorder of the heart muscle whose progression involves a complicated gene interaction network that influences its diagnosis and prognosis. In this study, differentially expressed genes (DEGs) were screened, and weighted gene coexpression network analysis (WGCNA) and gene set net correlations analysis (GSNCA) were applied to identify crucial genes and pathways related to the pathogenic mechanism of ARVC (n = 12). There were 619 DEGs in total between non-failing donor myocardial samples and ARVC tissues (FDR < 0.05). WGCNA identified two gene modules (brown and turquoise) as most significantly associated with ARVC state. Four key ARVC-related biological pathways (cytokine–cytokine receptor interaction, chemokine signaling pathway, neuroactive ligand receptor interaction, and JAK-STAT signaling pathway) and four hub genes (CXCL2, TNFRSF11B, LIFR, and C5AR1) in ARVC samples were then identified by GSNCA. Finally, t-tests and receiver operating characteristic (ROC) curves were used to validate the hub genes: the t-tests showed significant differences, and all AUC values were greater than 0.8. Together, these results reveal four new hub genes and key pathways that might be involved in ARVC diagnosis. Although further experimental validation is required for these associations, our findings demonstrate that computational methods based on systems biology might complement traditional gene-wide approaches and, as such, might offer new insight into therapeutic intervention in rare human diseases such as ARVC.
Affiliation(s)
- Peipei Chen
- Department of Cardiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Bo Long
- Central Research Laboratory, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yi Xu
- Department of Cardiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Wei Wu
- Department of Cardiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Shuyang Zhang
- Department of Cardiology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
|
49
|
Zhang J. Bioinformatics analysis of novel transcription factors and related differentially regulated modules in non-union skeletal fractures. J Back Musculoskelet Rehabil 2018; 31:623-628. [PMID: 29578472 DOI: 10.3233/bmr-169596] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/04/2023]
Abstract
OBJECTIVE This study aimed to further clarify the underlying pathomechanism of non-union skeletal fractures. METHODS Gene expression profile dataset GSE494 obtained from six non-union skeletal fracture and six normal samples was downloaded from the Gene Expression Omnibus database. Overlapping genes in at least two platforms were analyzed, and differentially expressed genes (DEGs) between normal and disease groups were screened. Transcriptional regulatory relationships and differentially regulated modules of various transcription factors (TFs) were determined. Differentially regulated modules with unknown functions were subjected to functional enrichment analysis. RESULTS Overall, 4,252 overlapping genes in at least two platforms and 77 DEGs, including 31 up and 46 downregulated genes, were obtained. Overall, 64,623 transcriptional regulatory relationships, including 49 TFs and 3,900 target genes, and 9 significant modules for differential regulation were identified. Three modules with unknown functions regulated by TFs, including zinc finger, ZZ-type containing 3 (ZZZ3), nuclear TF Y, alpha (NFYA), and POU class 2 homeobox 2 (POU2F2), were identified. Enriched GO-BP terms of NFYA and POU2F2 modules included cell adhesion and related terms and those of ZZZ3 included cell cycle, cell proliferation, and associated terms. CONCLUSION Three TFs, including ZZZ3, POU2F2, and NFYA, and their regulated modules may have important effects on non-union skeletal fractures. Cell proliferation may be related to ZZZ3; cell adhesion and similar processes may be related to POU2F2 and NFYA.
|
50
|
Wang J, Bi Y, Li J, Tian Y, Yang X, Sun Z. Optimal Function Prediction of Key Aberrant Genes in Early-onset Preeclampsia Using a Modified Network-based Guilt by Association Method. Iranian Journal of Public Health 2018; 47:1688-1693. [PMID: 30581785 PMCID: PMC6294846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
BACKGROUND We aimed to predict the optimal functions of key aberrant genes in early-onset preeclampsia (EOPE) by using a modified network-based gene function inference method. METHODS First, differentially expressed genes (DEGs) were extracted using the linear models for microarray data (LIMMA) package. Then Spearman's rank correlation coefficient was calculated to assess the co-expression strength of each interaction between DEGs, and a co-expression network was constructed on that basis to show how the genes are linked. Subsequently, Gene Ontology (GO) annotations for EOPE were collected from a known, confirmed database and the DEGs. Finally, the multifunctionality algorithm was used to extend the "guilt by association" method based on the co-expression network, and 3-fold cross validation was performed to evaluate the accuracy of the algorithm. RESULTS GO terms with an area under the curve (AUC) over 0.7 were screened as the optimal gene functions for EOPE. Six functions, including ion binding and cellular response to stimulus, were determined as the optimal gene functions. CONCLUSION These findings should help to better understand the pathogenesis of EOPE and provide references for clinical diagnosis and treatment in the future.
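The network-construction step in the abstract above (Spearman correlation between DEGs, then a co-expression network) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the edge threshold of 0.8 and the names `ranks`, `spearman`, and `coexpression_edges` are hypothetical.

```python
import math
from itertools import combinations


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx = sum(rx) / n
    my = sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def coexpression_edges(expression, threshold=0.8):
    """Map each gene pair with |rho| >= threshold to its rho.
    `expression` maps gene name -> list of values across samples."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        rho = spearman(expression[g1], expression[g2])
        if abs(rho) >= threshold:
            edges[(g1, g2)] = rho
    return edges
```

The resulting edge dictionary is the kind of weighted network on which a guilt-by-association predictor could then score candidate GO terms for each gene.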
Affiliation(s)
- Jing WANG (corresponding author)
- Dept. of Obstetrics, Seventh People’s Hospital of Jinan, Jiyang, Shandong, China
- Yanping BI
- Dept. of Obstetrics, Seventh People’s Hospital of Jinan, Jiyang, Shandong, China
- Junxia LI
- Dept. of Obstetrics, Seventh People’s Hospital of Jinan, Jiyang, Shandong, China
- Yanfang TIAN
- Dept. of Obstetrics, Seventh People’s Hospital of Jinan, Jiyang, Shandong, China
- Xue YANG
- Dept. of General Surgery, Seventh People’s Hospital of Jinan, Jiyang, Shandong, China
- Zhongfang SUN
- Dept. of Obstetrics, Seventh People’s Hospital of Jinan, Jiyang, Shandong, China
|