1. Lui VG, Ghosh T, Rymaszewski A, Chen S, Baxter RM, Kong DS, Ghosh D, Routes JM, Verbsky JW, Hsieh EWY. Dysregulated Lymphocyte Antigen Receptor Signaling in Common Variable Immunodeficiency with Granulomatous Lymphocytic Interstitial Lung Disease. J Clin Immunol 2023; 43:1311-1325. [PMID: 37093407 PMCID: PMC10524976 DOI: 10.1007/s10875-023-01485-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/29/2022] [Accepted: 04/04/2023] [Indexed: 04/25/2023]
Abstract
PURPOSE A subset of common variable immunodeficiency (CVID) patients either presents with or develops autoimmune and lymphoproliferative complications, such as granulomatous lymphocytic interstitial lung disease (GLILD), a major cause of morbidity and mortality in CVID. While a myriad of phenotypic lymphocyte derangements has been associated with and described in GLILD, defects in T and B cell antigen receptor (TCR/BCR) signaling in CVID and CVID with GLILD (CVID/GLILD) remain undefined, hindering discovery of biomarkers for disease monitoring, prognostic prediction, and personalized medicine approaches. METHODS To identify perturbations of immune cell subsets and TCR/BCR signal transduction, we applied mass cytometry analysis to peripheral blood mononuclear cells (PBMCs) from healthy control participants (HC), CVID, and CVID/GLILD patients. RESULTS Patients with CVID, regardless of GLILD status, had increased frequency of HLADR+CD4+ T cells, CD57+CD8+ T cells, and CD21lo B cells when compared to healthy controls. Within these cellular populations in CVID/GLILD patients only, engagement of T or B cell antigen receptors resulted in discordant downstream signaling responses compared to CVID. In CVID/GLILD patients, CD21lo B cells showed perturbed BCR-mediated phospholipase C gamma and extracellular signal-regulated kinase activation, while HLADR+CD4+ T cells and CD57+CD8+ T cells displayed disrupted TCR-mediated activation of kinases most proximal to the receptor. CONCLUSION Both CVID and CVID/GLILD patients demonstrate an activated T and B cell phenotype compared to HC. However, only CVID/GLILD patients exhibit altered TCR/BCR signaling in the activated lymphocyte subsets. These findings contribute to our understanding of the mechanisms of immune dysregulation in CVID with GLILD.
Affiliation(s)
- Victor G Lui
  - Department of Immunology and Microbiology, School of Medicine, University of Colorado, 12800 East 19th Ave, Mail Stop 8333, RC1 North P18-8117, Aurora, CO, 80045, USA
- Tusharkanti Ghosh
  - Department of Biostatistics and Informatics, School of Public Health, University of Colorado, Aurora, CO, USA
- Amy Rymaszewski
  - Division of Allergy and Clinical Immunology, Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI, USA
- Shaoying Chen
  - Division of Rheumatology, Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI, USA
  - Division of Asthma, Allergy, and Clinical Immunology, Medical College of Wisconsin, Milwaukee, WI, USA
- Ryan M Baxter
  - Department of Immunology and Microbiology, School of Medicine, University of Colorado, 12800 East 19th Ave, Mail Stop 8333, RC1 North P18-8117, Aurora, CO, 80045, USA
- Daniel S Kong
  - Department of Immunology and Microbiology, School of Medicine, University of Colorado, 12800 East 19th Ave, Mail Stop 8333, RC1 North P18-8117, Aurora, CO, 80045, USA
- Debashis Ghosh
  - Department of Biostatistics and Informatics, School of Public Health, University of Colorado, Aurora, CO, USA
- John M Routes
  - Division of Allergy and Clinical Immunology, Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI, USA
  - Children's Research Institute, Medical College of Wisconsin, Milwaukee, WI, USA
- James W Verbsky
  - Division of Rheumatology, Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI, USA
  - Children's Research Institute, Medical College of Wisconsin, Milwaukee, WI, USA
- Elena W Y Hsieh
  - Department of Immunology and Microbiology, School of Medicine, University of Colorado, 12800 East 19th Ave, Mail Stop 8333, RC1 North P18-8117, Aurora, CO, 80045, USA
  - Department of Pediatrics, Section of Allergy and Immunology, School of Medicine, University of Colorado, Aurora, CO, USA
  - Children's Hospital Colorado, Aurora, CO, USA
2. Bahadoor A, Robinson KA, Loewen MC, Demissie ZA. Clonostachys rosea 'omics profiling: identification of putative metabolite-gene associations mediating its in vitro antagonism against Fusarium graminearum. BMC Genomics 2023; 24:352. [PMID: 37365507 DOI: 10.1186/s12864-023-09463-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 06/17/2023] [Indexed: 06/28/2023] Open
Abstract
BACKGROUND Clonostachys rosea is an established biocontrol agent. Selected strains have either mycoparasitic activity against known pathogens (e.g. Fusarium species) and/or plant growth promoting activity on various crops. Here we report outcomes from a comparative 'omics analysis leveraging a temporal variation in the in vitro antagonistic activities of C. rosea strains ACM941 and 88-710, toward understanding the molecular mechanisms underpinning mycoparasitism. RESULTS Transcriptomic data highlighted specialized metabolism and membrane transport related genes as being significantly upregulated in ACM941 compared to 88-710 at a time point when the ACM941 strain had higher in vitro antagonistic activity than 88-710. In addition, high molecular weight specialized metabolites were differentially secreted by ACM941, with accumulation patterns of some metabolites matching the growth inhibition differences displayed by the exometabolites of the two strains. In an attempt to identify statistically relevant relationships between upregulated genes and differentially secreted metabolites, transcript and metabolomic abundance data were associated using IntLIM (Integration through Linear Modeling). Of several testable candidate associations, a putative C. rosea epidithiodiketopiperazine (ETP) gene cluster was identified as a prime candidate based on both co-regulation analysis and transcriptomic-metabolomic data association. CONCLUSIONS Although remaining to be validated functionally, these results suggest that a data integration approach may be useful for identification of potential biomarkers underlying functional divergence in C. rosea strains.
Affiliation(s)
- Adilah Bahadoor
  - Metrology Research Center, National Research Council Canada, 1200 Montreal Rd, Ottawa, ON, K1A 0R6, Canada
- Kelly A Robinson
  - Aquatic and Crop Resource Development, National Research Council of Canada, Ottawa, ON, Canada
- Michele C Loewen
  - Aquatic and Crop Resource Development, National Research Council of Canada, Ottawa, ON, Canada
- Zerihun A Demissie
  - Aquatic and Crop Resource Development, National Research Council of Canada, Ottawa, ON, Canada
3. Duan M, Liu Y, Zhao D, Li H, Zhang G, Liu H, Wang Y, Fan Y, Huang L, Zhou F. Gender-specific dysregulations of nondifferentially expressed biomarkers of metastatic colon cancer. Comput Biol Chem 2023; 104:107858. [PMID: 37058814 DOI: 10.1016/j.compbiolchem.2023.107858] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/20/2023] [Revised: 03/12/2023] [Accepted: 03/29/2023] [Indexed: 04/16/2023]
Abstract
Colon cancer is a common cancer type in both sexes and its mortality rate increases at the metastatic stage. Most studies exclude nondifferentially expressed genes from biomarker analysis of metastatic colon cancers. The motivation of this study is to find the latent associations of the nondifferentially expressed genes with metastatic colon cancers and to evaluate the gender specificity of such associations. This study formulates the expression level prediction of a gene as a regression model trained for primary colon cancers. The difference between a gene's predicted and original expression levels in a testing sample is defined as its mqTrans value (model-based quantitative measure of transcription regulation), which quantitatively measures the change of the gene's transcription regulation in this testing sample. We use the mqTrans analysis to detect the messenger RNA (mRNA) genes with nondifferential expression on their original expression levels but differentially expressed mqTrans values between primary and metastatic colon cancers. These genes are referred to as dark biomarkers of metastatic colon cancer. All dark biomarker genes were verified by two transcriptome profiling technologies, RNA-seq and microarray. The mqTrans analysis of a mixed cohort of both sexes could not recover gender-specific dark biomarkers. Most dark biomarkers overlap with long non-coding RNAs (lncRNAs), and these lncRNAs might have contributed their transcripts to calculating the dark biomarkers' expression levels. Therefore, mqTrans analysis serves as a complementary approach to identify dark biomarkers generally ignored by conventional studies, and it is essential to separate the female and male samples into two analysis experiments. The dataset and mqTrans analysis code are available at https://figshare.com/articles/dataset/22250536.
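The mqTrans idea described in this abstract can be illustrated with a minimal sketch. It assumes a single hypothetical regulator as the only predictor and a plain least-squares line as the regression model; the abstract does not specify the predictors, the model family, or the sign convention, so the names `fit_line` and `mqtrans` and the "observed minus predicted" convention are illustrative assumptions rather than the authors' code.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def mqtrans(model, regulator_level, observed_level):
    """Observed minus predicted expression (sign convention is an assumption)."""
    intercept, slope = model
    return observed_level - (intercept + slope * regulator_level)


# Hypothetical primary-tumor training data: regulator vs. target expression.
primary_regulator = [1.0, 2.0, 3.0, 4.0]
primary_target = [2.0, 4.0, 6.0, 8.0]
model = fit_line(primary_regulator, primary_target)

# Hypothetical metastatic sample whose target is lower than the model predicts.
value = mqtrans(model, regulator_level=5.0, observed_level=7.0)
```

A gene can therefore show an unremarkable raw expression level while its mqTrans value shifts, which is the sense in which the abstract calls such genes dark biomarkers.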
Affiliation(s)
- Meiyu Duan
  - College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China; School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, Guizhou, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Yaqing Liu
  - College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Dong Zhao
  - School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, Guizhou, China
- Haijun Li
  - School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, Guizhou, China
- Gongyou Zhang
  - School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, Guizhou, China
- Hongmei Liu
  - School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, Guizhou, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China; Engineering Research Center of Medical Biotechnology, Guizhou Medical University, Guiyang 550025, Guizhou, China
- Yueying Wang
  - College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Yusi Fan
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China; College of Software, Jilin University, Changchun, Jilin 130012, China
- Lan Huang
  - College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Fengfeng Zhou
  - College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012, China; School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, Guizhou, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
4. Arevalillo JM, Martin-Arevalillo R. Patterns of differential expression by association in omic data using a new measure based on ensemble learning. Stat Appl Genet Mol Biol 2023; 22:sagmb-2023-0009. [PMID: 37991399 DOI: 10.1515/sagmb-2023-0009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 08/04/2023] [Indexed: 11/23/2023]
Abstract
The ongoing development of high-throughput technologies is allowing the simultaneous monitoring of the expression levels for hundreds or thousands of biological inputs with the proliferation of what has been coined as omic data sources. One relevant issue when analyzing such data sources is concerned with the detection of differential expression across two experimental conditions, clinical status or two classes of a biological outcome. While a great deal of univariate data analysis approaches have been developed to address the issue, strategies for assessing interaction patterns of differential expression are scarce in the literature and have been limited to ad hoc solutions. This paper contributes to the problem by exploiting the facilities of an ensemble learning algorithm like random forests to propose a measure that assesses the differential expression explained by the interaction of the omic variables so subtle biological patterns may be uncovered as a result. The out of bag error rate, which is an estimate of the predictive accuracy of a random forests classifier, is used as a by-product to propose a new measure that assesses interaction patterns of differential expression. Its performance is studied in synthetic scenarios and it is also applied to real studies on SARS-CoV-2 and colon cancer data where it uncovers associations that remain undetected by other methods. Our proposal is aimed at providing a novel approach that may help the experts in biomedical and life sciences to unravel insightful interaction patterns that may decipher the molecular mechanisms underlying biological and clinical outcomes.
Affiliation(s)
- Jorge M Arevalillo
  - UC3M-Santander Big Data Institute, Madrid Street 135, 28903, Getafe, Madrid, Spain
  - Department of Statistics and Operational Research, UNED, Juan del Rosal 10, 28040, Madrid, Spain
- Raquel Martin-Arevalillo
  - Laboratoire de Reproduction et Développement des Plantes, Ecole Normale Superieure de Lyon, 46, allée d'Italie, 69007, Lyon, Auvergne-Rhone-Alpes, France
5. Yu H, Wang L, Chen D, Li J, Guo Y. Conditional transcriptional relationships may serve as cancer prognostic markers. BMC Med Genomics 2021; 14:101. [PMID: 34856998 PMCID: PMC8638091 DOI: 10.1186/s12920-021-00958-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/28/2021] [Accepted: 04/08/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND While most differential coexpression (DC) methods are bound to quantify a single correlation value for a gene pair across multiple samples, a newly devised approach under the name Correlation by Individual Level Product (CILP) revolutionarily projects the summary correlation value to individual product correlation values for separate samples. CILP greatly widened DC analysis opportunities by allowing integration of non-compromised statistical methods. METHODS Here, we performed a study to verify our hypothesis that conditional relationships, i.e., gene pairs of remarkable differential coexpression, may be sought as quantitative prognostic markers for human cancers. Alongside the seeking of prognostic gene links in a pan-cancer setting, we also examined whether a trend of global expression correlation loss appeared in a wide panel of cancer types and revisited the controversial subject of mutual relationship between the DE approach and the DC approach. RESULTS By integrating CILP with classical univariate survival analysis, we identified up to 244 conditional gene links as potential prognostic markers in five cancer types. In particular, five prognostic gene links for kidney renal papillary cell carcinoma tended to condense around cancer gene ESPL1, and the transcriptional synchrony between ESPL1 and PTTG1 tended to be elevated in patients of adverse prognosis. In addition, we extended the observation of global trend of correlation loss in more than ten cancer types and empirically proved DC analysis results were independent of gene differential expression in five cancer types. CONCLUSIONS Combining the power of CILP and the classical survival analysis, we successfully fetched conditional transcriptional relationships that conferred prognosis power for five cancer types. 
Despite a general trend of global correlation loss in tumor transcriptomes, most of these prognosis conditional links demonstrated stronger expression correlation in tumors, and their stronger coexpression was associated with poor survival.
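One way to read the "individual level product" idea in this abstract is sketched below. It assumes that each sample's value is the product of the two genes' standardized expression levels, so that averaging the per-sample products recovers the ordinary Pearson correlation. This is an interpretation of the abstract, not the authors' implementation, and the function names and data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def individual_products(gene_a, gene_b):
    """One product value per sample; their mean is the Pearson correlation."""
    za, zb = standardize(gene_a), standardize(gene_b)
    return [a * b for a, b in zip(za, zb)]


# Hypothetical expression values for two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 8.0, 9.8]

products = individual_products(gene_a, gene_b)  # per-sample values
summary_r = mean(products)                      # single summary correlation
```

Because every sample carries its own value, the products can be passed as an ordinary covariate to a per-patient method such as the univariate survival analysis mentioned in the abstract.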
Affiliation(s)
- Hui Yu
  - Department of Internal Medicine, University of New Mexico, Albuquerque, NM, 87131, USA
- Limei Wang
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, Hainan, 571199, China; College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, 150001, Heilongjiang, China
- Danqian Chen
  - Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi'an, 710069, Shaanxi, China
- Jin Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, Hainan, 571199, China
- Yan Guo
  - Department of Internal Medicine, University of New Mexico, Albuquerque, NM, 87131, USA
6. Arbet J, Zhuang Y, Litkowski E, Saba L, Kechris K. Comparing Statistical Tests for Differential Network Analysis of Gene Modules. Front Genet 2021; 12:630215. [PMID: 34093641 PMCID: PMC8170128 DOI: 10.3389/fgene.2021.630215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Accepted: 04/19/2021] [Indexed: 11/13/2022] Open
Abstract
Genes often work together to perform complex biological processes, and "networks" provide a versatile framework for representing the interactions between multiple genes. Differential network analysis (DiNA) quantifies how this network structure differs between two or more groups/phenotypes (e.g., disease subjects and healthy controls), with the goal of determining whether differences in network structure can help explain differences between phenotypes. In this paper, we focus on gene co-expression networks, although in principle, the methods studied can be used for DiNA for other types of features (e.g., metabolome, epigenome, microbiome, proteome, etc.). Three common applications of DiNA involve (1) testing whether the connections to a single gene differ between groups, (2) testing whether the connection between a pair of genes differs between groups, or (3) testing whether the connections within a "module" (a subset of 3 or more genes) differ between groups. This article focuses on the third application, as there is a lack of studies comparing statistical methods for identifying differentially co-expressed modules (DCMs). Through extensive simulations, we compare several previously proposed test statistics and a new p-norm difference test (PND). We demonstrate that the true positive rate of the proposed PND test is competitive with and often higher than the other methods, while controlling the false positive rate. The R package discoMod (differentially co-expressed modules) implements the proposed method and provides a full pipeline for identifying DCMs: clustering tools to derive gene modules, tests to identify DCMs, and methods for visualizing the results.
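A module-level test of this kind can be sketched as follows. The abstract does not define the PND statistic, so the sketch assumes it is the p-norm of the pairwise differences between the two groups' within-module correlations, with significance assessed by permuting group labels. The function names, the choice of Pearson correlation, and the data are hypothetical, and none of this is the discoMod API.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pnorm_difference(group1, group2, p=2):
    """Each group is a list of samples; each sample lists the module's genes."""
    total = 0.0
    for i, j in combinations(range(len(group1[0])), 2):
        r1 = pearson([s[i] for s in group1], [s[j] for s in group1])
        r2 = pearson([s[i] for s in group2], [s[j] for s in group2])
        total += abs(r1 - r2) ** p
    return total ** (1.0 / p)


def permutation_p_value(group1, group2, p=2, n_perm=200, seed=0):
    """Share of label permutations with a statistic at least as large."""
    rng = random.Random(seed)
    observed = pnorm_difference(group1, group2, p)
    pooled = group1 + group2
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = pnorm_difference(pooled[:len(group1)], pooled[len(group1):], p)
        hits += stat >= observed
    return (hits + 1) / (n_perm + 1)


# Hypothetical 3-gene module: co-expressed in group1, rewired in group2.
group1 = [[k, k + 0.1 * (k % 2), k - 0.1 * (k % 3)] for k in range(1, 9)]
group2 = [[k, -k + 0.1 * (k % 2), (k * 3) % 5] for k in range(1, 9)]

observed = pnorm_difference(group1, group2)
p_value = permutation_p_value(group1, group2)
```

Summarizing all pairwise differences in a single norm gives one test per module rather than one per gene pair, which matches the abstract's focus on the third application.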
Affiliation(s)
- Jaron Arbet
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Yaxu Zhuang
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Elizabeth Litkowski
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Laura Saba
  - Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Katerina Kechris
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
7. Rps27a might act as a controller of microglia activation in triggering neurodegenerative diseases. PLoS One 2020; 15:e0239219. [PMID: 32941527 PMCID: PMC7498011 DOI: 10.1371/journal.pone.0239219] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/21/2020] [Accepted: 09/01/2020] [Indexed: 01/10/2023] Open
Abstract
Neurodegenerative diseases (NDDs) have become an increasingly serious menace to human health in recent years. Despite exhibiting different clinical phenotypes and selective neuronal loss, these disorders share certain common features, suggesting the presence of commonly dysregulated pathways. Identifying causal genes and dysregulated pathways can be helpful in providing effective treatment for these diseases. Interestingly, in spite of the considerable research on NDDs, to the best of our knowledge, no dysregulated genes and/or pathways have so far been reported in common across all the major NDDs. In this study, for the first time, we have applied the three-way interaction model, as an approach to unravel sophisticated gene interactions, to trace switch genes and significant pathways that are involved in six major NDDs. Subsequently, a gene regulatory network was constructed to investigate the regulatory communication of statistically significant triplets. Finally, KEGG pathway enrichment analysis was applied to find possible common pathways. Because of the central role of neuroinflammation and immune system responses in both pathogenic and protective mechanisms in the NDDs, we focused on immune genes in this study. Our results suggest that the "cytokine-cytokine receptor interaction" pathway is enriched in all six of the studied NDDs, while the "osteoclast differentiation" and "natural killer cell mediated cytotoxicity" pathways are each enriched in five of the NDDs. Additionally, our analysis showed that Rps27a, as a switch gene, together with the gene pair {Il-18, Cx3cl1}, forms a statistically significant and biologically relevant triplet in the major NDDs. More specifically, we suggest that Cx3cl1 might act as a potential upstream regulator of Il-18 in microglia activation and, in turn, might be controlled by Rps27a in triggering NDDs.
8. Chowdhury HA, Bhattacharyya DK, Kalita JK. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:1154-1173. [PMID: 30668502 DOI: 10.1109/tcbb.2019.2893170] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 06/09/2023]
Abstract
Analysis of gene expression data is widely used in transcriptomic studies to understand functions of molecules inside a cell and interactions among molecules. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review the best practices in gene expression data analysis in terms of analysis of (differential) co-expression, co-expression network, differential networking, and differential connectivity considering both microarray and RNA-seq data along with comparisons. We highlight hurdles in RNA-seq data analysis using methods developed for microarrays. We include discussion of necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by including preprocessing and scRNA-seq in co-expression analysis along with useful tools specific to scRNA-seq. To get insights, biological interpretation and functional profiling is included. Finally, we provide guidelines for the analyst, along with research issues and challenges which should be addressed.
9. Bhuva DD, Cursons J, Smyth GK, Davis MJ. Differential co-expression-based detection of conditional relationships in transcriptional data: comparative analysis and application to breast cancer. Genome Biol 2019; 20:236. [PMID: 31727119 PMCID: PMC6857226 DOI: 10.1186/s13059-019-1851-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 06/04/2019] [Accepted: 10/02/2019] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND Elucidation of regulatory networks, including identification of regulatory mechanisms specific to a given biological context, is a key aim in systems biology. This has motivated the move from co-expression to differential co-expression analysis and numerous methods have been developed subsequently to address this task; however, evaluation of methods and interpretation of the resulting networks has been hindered by the lack of known context-specific regulatory interactions. RESULTS In this study, we develop a simulator based on dynamical systems modelling capable of simulating differential co-expression patterns. With the simulator and an evaluation framework, we benchmark and characterise the performance of inference methods. Defining three different levels of "true" networks for each simulation, we show that accurate inference of causation is difficult for all methods, compared to inference of associations. We show that a z-score-based method has the best general performance. Further, analysis of simulation parameters reveals five network and simulation properties that explained the performance of methods. The evaluation framework and inference methods used in this study are available in the dcanr R/Bioconductor package. CONCLUSIONS Our analysis of networks inferred from simulated data shows that hub nodes are more likely to be differentially regulated targets than transcription factors. Based on this observation, we propose an interpretation of the inferred differential network that can reconstruct a putative causal network.
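The abstract does not spell out the z-score-based method. A minimal sketch of the standard approach that name usually refers to is given below, assuming Pearson correlations compared through the Fisher transformation. It is a generic illustration with hypothetical inputs, not the dcanr implementation.

```python
import math


def diff_coexpression_z(r1, n1, r2, n2):
    """z-score for the difference between two correlations (Fisher transform)."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


def two_sided_p(z):
    """Two-sided p-value under a standard normal null distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical gene pair: strongly co-expressed in condition 1 (r = 0.8),
# nearly uncorrelated in condition 2 (r = 0.1), with 50 samples in each.
z = diff_coexpression_z(0.8, 50, 0.1, 50)
p = two_sided_p(z)
```

Applying the same calculation to every gene pair and thresholding the p-values after multiple-testing correction yields a differential co-expression network.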
Affiliation(s)
- Dharmesh D Bhuva
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, VIC, 3010, Australia
- Joseph Cursons
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
- Gordon K Smyth
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, VIC, 3010, Australia
- Melissa J Davis
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia; Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia; Department of Clinical Pathology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, VIC, 3010, Australia
10. Siddiqui JK, Baskin E, Liu M, Cantemir-Stone CZ, Zhang B, Bonneville R, McElroy JP, Coombes KR, Mathé EA. IntLIM: integration using linear models of metabolomics and gene expression data. BMC Bioinformatics 2018; 19:81. [PMID: 29506475 PMCID: PMC5838881 DOI: 10.1186/s12859-018-2085-6] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 11/21/2017] [Accepted: 02/21/2018] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Integration of transcriptomic and metabolomic data improves functional interpretation of disease-related metabolomic phenotypes, and facilitates discovery of putative metabolite biomarkers and gene targets. For this reason, these data are increasingly collected in large (> 100 participants) cohorts, thereby driving a need for the development of user-friendly and open-source methods/tools for their integration. Of note, clinical/translational studies typically provide snapshot (e.g. one time point) gene and metabolite profiles and, oftentimes, most metabolites measured are not identified. Thus, in these types of studies, pathway/network approaches that take into account the complexity of transcript-metabolite relationships may neither be applicable nor readily uncover novel relationships. With this in mind, we propose a simple linear modeling approach to capture disease-(or other phenotype) specific gene-metabolite associations, with the assumption that co-regulation patterns reflect functionally related genes and metabolites. RESULTS The proposed linear model, metabolite ~ gene + phenotype + gene:phenotype, specifically evaluates whether gene-metabolite relationships differ by phenotype, by testing whether the relationship in one phenotype is significantly different from the relationship in another phenotype (via a statistical interaction gene:phenotype p-value). Statistical interaction p-values for all possible gene-metabolite pairs are computed and significant pairs are then clustered by the directionality of associations (e.g. strong positive association in one phenotype, strong negative association in another phenotype). We implemented our approach as an R package, IntLIM, which includes a user-friendly R Shiny web interface, thereby making the integrative analyses accessible to non-computational experts. 
We applied IntLIM to two previously published datasets, collected in the NCI-60 cancer cell lines and in human breast tumor and non-tumor tissue, for which transcriptomic and metabolomic data are available. We demonstrate that IntLIM captures relevant tumor-specific gene-metabolite associations involved in known cancer-related pathways, including glutamine metabolism. Using IntLIM, we also uncover biologically relevant novel relationships that could be further tested experimentally. CONCLUSIONS IntLIM provides a user-friendly, reproducible framework to integrate transcriptomic and metabolomic data and help interpret metabolomic data and uncover novel gene-metabolite relationships. The IntLIM R package is publicly available in GitHub ( https://github.com/mathelab/IntLIM ) and includes a user-friendly web application, vignettes, sample data and data/code to reproduce results.
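The interaction term in the model `metabolite ~ gene + phenotype + gene:phenotype` can be illustrated with a short sketch. It relies on a general property of least squares: when the phenotype is binary, the fitted interaction coefficient equals the difference between the gene-metabolite slopes estimated separately in each phenotype. The function names and data are hypothetical, the p-value computation is omitted, and this is not the IntLIM R package API.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def interaction_coefficient(gene, metabolite, phenotype):
    """gene:phenotype coefficient for a binary (0/1) phenotype."""
    slopes = {}
    for label in (0, 1):
        g = [gv for gv, ph in zip(gene, phenotype) if ph == label]
        m = [mv for mv, ph in zip(metabolite, phenotype) if ph == label]
        slopes[label] = slope(g, m)
    return slopes[1] - slopes[0]


# Hypothetical pair: positive association in phenotype 0, negative in 1.
gene = [1, 2, 3, 4, 1, 2, 3, 4]
metabolite = [2, 4, 6, 8, 9, 8, 7, 6]
phenotype = [0, 0, 0, 0, 1, 1, 1, 1]

beta_interaction = interaction_coefficient(gene, metabolite, phenotype)
```

A coefficient far from zero, as in this toy pair, flags a gene-metabolite relationship that changes direction or strength between phenotypes, which is the pattern the abstract describes clustering by directionality.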
Affiliation(s)
- Jalal K Siddiqui
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Elizabeth Baskin
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Mingrui Liu
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Carmen Z Cantemir-Stone
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Bofei Zhang
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
  - Biomedical Engineering Undergraduate Program, The Ohio State University, Columbus, OH, 43210, USA
- Russell Bonneville
  - Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH, USA
  - Comprehensive Cancer Center, Department of Internal Medicine, The Ohio State University, Columbus, OH, USA
- Joseph P McElroy
  - Center for Biostatistics, The Ohio State University, Columbus, OH, USA
- Kevin R Coombes
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
- Ewy A Mathé
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, USA
11
Kayano M, Higaki S, Satoh JI, Matsumoto K, Matsubara E, Takikawa O, Niida S. Plasma microRNA biomarker detection for mild cognitive impairment using differential correlation analysis. Biomark Res 2016; 4:22. [PMID: 27999671 PMCID: PMC5151129 DOI: 10.1186/s40364-016-0076-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 09/27/2016] [Accepted: 11/22/2016] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND Mild cognitive impairment (MCI) is an intermediate state between normal aging and dementia, including Alzheimer's disease. Early detection of dementia and MCI is crucial for secondary prevention. Blood biomarker detection is one possible route to early detection of MCI. Although disease biomarkers are generally detected by single-molecule analysis such as the t-test, another possible approach is based on interactions between molecules. RESULTS Differential correlation analysis, which detects differences in the correlation of two variables between cases and controls, was applied to plasma microRNA (miRNA) expression profiles of 30 age- and race-matched controls and 23 Japanese MCI patients. Twenty miRNA pairs, comprising 20 miRNAs, were selected as MCI markers. Two of the 20 pairs (hsa-miR-191 and hsa-miR-101, and hsa-miR-103 and hsa-miR-222) attained the highest area under the curve (AUC) value of 0.962 for MCI detection. Two other miRNA pairs that include hsa-miR-191 and hsa-miR-125b also attained high AUC values of ≥ 0.95. Pathway analysis was performed on the MCI markers to further understand their biological implications. The results suggest that the collapsed correlation involving hsa-miR-191 and the emergent correlation involving hsa-miR-125b might play key roles in MCI and dementia progression. CONCLUSION Differential correlation analysis, a bioinformatics tool to elucidate the complicated and interdependent biological systems behind diseases, detects effective MCI markers that cannot be found by single-molecule analysis such as the t-test.
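Differential correlation between a case group and a control group can be tested in several ways, and the abstract does not say which statistic the authors used. One common choice, sketched here as an assumption, is to compare Fisher z-transformed Pearson correlations. The function names and the toy expression values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def diff_corr_test(x_ctrl, y_ctrl, x_case, y_case):
    """Fisher z test for a difference in correlation between two groups.

    Returns (r_control, r_case, z statistic, two-sided normal p-value).
    Each group needs more than 3 samples and |r| < 1.
    """
    r_ctrl = pearson(x_ctrl, y_ctrl)
    r_case = pearson(x_case, y_case)
    se = math.sqrt(1.0 / (len(x_ctrl) - 3) + 1.0 / (len(x_case) - 3))
    z = (math.atanh(r_case) - math.atanh(r_ctrl)) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # 2 * (1 - Phi(|z|))
    return r_ctrl, r_case, z, p


# Hypothetical toy data: the pair is strongly correlated in controls
# and the correlation collapses in cases.
mirna_a = [1, 2, 3, 4, 5, 6]
mirna_b_ctrl = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1]
mirna_b_case = [3, 5, 2, 6, 1, 4]

r_ctrl, r_case, z, p = diff_corr_test(mirna_a, mirna_b_ctrl, mirna_a, mirna_b_case)
print(r_ctrl, r_case, z, p)
```

With samples this small the normal approximation is rough, so a permutation test may be a more defensible way to obtain the p-value in practice.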
Affiliation(s)
- Mitsunori Kayano
  - Research Center for Global Agromedicine, Obihiro University of Agriculture and Veterinary Medicine, Obihiro, Hokkaido, Japan
  - Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, Japan
- Sayuri Higaki
  - Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, Japan
- Jun-ichi Satoh
  - Department of Bioinformatics and Molecular Neuropathology, Meiji Pharmaceutical University, Kiyose, Tokyo, Japan
- Kenji Matsumoto
  - Department of Allergy and Clinical Immunology, National Center for Child Health and Development, Setagaya, Tokyo, Japan
- Etsuro Matsubara
  - Department of Neurology, Hirosaki University Graduate School of Medicine, Hirosaki, Aomori, Japan
  - Department of Neurology, Oita University Faculty of Medicine, Yufu, Oita, Japan
- Osamu Takikawa
  - Innovation Center for Clinical Research, National Center for Geriatrics and Gerontology, Obu, Aichi, Japan
- Shumpei Niida
  - Medical Genome Center, National Center for Geriatrics and Gerontology, Obu, Aichi, Japan
12
Wang D, Wang J, Jiang Y, Liang Y, Xu D. BFDCA: A Comprehensive Tool of Using Bayes Factor for Differential Co-Expression Analysis. J Mol Biol 2016; 429:446-453. [PMID: 27984044 DOI: 10.1016/j.jmb.2016.10.030] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/01/2016] [Revised: 10/22/2016] [Accepted: 10/23/2016] [Indexed: 10/20/2022]
Abstract
Comparing the gene-expression profiles between biological conditions is useful for understanding gene regulation underlying complex phenotypes. Along this line, analysis of differential co-expression (DC), where genes under one condition have different co-expression patterns compared with another, has gained attention in recent years. We developed an R package, Bayes Factor approach for Differential Co-expression Analysis (BFDCA), for DC analysis. BFDCA is unique in integrating various aspects of DC patterns (including Shift, Cross, and Re-wiring) into one uniform Bayes factor. We tested BFDCA using simulation data and experimental data. Simulation results indicate that BFDCA outperforms existing methods in accuracy and robustness of detecting DC pairs and DC modules. Results of using experimental data suggest that BFDCA can cluster disease-related genes into functional DC subunits and estimate the regulatory impact of disease-related genes well. BFDCA also achieves high accuracy in predicting case-control phenotypes by using significant DC gene pairs as markers. BFDCA is publicly available at http://dx.doi.org/10.17632/jdz4vtvnm3.1.
Affiliation(s)
- Duolin Wang
  - College of Computer Science and Technology, Jilin University, Changchun, China 130012
  - Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Juexin Wang
  - Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Yuexu Jiang
  - College of Computer Science and Technology, Jilin University, Changchun, China 130012
  - Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Yanchun Liang
  - College of Computer Science and Technology, Jilin University, Changchun, China 130012
  - Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Dong Xu
  - College of Computer Science and Technology, Jilin University, Changchun, China 130012
  - Department of Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
13
Padayachee T, Khamiakova T, Shkedy Z, Perola M, Salo P, Burzykowski T. The Detection of Metabolite-Mediated Gene Module Co-Expression Using Multivariate Linear Models. PLoS One 2016; 11:e0150257. [PMID: 26918614 PMCID: PMC4769021 DOI: 10.1371/journal.pone.0150257] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 06/11/2015] [Accepted: 02/11/2016] [Indexed: 12/29/2022] Open
Abstract
Investigating whether metabolites regulate the co-expression of a predefined gene module is one of the relevant questions posed in the integrative analysis of metabolomic and transcriptomic data. This article concerns the integrative analysis of the two high-dimensional datasets by means of multivariate models and statistical tests for the dependence between metabolites and the co-expression of a gene module. The general linear model (GLM) for correlated data that we propose models the dependence between adjusted gene expression values through a block-diagonal variance-covariance structure formed by metabolic-subset specific general variance-covariance blocks. The performance of statistical tests for the inference of conditional co-expression is evaluated through a simulation study. The proposed methodology is applied to the gene expression data of the previously characterized lipid-leukocyte module. Our results show that the GLM approach improves on a previous approach by being less prone to the detection of spurious conditional co-expression.
Affiliation(s)
- Trishanta Padayachee
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat), Hasselt University, Diepenbeek, Belgium
- Tatsiana Khamiakova
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat), Hasselt University, Diepenbeek, Belgium
- Ziv Shkedy
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat), Hasselt University, Diepenbeek, Belgium
- Markus Perola
  - Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Finland
- Perttu Salo
  - Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Finland
- Tomasz Burzykowski
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-Biostat), Hasselt University, Diepenbeek, Belgium
14
Siska C, Bowler R, Kechris K. The discordant method: a novel approach for differential correlation. Bioinformatics 2015; 32:690-6. [PMID: 26520855 DOI: 10.1093/bioinformatics/btv633] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 05/12/2015] [Accepted: 10/24/2015] [Indexed: 11/12/2022]
Abstract
MOTIVATION Current differential correlation methods are designed to determine molecular feature pairs that have the largest magnitude of difference between correlation coefficients. These methods do not easily capture molecular feature pairs that experience no correlation in one group but correlation in another, which may reflect certain types of biological interactions. We have developed a tool, the Discordant method, which categorizes the correlation types for each group to make this possible. RESULTS We compare the Discordant method to existing approaches using simulations and two biological datasets with different types of -omics data. In contrast to other methods, Discordant identifies phenotype-related features at a similar or higher rate while maintaining reasonable computational tractability and usability. AVAILABILITY AND IMPLEMENTATION R code and sample data are available at https://github.com/siskac/discordant CONTACT katerina.kechris@ucdenver.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
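The idea of categorizing the correlation type within each group, rather than ranking pairs by the size of the correlation difference, can be illustrated with a deliberately simplified sketch. The abstract does not describe how the categories are assigned, so the fixed cutoff used here is a stand-in assumption and not the Discordant method itself. The function names, the cutoff of 0.5, and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_class(r, cutoff=0.5):
    """Label a correlation as '+', '-' or '0' using a fixed cutoff (an assumption)."""
    if r >= cutoff:
        return "+"
    if r <= -cutoff:
        return "-"
    return "0"


def classify_pair(x_a, y_a, x_b, y_b, cutoff=0.5):
    """Return (class in group A, class in group B, whether the classes differ)."""
    class_a = corr_class(pearson(x_a, y_a), cutoff)
    class_b = corr_class(pearson(x_b, y_b), cutoff)
    return class_a, class_b, class_a != class_b


# Hypothetical toy data: no correlation in group A, strong positive correlation in group B.
feature_x = [1, 2, 3, 4, 5, 6]
feature_y_group_a = [3, 5, 2, 6, 1, 4]
feature_y_group_b = [1.2, 1.9, 3.1, 4.2, 4.8, 6.1]

result = classify_pair(feature_x, feature_y_group_a, feature_x, feature_y_group_b)
print(result)
```

A pair labelled `("0", "+", True)` is the kind of case the abstract highlights: no correlation in one group but correlation in the other. A hard cutoff ignores sampling uncertainty, which is one reason a model-based assignment of categories would be preferable on real data.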
Affiliation(s)
- Charlotte Siska
  - Computational Bioscience Program, Department of Pharmacology, University of Colorado Denver
- Katerina Kechris
  - Department of Biostatistics and Informatics, University of Colorado Denver, Denver, CO, USA
15
Lareau CA, White BC, Montgomery CG, McKinney BA. dcVar: a method for identifying common variants that modulate differential correlation structures in gene expression data. Front Genet 2015; 6:312. [PMID: 26539209 PMCID: PMC4609883 DOI: 10.3389/fgene.2015.00312] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/15/2015] [Accepted: 10/02/2015] [Indexed: 11/26/2022] Open
Abstract
Recent studies have implicated the role of differential co-expression or correlation structure in gene expression data to help explain phenotypic differences. However, few attempts have been made to characterize the function of variants based on their role in regulating differential co-expression. Here, we describe a statistical methodology that identifies pairs of transcripts that display differential correlation structure conditioned on genotypes of variants that regulate co-expression. Additionally, we present a user-friendly, computationally efficient tool, dcVar, that can be applied to expression quantitative trait loci (eQTL) or RNA-Seq datasets to infer differential co-expression variants (dcVars). We apply dcVar to the HapMap3 eQTL dataset and demonstrate the utility of this methodology at uncovering novel function of variants of interest with examples from a height genome-wide association and cancer drug resistance. We provide evidence that differential correlation structure is a valuable intermediate molecular phenotype for further characterizing the function of variants identified in GWAS and related studies.
Affiliation(s)
- Caleb A Lareau
  - Tandy School of Computer Science - Department of Mathematics, University of Tulsa, Tulsa, OK, USA
  - Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
- Bill C White
  - Tandy School of Computer Science - Department of Mathematics, University of Tulsa, Tulsa, OK, USA
- Courtney G Montgomery
  - Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, USA
- Brett A McKinney
  - Tandy School of Computer Science - Department of Mathematics, University of Tulsa, Tulsa, OK, USA
  - Laureate Institute for Brain Research, Tulsa, OK, USA