1
Xu S, Leng Y, Feng G, Zhang C, Chen M. A gene pathway enrichment method based on improved TF-IDF algorithm. Biochem Biophys Rep 2023; 34:101421. [PMID: 36923007 PMCID: PMC10009669 DOI: 10.1016/j.bbrep.2023.101421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Revised: 12/20/2022] [Accepted: 01/03/2023] [Indexed: 03/08/2023] Open
Abstract
Gene pathway enrichment analysis is a widely used method to analyze whether a gene set is statistically enriched on certain biological pathway network. Current gene pathway enrichment methods commonly consider local importance of genes in pathways without considering the interactions between genes. In this paper, we propose a gene pathway enrichment method (GIGSEA) based on improved TF-IDF algorithm. This method employs gene interaction data to calculate the influence of genes based on the local importance in a pathway as well as the global specificity. Computational experiment result shows that, compared with traditional gene set enrichment analysis method, our proposed method in this paper can find more specific enriched pathways related to phenotype with higher efficiency.
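The TF-IDF idea behind this abstract can be illustrated with a minimal sketch. It treats each pathway as a "document" and each gene as a "term", so a gene scores higher when it makes up a larger share of a pathway (local importance) and appears in few pathways overall (global specificity). This is plain TF-IDF, not the improved GIGSEA weighting, which also uses gene interaction data that the abstract does not specify. The function names and the toy pathways are hypothetical.

```python
import math


def tfidf_gene_weights(pathways):
    """Plain TF-IDF weights per gene per pathway (illustrative, not GIGSEA)."""
    n_pathways = len(pathways)
    # Document frequency: in how many pathways does each gene occur?
    df = {}
    for genes in pathways.values():
        for gene in set(genes):
            df[gene] = df.get(gene, 0) + 1
    weights = {}
    for name, genes in pathways.items():
        unique = set(genes)
        if not unique:
            weights[name] = {}
            continue
        tf = 1.0 / len(unique)  # each gene counted once per pathway
        weights[name] = {g: tf * math.log(n_pathways / df[g]) for g in unique}
    return weights


def pathway_score(weights, pathway, query_genes):
    """Sum the weights of the query genes that belong to the pathway."""
    w = weights[pathway]
    return sum(w[g] for g in query_genes if g in w)


# Hypothetical toy data: TP53 occurs everywhere, so its weight is zero.
toy_pathways = {
    "P1": ["TP53", "CDK2", "MDM2"],
    "P2": ["TP53", "ACTB"],
    "P3": ["TP53", "CDK2", "EGFR", "KRAS"],
}
weights = tfidf_gene_weights(toy_pathways)
score_p1 = pathway_score(weights, "P1", ["MDM2", "CDK2", "XYZ"])
```

In this toy case a gene shared by every pathway contributes nothing to any score, which is the "specificity" effect the abstract refers to.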
Affiliation(s)
- Shutan Xu
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China
  - Key Laboratory of Fisheries Information, Ministry of Agriculture, Shanghai, 201306, China
- Yinhui Leng
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China
- Guofu Feng
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China
- Chenjing Zhang
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China
- Ming Chen
  - College of Information Technology, Shanghai Ocean University, Shanghai, 201306, China
  - Key Laboratory of Fisheries Information, Ministry of Agriculture, Shanghai, 201306, China
2
Zhang Y, Xu Y, Li F, Li X, Feng L, Shi X, Wang L, Li X. Dissecting dysfunctional crosstalk pathways regulated by miRNAs during glioma progression. Oncotarget 2017; 7:25769-82. [PMID: 27013589 PMCID: PMC5041942 DOI: 10.18632/oncotarget.8265] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 09/29/2015] [Accepted: 03/08/2016] [Indexed: 01/14/2023] Open
Abstract
Glioma is a malignant nervous system tumor with a high fatality rate and poor prognosis. MicroRNAs (miRNAs) are important post-transcriptional modulators of glioma initiation and progression. Tumor progression often results from dysfunctional co-operation between pathways regulated by miRNAs. We therefore constructed a glioma progression-related miRNA-pathway crosstalk network that not only revealed some key miRNA-pathway patterns, but also helped characterize the functional roles of miRNAs during glioma progression. Our data indicate that crosstalk between cell cycle and p53 pathways is associated with grade II to grade III progression, while cell communications-related pathways involving regulation of actin cytoskeleton and adherens junctions are associated with grade IV glioblastoma progression. Furthermore, miRNAs and their crosstalk pathways may be useful for stratifying glioma and glioblastoma patients into groups with short or long survival times. Our data indicate that a combination of miRNA and pathway crosstalk information can be used for survival prediction.
Affiliation(s)
- Yunpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yanjun Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Feng Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Xiang Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Li Feng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Xinrui Shi
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lihua Wang
  - Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin 150081, China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
3
Fang H, Li X, Zan X, Shen L, Ma R, Liu W. Signaling pathway impact analysis by incorporating the importance and specificity of genes (SPIA-IS). Comput Biol Chem 2017; 71:236-244. [DOI: 10.1016/j.compbiolchem.2017.09.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/07/2017] [Accepted: 09/25/2017] [Indexed: 01/28/2023]
4
Investigation of coordination and order in transcription regulation of innate and adaptive immunity genes in type 1 diabetes. BMC Med Genomics 2017; 10:7. [PMID: 28143555 PMCID: PMC5282641 DOI: 10.1186/s12920-017-0243-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/26/2016] [Accepted: 01/25/2017] [Indexed: 01/19/2023] Open
Abstract
Background: Type 1 diabetes (T1D) is an autoimmune disease and extensive evidence has indicated a critical role of both the innate and the adaptive arms of immune system in disease development. To date most clinical trials of immunomodulation therapies failed to show efficacy. A number of gene expression studies of T1D have been carried out. However, a systems analysis of the expression variations of the innate and adaptive immunity gene sets, or their co-expression network structures in cohorts at different disease states or of different disease risks, is not available till now.
Methods: We utilized data from a large gene expression study that included transcription profiles of control peripheral blood mononuclear cells (PBMC) exposed to plasma of 148 human subjects from four cohorts that included unrelated healthy controls (uHC), recent onset T1D patients (RO-T1D), and healthy siblings of probands that possess high (HRS, High Risk Sibling) or low (LRS, Low Risk Sibling) risk HLA haplotypes. Both weighted and non-weighted co-expression networks were constructed in each cohort separately, and edge weight distribution and the activation of known protein complexes were examined. The co-expression networks of the innate and adaptive immunity genes were further examined in more detail through a number of network measures that included network density, Shannon entropy, h-index, and the scaling exponent γ of degree distribution. Pathway analysis was carried out using CoGA, a tool for detecting significant network structural changes of a gene set.
Results: Weighted network edge distribution revealed a globally weakened co-expression network induced by the RO-T1D cohort as compared to that by the uHC, suggesting a broad spectrum loss of transcriptional coordination. The two healthy T1D family cohorts (HRS and LRS) induced more active but heterogeneous transcription coordination globally, and among both the innate and the adaptive immunity genes, than the uHC. This finding is consistent with our previous report of these cohorts sharing a heightened innate inflammatory state. The spike-in of IL-1RA to RO-T1D sera improved co-expression network strength of both the innate and the adaptive immunity genes, and enabled a global order recovery in transcription regulation that resulted in significantly increased number of activated protein complexes. Many of the top pathways that showed significant difference in co-expression network structures and order between RO-T1D and uHC have strong links to T1D.
Conclusions: Network level analysis of the innate and adaptive immunity genes, and the whole genome, revealed striking cohort-dependent differences in co-expression network structural measures, suggesting their potential in cohort classification and disease-relevant pathway identification. The results demonstrated the advantages of systems analysis in defining molecular signatures as well as in predicting targets in future research.
Electronic supplementary material: The online version of this article (doi:10.1186/s12920-017-0243-8) contains supplementary material, which is available to authorized users.
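Three of the network measures named in the Methods can be sketched on a small unweighted, undirected graph. The definitions below are common textbook choices and are assumptions here, since the abstract does not give the exact formulas: density as the fraction of possible edges present, Shannon entropy (in bits) of the degree distribution, and h-index as the largest h such that at least h nodes have degree of at least h. The scaling exponent γ is not covered. All names and the toy graph are hypothetical, and the edge list is assumed to contain no duplicates or self-loops.

```python
import math
from collections import Counter


def degrees(nodes, edges):
    """Degree of every node in a simple undirected graph."""
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def density(nodes, edges):
    """Fraction of all possible node pairs that are connected."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


def degree_entropy(deg):
    """Shannon entropy (bits) of the empirical degree distribution."""
    counts = Counter(deg.values())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def h_index(deg):
    """Largest h such that at least h nodes have degree >= h."""
    h = 0
    for rank, d in enumerate(sorted(deg.values(), reverse=True), start=1):
        if d >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical toy co-expression graph.
toy_nodes = ["A", "B", "C", "D"]
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_deg = degrees(toy_nodes, toy_edges)
toy_density = density(toy_nodes, toy_edges)
toy_entropy = degree_entropy(toy_deg)
toy_h = h_index(toy_deg)
```

Comparing such scalar summaries between cohort-specific networks is one simple way to express the "structural differences" the abstract describes.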
5
Dussaut JS, Gallo CA, Cecchini RL, Carballido JA, Ponzoni I. Crosstalk pathway inference using topological information and biclustering of gene expression data. Biosystems 2016; 150:1-12. [DOI: 10.1016/j.biosystems.2016.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/28/2016] [Revised: 06/03/2016] [Accepted: 08/04/2016] [Indexed: 11/30/2022]
6
Parham F, Portier CJ, Chang X, Mevissen M. The Use of Signal-Transduction and Metabolic Pathways to Predict Human Disease Targets from Electric and Magnetic Fields Using in vitro Data in Human Cell Lines. Front Public Health 2016; 4:193. [PMID: 27656641 PMCID: PMC5013261 DOI: 10.3389/fpubh.2016.00193] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/25/2016] [Accepted: 08/25/2016] [Indexed: 12/23/2022] Open
Abstract
Using in vitro data in human cell lines, several research groups have investigated changes in gene expression in cellular systems following exposure to extremely low frequency (ELF) and radiofrequency (RF) electromagnetic fields (EMF). For ELF EMF, we obtained five studies with complete microarray data and three studies with only lists of significantly altered genes. Likewise, for RF EMF, we obtained 13 complete microarray datasets and 5 limited datasets. Plausible linkages between exposure to ELF and RF EMF and human diseases were identified using a three-step process: (a) linking genes associated with classes of human diseases to molecular pathways, (b) linking pathways to ELF and RF EMF microarray data, and (c) identifying associations between human disease and EMF exposures where the pathways are significantly similar. A total of 60 pathways were associated with human diseases, mostly focused on basic cellular functions like JAK–STAT signaling or metabolic functions like xenobiotic metabolism by cytochrome P450 enzymes. ELF EMF datasets were sporadically linked to human diseases, but no clear pattern emerged. Individual datasets showed some linkage to cancer, chemical dependency, metabolic disorders, and neurological disorders. RF EMF datasets were not strongly linked to any disorders but strongly linked to changes in several pathways. Based on these analyses, the most promising area for further research would be to focus on EMF and neurological function and disorders.
Affiliation(s)
- Fred Parham
  - National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, USA
- Xiaoqing Chang
  - National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, USA
- Meike Mevissen
  - Division of Veterinary Pharmacology and Toxicology, Vetsuisse Faculty, University of Bern, Bern, Switzerland
7
McNamee JP, Bellier PV, Konkle ATM, Thomas R, Wasoontarajaroen S, Lemay E, Gajda GB. Analysis of gene expression in mouse brain regions after exposure to 1.9 GHz radiofrequency fields. Int J Radiat Biol 2016; 92:338-50. [PMID: 27028625 PMCID: PMC4898144 DOI: 10.3109/09553002.2016.1159353] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/23/2015] [Revised: 02/15/2016] [Accepted: 02/20/2016] [Indexed: 12/23/2022]
Abstract
PURPOSE To assess 1.9 GHz radiofrequency (RF) field exposure on gene expression within a variety of discrete mouse brain regions using whole genome microarray analysis. MATERIALS AND METHODS Adult male C57BL/6 mice were exposed to 1.9 GHz pulse-modulated or continuous-wave RF fields for 4 h/day for 5 consecutive days at whole body average (WBA) specific absorption rates of 0 (sham), ∼0.2 W/kg and ∼1.4 W/kg. Total RNA was isolated from the auditory cortex, amygdala, caudate, cerebellum, hippocampus, hypothalamus, and medial prefrontal cortex and differential gene expression was assessed using Illumina MouseWG-6 (v2) BeadChip arrays. Validation of potentially responding genes was conducted by RT-PCR. RESULTS When analysis of gene expression was conducted within individual brain regions when controlling the false discovery rate (FDR), no differentially expressed genes were identified relative to the sham control. However, it must be noted that most fold changes among groups were observed to be less than 1.5-fold and this study had limited ability to detect such small changes. While some genes were differentially expressed without correction for multiple-comparisons testing, no consistent pattern of response was observed among different RF-exposure levels or among different RF-modulations. CONCLUSIONS The current study provides the most comprehensive analysis of potential gene expression changes in the rodent brain in response to RF field exposure conducted to date. Within the exposure conditions and limitations of this study, no convincing evidence of consistent changes in gene expression was found in response to 1.9 GHz RF field exposure.
Affiliation(s)
- James P. McNamee
  - Health Canada, Environmental and Radiation Health Sciences Directorate, Consumer and Clinical Radiation Protection Bureau, Ottawa
- Pascale V. Bellier
  - Health Canada, Environmental and Radiation Health Sciences Directorate, Consumer and Clinical Radiation Protection Bureau, Ottawa
- Anne T. M. Konkle
  - Interdisciplinary School of Health Sciences, University of Ottawa, Ottawa, ON, Canada
- Eric Lemay
  - Health Canada, Environmental and Radiation Health Sciences Directorate, Consumer and Clinical Radiation Protection Bureau, Ottawa
- Greg B. Gajda
  - Health Canada, Environmental and Radiation Health Sciences Directorate, Consumer and Clinical Radiation Protection Bureau, Ottawa
8
Stopper GF, Richards-Hrdlicka KL, Wagner GP. Hedgehog inhibition causes complete loss of limb outgrowth and transformation of digit identity in Xenopus tropicalis. J Exp Zool B Mol Dev Evol 2016; 326:110-24. [PMID: 26918681 DOI: 10.1002/jez.b.22669] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/13/2016] [Accepted: 01/14/2016] [Indexed: 11/12/2022]
Abstract
The study of the tetrapod limb has contributed greatly to our understanding of developmental pathways and how changes to these pathways affect the evolution of morphology. Most of our understanding of tetrapod limb development comes from research on amniotes, with far less known about mechanisms of limb development in amphibians. To better understand the mechanisms of limb development in anuran amphibians, we used cyclopamine to inhibit Hedgehog signaling at various stages of development in the western clawed frog, Xenopus tropicalis, and observed resulting morphologies. We also analyzed gene expression changes resulting from similar experiments in Xenopus laevis. Inhibition of Hedgehog signaling in X. tropicalis results in limb abnormalities including reduced digit number, missing skeletal elements, and complete absence of limbs. In addition, posterior digits assume an anterior identity by developing claws that are usually only found on anterior digits, confirming Sonic hedgehog's role in digit identity determination. Thus, Sonic hedgehog appears to play mechanistically separable roles in digit number specification and digit identity specification as in other studied tetrapods. The complete limb loss observed in response to reduced Hedgehog signaling in X. tropicalis, however, is striking, as this functional role for Hedgehog signaling has not been found in any other tetrapod. This changed mechanism may represent a substantial developmental constraint to digit number evolution in frogs. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Geffrey F Stopper
  - Department of Biology, Sacred Heart University, Fairfield, Connecticut
- Günter P Wagner
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut
9
Understanding disease mechanisms with models of signaling pathway activities. BMC Syst Biol 2014; 8:121. [PMID: 25344409 PMCID: PMC4213475 DOI: 10.1186/s12918-014-0121-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Received: 01/02/2014] [Accepted: 10/13/2014] [Indexed: 02/02/2023]
Abstract
BACKGROUND Understanding the aspects of the cell functionality that account for disease or drug action mechanisms is one of the main challenges in the analysis of genomic data and is on the basis of the future implementation of precision medicine. RESULTS Here we propose a simple probabilistic model in which signaling pathways are separated into elementary sub-pathways or signal transmission circuits (which ultimately trigger cell functions) and then transforms gene expression measurements into probabilities of activation of such signal transmission circuits. Using this model, differential activation of such circuits between biological conditions can be estimated. Thus, circuit activation statuses can be interpreted as biomarkers that discriminate among the compared conditions. This type of mechanism-based biomarkers accounts for cell functional activities and can easily be associated to disease or drug action mechanisms. The accuracy of the proposed model is demonstrated with simulations and real datasets. CONCLUSIONS The proposed model provides detailed information that enables the interpretation disease mechanisms as a consequence of the complex combinations of altered gene expression values. Moreover, it offers a framework for suggesting possible ways of therapeutic intervention in a pathologically perturbed system.
10
Thomas R, Hubbard AE, McHale CM, Zhang L, Rappaport SM, Lan Q, Rothman N, Vermeulen R, Guyton KZ, Jinot J, Sonawane BR, Smith MT. Characterization of changes in gene expression and biochemical pathways at low levels of benzene exposure. PLoS One 2014; 9:e91828. [PMID: 24786086 PMCID: PMC4006721 DOI: 10.1371/journal.pone.0091828] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 06/11/2013] [Accepted: 02/14/2014] [Indexed: 11/19/2022] Open
Abstract
Benzene, a ubiquitous environmental pollutant, causes acute myeloid leukemia (AML). Recently, through transcriptome profiling of peripheral blood mononuclear cells (PBMC), we reported dose-dependent effects of benzene exposure on gene expression and biochemical pathways in 83 workers exposed across four airborne concentration ranges (from <1 ppm to >10 ppm) compared with 42 subjects with non-workplace ambient exposure levels. Here, we further characterize these dose-dependent effects with continuous benzene exposure in all 125 study subjects. We estimated air benzene exposure levels in the 42 environmentally-exposed subjects from their unmetabolized urinary benzene levels. We used a novel non-parametric, data-adaptive model selection method to estimate the change with dose in the expression of each gene. We describe non-parametric approaches to model pathway responses and used these to estimate the dose responses of the AML pathway and 4 other pathways of interest. The response patterns of majority of genes as captured by mean estimates of the first and second principal components of the dose-response for the five pathways and the profiles of 6 AML pathway response-representative genes (identified by clustering) exhibited similar apparent supra-linear responses. Responses at or below 0.1 ppm benzene were observed for altered expression of AML pathway genes and CYP2E1. Together, these data show that benzene alters disease-relevant pathways and genes in a dose-dependent manner, with effects apparent at doses as low as 100 ppb in air. Studies with extensive exposure assessment of subjects exposed in the low-dose range between 10 ppb and 1 ppm are needed to confirm these findings.
Affiliation(s)
- Reuben Thomas
  - Superfund Research Program, School of Public Health, University of California, Berkeley, California, United States of America
- Alan E. Hubbard
  - Superfund Research Program, School of Public Health, University of California, Berkeley, California, United States of America
- Cliona M. McHale
  - Superfund Research Program, School of Public Health, University of California, Berkeley, California, United States of America
- Luoping Zhang
  - Superfund Research Program, School of Public Health, University of California, Berkeley, California, United States of America
- Stephen M. Rappaport
  - Superfund Research Program, School of Public Health, University of California, Berkeley, California, United States of America
- Qing Lan
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Nathaniel Rothman
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Roel Vermeulen
  - Institute of Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands
- Kathryn Z. Guyton
  - National Center for Environmental Assessment, Office of Research and Development, US EPA, Washington, DC, United States of America
- Jennifer Jinot
  - National Center for Environmental Assessment, Office of Research and Development, US EPA, Washington, DC, United States of America
- Babasaheb R. Sonawane
  - National Center for Environmental Assessment, Office of Research and Development, US EPA, Washington, DC, United States of America
- Martyn T. Smith
  - Superfund Research Program, School of Public Health, University of California, Berkeley, California, United States of America
11
Thomas R, McHale CM, Lan Q, Hubbard AE, Zhang L, Vermeulen R, Li G, Rappaport SM, Yin S, Rothman N, Smith MT. Global gene expression response of a population exposed to benzene: a pilot study exploring the use of RNA-sequencing technology. Environ Mol Mutagen 2013; 54:566-73. [PMID: 23907980 PMCID: PMC4353497 DOI: 10.1002/em.21801] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/04/2012] [Revised: 06/09/2013] [Accepted: 06/11/2013] [Indexed: 05/29/2023]
Abstract
The mechanism of toxicity of the leukemogen benzene is not entirely known. This pilot study used RNA-sequencing (RNA-seq) technology to examine the effect of benzene exposure on gene expression in peripheral blood mononuclear cells obtained from 10 workers occupationally exposed to high levels of benzene (≥5 ppm) in air and 10 matched unexposed control workers, from a large study (n = 125) in which gene expression was previously measured by microarray. RNA-seq is more sensitive and has a wider dynamic range for the quantification of gene expression. Further, it has the ability to detect novel transcripts and alternative splice variants. The main conclusions from our analysis of the 20 workers by RNA-seq are as follows: The Pearson correlation between the two technical replicates for the RNA-seq experiments was 0.98 and the correlation between RNA-seq and microarray signals for the 20 subjects was around 0.6. 60% of the transcripts with detected reads from the RNA-seq experiments did not have corresponding probes on the microarrays. Fifty-three percent of the transcripts detected by RNA-seq and 99% of those with probes on the microarray were protein-coding. There was a significant overlap (P < 0.05) in transcripts declared differentially expressed due to benzene exposure using the two technologies. About 20% of the transcripts declared differentially expressed using the RNA-seq data were non-coding transcripts. Six transcripts were determined (false-discovery rate < 0.05) to be alternatively spliced as a result of benzene exposure. Overall, this pilot study shows that RNA-seq can complement the information obtained by microarray in the analysis of changes in transcript expression from chemical exposures.
Affiliation(s)
- Reuben Thomas
  - School of Public Health, University of California, Berkeley, California 94720-7356, USA
12
Thomas R, Thomas RS, Auerbach SS, Portier CJ. Biological networks for predicting chemical hepatocarcinogenicity using gene expression data from treated mice and relevance across human and rat species. PLoS One 2013; 8:e63308. [PMID: 23737943 PMCID: PMC3667849 DOI: 10.1371/journal.pone.0063308] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/11/2012] [Accepted: 04/04/2013] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Several groups have employed genomic data from subchronic chemical toxicity studies in rodents (90 days) to derive gene-centric predictors of chronic toxicity and carcinogenicity. Genes are annotated to belong to biological processes or molecular pathways that are mechanistically well understood and are described in public databases. OBJECTIVES To develop a molecular pathway-based prediction model of long term hepatocarcinogenicity using 90-day gene expression data and to evaluate the performance of this model with respect to both intra-species, dose-dependent and cross-species predictions. METHODS Genome-wide hepatic mRNA expression was retrospectively measured in B6C3F1 mice following subchronic exposure to twenty-six (26) chemicals (10 were positive, 2 equivocal and 14 negative for liver tumors) previously studied by the US National Toxicology Program. Using these data, a pathway-based predictor model for long-term liver cancer risk was derived using random forests. The prediction model was independently validated on test sets associated with liver cancer risk obtained from mice, rats and humans. RESULTS Using 5-fold cross validation, the developed prediction model had reasonable predictive performance with the area under receiver-operator curve (AUC) equal to 0.66. The developed prediction model was then used to extrapolate the results to data associated with rat and human liver cancer. The extrapolated model worked well for both extrapolated species (AUC value of 0.74 for rats and 0.91 for humans). The prediction models implied a balanced interplay between all pathway responses leading to carcinogenicity predictions. CONCLUSIONS Pathway-based prediction models estimated from sub-chronic data hold promise for predicting long-term carcinogenicity and also for its ability to extrapolate results across multiple species.
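The AUC values quoted in this abstract (0.66, 0.74, 0.91) can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes that quantity directly from labels and scores. It does not reproduce the study's random forest or its 5-fold cross validation, and the function name and toy data are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions: 1 = carcinogenic, 0 = not carcinogenic.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which places the reported mouse value of 0.66 in the modest range.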
Affiliation(s)
- Reuben Thomas
  - Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley, California, United States of America
- Russell S. Thomas
  - The Hamner Institutes for Health Sciences, Research Triangle Park, North Carolina, United States of America
- Scott S. Auerbach
  - Biomolecular Screening Branch, National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, United States of America
- Christopher J. Portier
  - National Center for Environmental Health and Agency for Toxic Substances and Disease Registry, United States Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
13
Abstract
Life science technologies generate a deluge of data that hold the keys to unlocking the secrets of important biological functions and disease mechanisms. We present DEAP, Differential Expression Analysis for Pathways, which capitalizes on information about biological pathways to identify important regulatory patterns from differential expression data. DEAP makes significant improvements over existing approaches by including information about pathway structure and discovering the most differentially expressed portion of the pathway. On simulated data, DEAP significantly outperformed traditional methods: with high differential expression, DEAP increased power by two orders of magnitude; with very low differential expression, DEAP doubled the power. DEAP performance was illustrated on two different gene and protein expression studies. DEAP discovered fourteen important pathways related to chronic obstructive pulmonary disease and interferon treatment that existing approaches omitted. On the interferon study, DEAP guided focus towards a four protein path within the 26 protein Notch signalling pathway. The data deluge represents a growing challenge for life sciences. Within this sea of data surely lie many secrets to understanding important biological and medical systems. To quantify important patterns in this data, we present DEAP (Differential Expression Analysis for Pathways). DEAP amalgamates information about biological pathway structure and differential expression to identify important patterns of regulation. On both simulated and biological data, we show that DEAP is able to identify key mechanisms while making significant improvements over existing methodologies. For example, on the interferon study, DEAP uniquely identified both the interferon gamma signalling pathway and the JAK STAT signalling pathway.
14
Chang B, Kustra R, Tian W. Functional-network-based gene set analysis using gene-ontology. PLoS One 2013; 8:e55635. [PMID: 23418449 PMCID: PMC3572115 DOI: 10.1371/journal.pone.0055635] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/28/2012] [Accepted: 12/31/2012] [Indexed: 11/19/2022] Open
Abstract
To account for the functional non-equivalence among a set of genes within a biological pathway when performing gene set analysis, we introduce GOGANPA, a network-based gene set analysis method, which up-weights genes with functions relevant to the gene set of interest. The genes are weighted according to its degree within a genome-scale functional network constructed using the functional annotations available from the gene ontology database. By benchmarking GOGANPA using a well-studied P53 data set and three breast cancer data sets, we will demonstrate the power and reproducibility of our proposed method over traditional unweighted approaches and a competing network-based approach that involves a complex integrated network. GOGANPA’s sole reliance on gene ontology further allows GOGANPA to be widely applicable to the analysis of any gene-ontology-annotated genome.
Affiliation(s)
- Billy Chang
- State Key Laboratory of Genetic Engineering, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, P.R. China
- Dalla Lana School of Public Health, Division of Biostatistics, University of Toronto, Toronto, Ontario, Canada
- Rafal Kustra
- Dalla Lana School of Public Health, Division of Biostatistics, University of Toronto, Toronto, Ontario, Canada
- Weidong Tian
- State Key Laboratory of Genetic Engineering, Institute of Biostatistics, School of Life Sciences, Fudan University, Shanghai, P.R. China
15
Abstract
With the advent of microarrays and next-generation biotechnologies, the use of gene expression data has become ubiquitous in biological research. One potential drawback of these data is that they are very rich in features or genes, though cost considerations allow for the use of only relatively small sample sizes. A useful way of getting at biologically meaningful interpretations of the environmental or toxicological condition of interest would be to make inferences at the level of a priori defined biochemical pathways or networks of interacting genes or proteins that are known to perform certain biological functions. This chapter describes approaches taken in the literature to make such inferences at the biochemical pathway level. In addition, this chapter describes approaches to create hypotheses on genes playing important roles in response to a treatment, using organism-level gene coexpression or protein-protein interaction networks. Also, approaches to reverse engineer gene networks or methods that seek to identify novel interactions between genes are described. Given the relatively small sample numbers typically available, these reverse engineering approaches are generally useful in inferring interactions only among a relatively small number of genes, on the order of 10. Finally, given the vast amounts of publicly available gene expression data from different sources, this chapter summarizes the important sources of these data and characteristics of these sources or databases. In line with the overall aims of this book of providing practical knowledge to a researcher interested in analyzing gene expression data from a network perspective, the chapter provides convenient publicly accessible tools for performing the analyses described, and in addition describes three motivating examples taken from the published literature that illustrate some of the relevant analyses.
Affiliation(s)
- Reuben Thomas
- Division of Environmental Health Sciences, University of California, Berkeley, CA, USA
16
Hwang S. Comparison and evaluation of pathway-level aggregation methods of gene expression data. BMC Genomics 2012; 13 Suppl 7:S26. [PMID: 23282027 PMCID: PMC3521227 DOI: 10.1186/1471-2164-13-s7-s26] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/22/2022] Open
Abstract
Background Microarray experiments produce expression measurements in genomic scale. A way to derive functional understanding of the data is to focus on functional sets of genes, such as pathways, instead of individual genes. While a common practice for the pathway-level analysis has been functional enrichment analysis such as over-representation analysis and gene set enrichment analysis, an alternative approach has also been explored. In this approach, gene expression data are first aggregated at pathway level to transform the original data into a compact representation in which each row corresponds to a pathway instead of a gene. Thereafter the pathway expression data can be used for differential expression and classification analyses in pathway space, leveraging existing algorithms usually applied to gene expression data. While several studies have proposed the pathway-level aggregation methods, it remains unclear how they compare with one another, since the evaluations were done to a limited extent. Thus this study presents a comprehensive evaluation of six most prominent aggregation methods. Results The compared methods include five existing methods--mean of all member genes (Mean all), mean of condition-responsive genes (Mean CORGs), analysis of sample set enrichment scores (ASSESS), principal component analysis (PCA), and partial least squares (PLS)--and a variant of an existing method (Mean top 50%, averaging top half of member genes). Comprehensive and stringent benchmarking was performed by collecting seven pairs of related but independent datasets encompassing various phenotypes. Aggregation was done in the space of KEGG pathways. Performance of the methods was assessed by classification accuracy validated both internally and externally, and by examining the correlative extent of pathway signatures between the dataset pairs. 
The assessment revealed that (i) the best accuracy and correlation were obtained from ASSESS and Mean top 50%, (ii) Mean all showed the lowest accuracy, and (iii) Mean CORGs and PLS gave rise to the largest extent of discordance in the pathway signature correlation. Conclusions The two best performing methods (ASSESS and Mean top 50%) are suggested to be preferred. The benchmarking analysis also suggests that there is both room and necessity for developing a novel method for pathway-level aggregation.
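Two of the aggregation schemes named in this abstract, Mean all and Mean top 50%, can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not say how member genes are ranked before taking the top half, so ranking by the absolute value of a per-gene score (for example a t-statistic) is an assumption, and all function and variable names are hypothetical.

```python
def mean_all(expr, members):
    """Pathway-level profile: average every member gene, sample by sample."""
    rows = [expr[gene] for gene in members]
    return [sum(column) / len(rows) for column in zip(*rows)]

def mean_top_half(expr, members, gene_scores):
    """Average only the top half of member genes.

    Assumption: genes are ranked by absolute per-gene score; the ranking
    criterion is not stated in the abstract.
    """
    ranked = sorted(members, key=lambda gene: abs(gene_scores[gene]), reverse=True)
    n_keep = max(1, (len(ranked) + 1) // 2)  # keep at least one gene
    return mean_all(expr, ranked[:n_keep])

# Toy data: four genes measured in two samples (hypothetical values)
expr = {"g1": [1.0, 2.0], "g2": [3.0, 4.0], "g3": [5.0, 6.0], "g4": [7.0, 8.0]}
scores = {"g1": 0.1, "g2": -2.0, "g3": 0.5, "g4": 3.0}
pathway = ["g1", "g2", "g3", "g4"]

all_profile = mean_all(expr, pathway)
top_profile = mean_top_half(expr, pathway, scores)
```

Either function turns a gene-by-sample table into one row per pathway, which is the compact representation the abstract refers to.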
Affiliation(s)
- Seungwoo Hwang
- Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea.
17
Løkke H, Ragas AMJ, Holmstrup M. Tools and perspectives for assessing chemical mixtures and multiple stressors. Toxicology 2012; 313:73-82. [PMID: 23238274 DOI: 10.1016/j.tox.2012.11.009] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 03/12/2012] [Revised: 10/29/2012] [Accepted: 11/24/2012] [Indexed: 01/22/2023]
Abstract
The present paper summarizes the most important insights and findings of the EU NoMiracle project with a focus on (1) risk assessment of chemical mixtures, (2) combinations of chemical and natural stressors, and (3) the receptor-oriented approach in cumulative risk assessment. The project aimed at integration of methods for human and ecological risk assessment. A mechanistically based model, considering uptake and toxicity as processes in time, has demonstrated considerable potential for predicting mixture effects in ecotoxicology, but requires the measurement of toxicity endpoints at different moments in time. Within a novel framework for risk assessment of chemical mixtures, the importance of environmental factors on toxicokinetic processes is highlighted. A new paradigm for applying personal characteristics that determine individual exposure and sensitivity in human risk assessment is suggested. The results are discussed in the light of recent developments in risk assessment of mixtures and multiple stressors.
Affiliation(s)
- Hans Løkke
- Aarhus University, Department of Bioscience, Vejlsøvej 25, P.O. Box 314, DK-8600 Silkeborg, Denmark.
18
Emmert-Streib F, Tripathi S, de Matos Simoes R. Harnessing the complexity of gene expression data from cancer: from single gene to structural pathway methods. Biol Direct 2012; 7:44. [PMID: 23227854 PMCID: PMC3769148 DOI: 10.1186/1745-6150-7-44] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 07/30/2012] [Accepted: 10/01/2012] [Indexed: 12/22/2022] Open
Abstract
High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data and provide links to software implementations and tools and address also the general problem of multiple hypotheses testing. Further, we provide recommendations for the selection of such analysis methods.
Affiliation(s)
- Frank Emmert-Streib
- Computational Biology and Machine Learning Laboratory, Queen's University Belfast, Belfast, UK.
19
Dutta B, Wallqvist A, Reifman J. PathNet: a tool for pathway analysis using topological information. SOURCE CODE FOR BIOLOGY AND MEDICINE 2012; 7:10. [PMID: 23006764 PMCID: PMC3563509 DOI: 10.1186/1751-0473-7-10] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Received: 07/05/2012] [Accepted: 08/03/2012] [Indexed: 01/01/2023]
Abstract
Background Identification of canonical pathways through enrichment of differentially expressed genes in a given pathway is a widely used method for interpreting gene lists generated from high-throughput experimental studies. However, most algorithms treat pathways as sets of genes, disregarding any inter- and intra-pathway connectivity information, and do not provide insights beyond identifying lists of pathways. Results We developed an algorithm (PathNet) that utilizes the connectivity information in canonical pathway descriptions to help identify study-relevant pathways and characterize non-obvious dependencies and connections among pathways using gene expression data. PathNet considers both the differential expression of genes and their pathway neighbors to strengthen the evidence that a pathway is implicated in the biological conditions characterizing the experiment. As an adjunct to this analysis, PathNet uses the connectivity of the differentially expressed genes among all pathways to score pathway contextual associations and statistically identify biological relations among pathways. In this study, we used PathNet to identify biologically relevant results in two Alzheimer’s disease microarray datasets, and compared its performance with existing methods. Importantly, PathNet identified de-regulation of the ubiquitin-mediated proteolysis pathway as an important component in Alzheimer’s disease progression, despite the absence of this pathway in the standard enrichment analyses. Conclusions PathNet is a novel method for identifying enrichment and association between canonical pathways in the context of gene expression data. It takes into account topological information present in pathways to reveal biological information. PathNet is available as an R workspace image from http://www.bhsai.org/downloads/pathnet/.
Affiliation(s)
- Bhaskar Dutta
- DoD Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Ft. Detrick, MD, 21702, USA.
20
Gao S, Jia S, Hessner MJ, Wang X. Predicting disease-related subnetworks for type 1 diabetes using a new network activity score. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2012; 16:566-78. [PMID: 22917479 DOI: 10.1089/omi.2012.0029] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/09/2023]
Abstract
In this study we investigated the advantage of including network information in prioritizing disease genes of type 1 diabetes (T1D). First, a naïve Bayesian network (NBN) model was developed to integrate information from multiple data sources and to define a T1D-involvement probability score (PS) for each individual gene. The algorithm was validated using known functional candidate genes as a benchmark. Genes with higher PS were found to be more likely to appear in T1D-related publications. Next a new network activity metric was proposed to evaluate the T1D relevance of protein-protein interaction (PPI) subnetworks. The metric considered the contribution both from individual genes and from network topological characteristics. The predictions were confirmed by several independent datasets, including a genome wide association study (GWAS), and two large-scale human gene expression studies. We found that novel candidate genes in the T1D subnetworks showed more significant associations with T1D than genes predicted using PS alone. Interestingly, most novel candidates were not encoded within the human leukocyte antigen (HLA) region, and their expression levels showed correlation with disease only in cohorts with low-risk HLA genotypes. The results suggested the importance of mapping disease gene networks in dissecting the genetics of complex diseases, and offered a general approach to network-based disease gene prioritization from multiple data sources.
Affiliation(s)
- Shouguo Gao
- Department of Physics, the University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
21
Thomas R, Phuong J, McHale CM, Zhang L. Using bioinformatic approaches to identify pathways targeted by human leukemogens. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2012; 9:2479-503. [PMID: 22851955 PMCID: PMC3407916 DOI: 10.3390/ijerph9072479] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/18/2012] [Revised: 06/25/2012] [Accepted: 06/26/2012] [Indexed: 12/28/2022]
Abstract
We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other.
Affiliation(s)
- Reuben Thomas
- Genes and Environment Laboratory, School of Public Health, University of California, Berkeley, CA 94720, USA.
22
Tarca AL, Draghici S, Bhatti G, Romero R. Down-weighting overlapping genes improves gene set analysis. BMC Bioinformatics 2012; 13:136. [PMID: 22713124 PMCID: PMC3443069 DOI: 10.1186/1471-2105-13-136] [Citation(s) in RCA: 103] [Impact Index Per Article: 8.6] [Received: 02/14/2012] [Accepted: 05/18/2012] [Indexed: 11/10/2022] Open
Abstract
Background The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods, which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org.
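The scoring idea in this abstract (a gene set score equal to the mean of absolute weighted t-scores, with larger weights for genes that appear in few gene sets) can be sketched as below. The weight formula used here, which maps gene-set frequency onto the range 1 to 2, is an assumption chosen for illustration and should not be read as the published PADOG definition. The t-scores are assumed to be precomputed moderated t-scores, and all names are hypothetical.

```python
from collections import Counter
from math import sqrt

def overlap_weights(gene_sets):
    """Weight genes by how few gene sets they appear in.

    Assumption: weights run from 1 (most frequent gene) to 2 (least frequent);
    this particular formula is illustrative only.
    """
    freq = Counter(g for genes in gene_sets.values() for g in set(genes))
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:
        return {g: 1.0 for g in freq}  # all genes equally frequent
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}

def gene_set_score(genes, t_scores, weights):
    """Mean of absolute weighted t-scores over the genes of one set."""
    return sum(abs(t_scores[g] * weights[g]) for g in genes) / len(genes)

# Toy collection: g2 is shared by all three sets, the other genes are unique
gene_sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2", "g4"]}
t_scores = {"g1": 2.0, "g2": -1.0, "g3": 0.5, "g4": 0.0}

weights = overlap_weights(gene_sets)
score_a = gene_set_score(gene_sets["A"], t_scores, weights)
```

In practice a score like this would still need a permutation or resampling step to assess significance, which the sketch leaves out.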
23
Effect of chemical mutagens and carcinogens on gene expression profiles in human TK6 cells. PLoS One 2012; 7:e39205. [PMID: 22723965 PMCID: PMC3377624 DOI: 10.1371/journal.pone.0039205] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 04/26/2011] [Accepted: 05/18/2012] [Indexed: 12/19/2022] Open
Abstract
Characterization of toxicogenomic signatures of carcinogen exposure holds significant promise for mechanistic and predictive toxicology. In vitro transcriptomic studies allow the comparison of the response to chemicals with diverse modes of action under controlled experimental conditions. We conducted an in vitro study in TK6 cells to characterize gene expression signatures of exposure to 15 genotoxic carcinogens frequently used in European industries. We also examined the dose-responsive changes in gene expression, and perturbation of biochemical pathways in response to these carcinogens. TK6 cells were exposed at 3 dose levels for 24 h with and without S9 human metabolic mix. Since S9 had an impact on gene expression (885 genes), we analyzed the gene expression data from cell cultures incubated with S9 and without S9 independently. The ribosome pathway was affected by all chemical-dose combinations. However, in general, no similar gene expression was observed among carcinogens. Further, the clustergram suggested pathways (e.g. cell cycle, DNA repair mechanisms, RNA degradation) that were common within sets of chemical-dose combinations. Linear trends in dose–response of gene expression were observed for Trichloroethylene, Benz[a]anthracene, Epichlorohydrin, Benzene, and Hydroquinone. The significantly altered genes were involved in the regulation of (anti-) apoptosis, maintenance of cell survival, tumor necrosis factor-related pathways and immune response, in agreement with several other studies. Similarly in S9+ cultures, Benz[a]pyrene, Styrene and Trichloroethylene each modified over 1000 genes at high concentrations. Our findings expand our understanding of the transcriptomic response to genotoxic carcinogens, revealing the alteration of diverse sets of genes and pathways involved in cellular homeostasis and cell cycle control.
24
Gu Z, Liu J, Cao K, Zhang J, Wang J. Centrality-based pathway enrichment: a systematic approach for finding significant pathways dominated by key genes. BMC SYSTEMS BIOLOGY 2012; 6:56. [PMID: 22672776 PMCID: PMC3443660 DOI: 10.1186/1752-0509-6-56] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 01/31/2012] [Accepted: 05/24/2012] [Indexed: 12/18/2022]
Abstract
Background Biological pathways are important for understanding biological mechanisms. Thus, finding important pathways that underlie biological problems helps researchers to focus on the most relevant sets of genes. Pathways resemble networks with complicated structures, but most of the existing pathway enrichment tools ignore topological information embedded within pathways, which limits their applicability. Results A systematic and extensible pathway enrichment method in which nodes are weighted by network centrality was proposed. We demonstrate how choice of pathway structure and centrality measurement, as well as the presence of key genes, affects pathway significance. We emphasize two improvements of our method over current methods. First, allowing for the diversity of genes’ characters and the difficulty of covering gene importance from all aspects, we set centrality as an optional parameter in the model. Second, nodes rather than genes form the basic unit of pathways, such that one node can be composed of several genes and one gene may reside in different nodes. By comparing our methodology to the original enrichment method using both simulation data and real-world data, we demonstrate the efficacy of our method in finding new pathways from biological perspective. Conclusions Our method can benefit the systematic analysis of biological pathways and help to extract more meaningful information from gene expression data. The algorithm has been implemented as an R package CePa, and also a web-based version of CePa is provided.
Affiliation(s)
- Zuguang Gu
- The State Key Laboratory of Pharmaceutical Biotechnology and Jiangsu Engineering Research Center for MicroRNA Biology and Biotechnology, School of Life Science, Nanjing University, Nanjing, 210093, China
25
Cheng S, Prot JM, Leclerc E, Bois FY. Zonation related function and ubiquitination regulation in human hepatocellular carcinoma cells in dynamic vs. static culture conditions. BMC Genomics 2012; 13:54. [PMID: 22296956 PMCID: PMC3295679 DOI: 10.1186/1471-2164-13-54] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 08/15/2011] [Accepted: 02/01/2012] [Indexed: 01/19/2023] Open
Abstract
Background Understanding hepatic zonation is important both for liver physiology and pathology. There is currently no effective systemic chemotherapy for human hepatocellular carcinoma (HCC) and its pathogenesis is of special interest. Genomic and proteomic data of HCC cells in different culture models, coupled to pathway-based analysis, can help identify HCC-related gene and pathway dysfunctions. Results We identified zonation-related expression profiles contributing to selective phenotypes of HCC, by integrating relevant experimental observations through gene set enrichment analysis (GSEA). Analysis was based on gene and protein expression data measured on a human HCC cell line (HepG2/C3A) in two culture conditions: dynamic microfluidic biochips and static Petri dishes. Metabolic activity (HCC-related cytochromes P450) and genetic information processing were dominant in the dynamic cultures, in contrast to kinase signaling and cancer-specific profiles in static cultures. That, together with analysis of the published literature, leads us to propose that biochips culture conditions induce a periportal-like hepatocyte phenotype while standard plates cultures are more representative of a perivenous-like phenotype. Both proteomic data and GSEA results further reveal distinct ubiquitin-mediated protein regulation in the two culture conditions. Conclusions Pathways analysis, using gene and protein expression data from two cell culture models, confirmed specific human HCC phenotypes with regard to CYPs and kinases, and revealed a zonation-related pattern of expression. Ubiquitin-mediated regulation mechanism gives plausible explanations of our findings. Altogether, our results suggest that strategies aimed at inhibiting activated kinases and signaling pathways may lead to enhanced metabolism-mediated drug resistance of treated tumors. If that were the case, mitigating inhibition or targeting inactive forms of kinases would be an alternative.
Affiliation(s)
- Shu Cheng
- Université de Technologie de Compiègne, BP 20529, 60205 Compiègne Cedex, France
26
Goetz AK, Singh BP, Battalora M, Breier JM, Bailey JP, Chukwudebe AC, Janus ER. Current and future use of genomics data in toxicology: Opportunities and challenges for regulatory applications. Regul Toxicol Pharmacol 2011; 61:141-53. [DOI: 10.1016/j.yrtph.2011.07.012] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 04/18/2011] [Revised: 07/27/2011] [Accepted: 07/29/2011] [Indexed: 12/01/2022]
27
Abstract
Classical algorithms that aim to identify biological pathways significantly related to the conditions under study have frequently reduced pathways to gene sets, ignoring the constitutive non-equivalence of the genes within a defined pathway. We here designed a network-based method to determine such non-equivalence in terms of gene weights. The gene weights determined are biologically consistent and robust to network perturbations. By integrating the gene weights into the classical gene set analysis, with a subsequent correction for the "over-counting" bias associated with multi-subunit proteins, we have developed a novel gene-weighted pathway analysis approach, as implemented in an R package called "Gene Association Network-based Pathway Analysis" (GANPA). Through analysis of several microarray datasets, including the p53 dataset, asthma dataset and three breast cancer datasets, we demonstrated that our approach is biologically reliable and reproducible, and therefore helpful for microarray data interpretation and hypothesis generation.
28
North M, Tandon VJ, Thomas R, Loguinov A, Gerlovina I, Hubbard AE, Zhang L, Smith MT, Vulpe CD. Genome-wide functional profiling reveals genes required for tolerance to benzene metabolites in yeast. PLoS One 2011; 6:e24205. [PMID: 21912624 PMCID: PMC3166172 DOI: 10.1371/journal.pone.0024205] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 11/30/2010] [Accepted: 08/06/2011] [Indexed: 11/18/2022] Open
Abstract
Benzene is a ubiquitous environmental contaminant and is widely used in industry. Exposure to benzene causes a number of serious health problems, including blood disorders and leukemia. Benzene undergoes complex metabolism in humans, making mechanistic determination of benzene toxicity difficult. We used a functional genomics approach to identify the genes that modulate the cellular toxicity of three of the phenolic metabolites of benzene, hydroquinone (HQ), catechol (CAT) and 1,2,4-benzenetriol (BT), in the model eukaryote Saccharomyces cerevisiae. Benzene metabolites generate oxidative and cytoskeletal stress, and tolerance requires correct regulation of iron homeostasis and the vacuolar ATPase. We have identified a conserved bZIP transcription factor, Yap3p, as important for a HQ-specific response pathway, as well as two genes that encode putative NAD(P)H:quinone oxidoreductases, PST2 and YCP4. Many of the yeast genes identified have human orthologs that may modulate human benzene toxicity in a similar manner and could play a role in benzene exposure-related disease.
Affiliation(s)
- Matthew North
- Department of Nutritional Science and Toxicology, University of California, Berkeley, California, United States of America
- Vickram J. Tandon
- Department of Nutritional Science and Toxicology, University of California, Berkeley, California, United States of America
- Reuben Thomas
- Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley, California, United States of America
- Alex Loguinov
- Department of Nutritional Science and Toxicology, University of California, Berkeley, California, United States of America
- Inna Gerlovina
- Division of Biostatistics, School of Public Health, University of California, Berkeley, California, United States of America
- Alan E. Hubbard
- Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley, California, United States of America
- Division of Biostatistics, School of Public Health, University of California, Berkeley, California, United States of America
- Luoping Zhang
- Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley, California, United States of America
- Martyn T. Smith
- Division of Environmental Health Sciences, School of Public Health, University of California, Berkeley, California, United States of America
- Chris D. Vulpe
- Department of Nutritional Science and Toxicology, University of California, Berkeley, California, United States of America
29
McHale CM, Zhang L, Lan Q, Vermeulen R, Li G, Hubbard AE, Porter KE, Thomas R, Portier CJ, Shen M, Rappaport SM, Yin S, Smith MT, Rothman N. Global gene expression profiling of a population exposed to a range of benzene levels. ENVIRONMENTAL HEALTH PERSPECTIVES 2011; 119:628-34. [PMID: 21147609 PMCID: PMC3094412 DOI: 10.1289/ehp.1002546] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.5] [Received: 06/10/2010] [Accepted: 12/13/2010] [Indexed: 05/17/2023]
Abstract
BACKGROUND Benzene, an established cause of acute myeloid leukemia (AML), may also cause one or more lymphoid malignancies in humans. Previously, we identified genes and pathways associated with exposure to high (> 10 ppm) levels of benzene through transcriptomic analyses of blood cells from a small number of occupationally exposed workers. OBJECTIVES The goals of this study were to identify potential biomarkers of benzene exposure and/or early effects and to elucidate mechanisms relevant to risk of hematotoxicity, leukemia, and lymphoid malignancy in occupationally exposed individuals, many of whom were exposed to benzene levels < 1 ppm, the current U.S. occupational standard. METHODS We analyzed global gene expression in the peripheral blood mononuclear cells of 125 workers exposed to benzene levels ranging from < 1 ppm to > 10 ppm. Study design and analysis with a mixed-effects model minimized potential confounding and experimental variability. RESULTS We observed highly significant widespread perturbation of gene expression at all exposure levels. The AML pathway was among the pathways most significantly associated with benzene exposure. Immune response pathways were associated with most exposure levels, potentially providing biological plausibility for an association between lymphoma and benzene exposure. We identified a 16-gene expression signature associated with all levels of benzene exposure. CONCLUSIONS Our findings suggest that chronic benzene exposure, even at levels below the current U.S. occupational standard, perturbs many genes, biological processes, and pathways. These findings expand our understanding of the mechanisms by which benzene may induce hematotoxicity, leukemia, and lymphoma and reveal relevant potential biomarkers associated with a range of exposures.
Affiliation(s)
- Cliona M McHale
- School of Public Health, University of California-Berkeley, Berkeley, California 94720, USA.
30
Warsow G, Greber B, Falk SSI, Harder C, Siatkowski M, Schordan S, Som A, Endlich N, Schöler H, Repsilber D, Endlich K, Fuellen G. ExprEssence--revealing the essence of differential experimental data in the context of an interaction/regulation network. BMC Syst Biol 2010; 4:164. [PMID: 21118483 PMCID: PMC3012047 DOI: 10.1186/1752-0509-4-164] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 07/05/2010] [Accepted: 11/30/2010] [Indexed: 12/15/2022]
Abstract
Background: Experimentalists are overwhelmed by high-throughput data and there is an urgent need to condense information into simple hypotheses. For example, large amounts of microarray and deep sequencing data are becoming available, describing a variety of experimental conditions such as gene knockout and knockdown, the effect of interventions, and the differences between tissues and cell lines. Results: To address this challenge, we developed a method, implemented as a Cytoscape plugin called ExprEssence. As input we take a network of interaction, stimulation and/or inhibition links between genes/proteins, and differential data, such as gene expression data, tracking an intervention or development in time. We condense the network, highlighting those links across which the largest changes can be observed. Highlighting is based on a simple formula inspired by the law of mass action. We can interactively modify the threshold for highlighting and instantaneously visualize results. We applied ExprEssence to three scenarios describing kidney podocyte biology, pluripotency and ageing: 1) We identify putative processes involved in podocyte (de-)differentiation and validate one prediction experimentally. 2) We predict and validate the expression level of a transcription factor involved in pluripotency. 3) Finally, we generate plausible hypotheses on the role of apoptosis, cell cycle deregulation and DNA repair in ageing data obtained from the hippocampus. Conclusion: Reducing the size of gene/protein networks to the few links affected by large changes allows screening for putative mechanistic relationships among the genes/proteins that are involved in adaptation to different experimental conditions, yielding important hypotheses, insights and suggestions for new experiments. We note that we do not focus on the identification of 'active subnetworks'. Instead we focus on the identification of single links (which may or may not form subnetworks), and these single links are much easier to validate experimentally than submodules. ExprEssence is available at http://sourceforge.net/projects/expressence/.
Affiliation(s)
- Gregor Warsow
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, Ernst-Heydemann-Strasse 8, Rostock, Germany
31
Minguez P, Dopazo J. Functional genomics and networks: new approaches in the extraction of complex gene modules. Expert Rev Proteomics 2010; 7:55-63. [PMID: 20121476 DOI: 10.1586/epr.09.103] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/14/2022]
Abstract
The engine that makes the cell work is made of an intricate network of molecular interactions. Nowadays, the elements and relationships of this complex network can be studied with several types of high-throughput techniques. The dream of having a global picture of the cell from different perspectives that can jointly explain cell behavior is, at least technically, feasible. However, this task can only be accomplished by filling the gap between data and information. The availability of methods capable of accurately managing, integrating and analyzing the results from these experiments is crucial for this purpose. Here, we review the new challenges raised by the availability of different genomic data, as well as the new proposals presented to cope with the increasing data complexity. Special emphasis is given to approaches that explore the transcriptome trying to describe the modules of genes that account for the traits studied.
Affiliation(s)
- Pablo Minguez
- Department of Bioinformatics and Genomics, Centro de Investigación Príncipe Felipe, Valencia, Spain
32
Chiu WA, Euling SY, Scott CS, Subramaniam RP. Approaches to advancing quantitative human health risk assessment of environmental chemicals in the post-genomic era. Toxicol Appl Pharmacol 2010; 271:309-23. [PMID: 20353796 DOI: 10.1016/j.taap.2010.03.019] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 01/19/2010] [Revised: 03/19/2010] [Accepted: 03/22/2010] [Indexed: 10/19/2022]
Abstract
The contribution of genomics and associated technologies to human health risk assessment for environmental chemicals has focused largely on elucidating mechanisms of toxicity, as discussed in other articles in this issue. However, there is interest in moving beyond hazard characterization to making more direct impacts on quantitative risk assessment (QRA), i.e., the determination of toxicity values for setting exposure standards and cleanup values. We propose that the evolution of QRA of environmental chemicals in the post-genomic era will involve three somewhat overlapping phases in which different types of approaches begin to mature. The initial focus (in Phase I) has been and continues to be on "augmentation" of weight of evidence: using genomic and related technologies qualitatively to increase the confidence in and scientific basis of the results of QRA. Efforts aimed towards "integration" of these data with traditional animal-based approaches, in particular as quantitative predictors, or surrogates, for the in vivo toxicity data to which they have been anchored, are just beginning to be explored now (in Phase II). In parallel, there is a recognized need for "expansion" of the use of established biomarkers of susceptibility or risk of human diseases and disorders for QRA, particularly for addressing the issues of cumulative assessment and population risk. Ultimately (in Phase III), substantial further advances could be realized by the development of novel molecular and pathway-based biomarkers and statistical and in silico models that build on anticipated progress in understanding the pathways of human diseases and disorders. Such efforts would facilitate a gradual "reorientation" of QRA towards approaches that more directly link environmental exposures to human outcomes.
Affiliation(s)
- Weihsueh A Chiu
- National Center for Environmental Assessment, U.S. Environmental Protection Agency, Washington DC, 20460, USA.
33
Gohlke JM, Thomas R, Zhang Y, Rosenstein MC, Davis AP, Murphy C, Becker KG, Mattingly CJ, Portier CJ. Genetic and environmental pathways to complex diseases. BMC Syst Biol 2009; 3:46. [PMID: 19416532 PMCID: PMC2680807 DOI: 10.1186/1752-0509-3-46] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Received: 04/15/2009] [Accepted: 05/05/2009] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Pathogenesis of complex diseases involves the integration of genetic and environmental factors over time, making it particularly difficult to tease apart relationships between phenotype, genotype, and environmental factors using traditional experimental approaches. RESULTS: Using gene-centered databases, we have developed a network of complex diseases and environmental factors through the identification of key molecular pathways associated with both genetic and environmental contributions. Comparison with known chemical-disease relationships and analysis of transcriptional regulation from gene expression datasets for several environmental factors and phenotypes clustered in a metabolic syndrome and neuropsychiatric subnetwork supports our network hypotheses. This analysis identifies natural and synthetic retinoids, antipsychotic medications, omega-3 fatty acids, and pyrethroid pesticides as potential environmental modulators of metabolic syndrome phenotypes through PPAR and adipocytokine signaling, and organophosphate pesticides as potential environmental modulators of neuropsychiatric phenotypes. CONCLUSION: Identification of key regulatory pathways that integrate genetic and environmental modulators defines disease-associated targets that will allow for efficient screening of large numbers of environmental factors, screening that could set priorities for further research and guide public health decisions.
Affiliation(s)
- Julia M Gohlke
- Environmental Systems Biology Group, Laboratory of Molecular Toxicology, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Reuben Thomas
- Environmental Systems Biology Group, Laboratory of Molecular Toxicology, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
- Yonqing Zhang
- Gene Expression and Genomics Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA
- Michael C Rosenstein
- Department of Bioinformatics, Mount Desert Island Biological Laboratory, Old Bar Harbor Road, Salisbury Cove, ME 04672, USA
- Allan P Davis
- Department of Bioinformatics, Mount Desert Island Biological Laboratory, Old Bar Harbor Road, Salisbury Cove, ME 04672, USA
- Cynthia Murphy
- Department of Bioinformatics, Mount Desert Island Biological Laboratory, Old Bar Harbor Road, Salisbury Cove, ME 04672, USA
- Kevin G Becker
- Gene Expression and Genomics Unit, National Institute on Aging, National Institutes of Health, Baltimore, MD 21224, USA
- Carolyn J Mattingly
- Department of Bioinformatics, Mount Desert Island Biological Laboratory, Old Bar Harbor Road, Salisbury Cove, ME 04672, USA
- Christopher J Portier
- Environmental Systems Biology Group, Laboratory of Molecular Toxicology, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA