1
Li S, Pan T, Xu G, Gao Y, Zhang Y, Xu Q, Pan J, Zhou W, Xu J, Li Q, Li Y. Deep immunophenotyping reveals clinically distinct cellular states and ecosystems in large-scale colorectal cancer. Commun Biol 2023; 6:785. [PMID: 37500893 PMCID: PMC10374645 DOI: 10.1038/s42003-023-05117-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 07/07/2023] [Indexed: 07/29/2023] Open
Abstract
Determining the diverse cell types in the tumor microenvironment (TME) and their organization into cellular communities is critical for understanding the biological heterogeneity and therapy of cancer. Here, we deeply immunophenotype colorectal cancer (CRC) by integrative analysis of large-scale bulk and single-cell transcriptomes of 2350 patients and 53,137 cells. A rich landscape of 42 cellular states and 7 ecosystems in TMEs is uncovered and extends the previous immune classifications of CRC. Analysis of functional pathways and potential transcriptional regulators of cellular states and ecosystems reveals cancer hallmark-related pathways and several critical transcription factors in CRC. Through high-resolution characterization of the TMEs, we discover the potential utility of cellular states (i.e., Monocytes/Macrophages and CD8 T cells) and ecosystems for prognosis and clinical therapy selection in CRC. Together, our results expand our understanding of cellular organization in TMEs of CRC, with potential implications for the development of biomarkers and precision therapies.
Affiliation(s)
- Si Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
  - School of Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, 150081, China
- Tao Pan
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
- Gang Xu
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Yueying Gao
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
  - School of Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, 150081, China
- Ya Zhang
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
- Qi Xu
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
- Jiwei Pan
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China
- Weiwei Zhou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
- Juan Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China.
- Qifu Li
  - The First Affiliated Hospital, Hainan Medical University, Haikou, 571199, China.
- Yongsheng Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Women and Children's Medical Center, Hainan Medical University, Haikou, 571199, China.
  - School of Interdisciplinary Medicine and Engineering, Harbin Medical University, Harbin, 150081, China.
2
White M, Arif-Pardy J, Connor KL. Identification of novel nutrient-sensitive gene regulatory networks in amniocytes from fetuses with spina bifida. Reprod Toxicol 2023; 116:108333. [PMID: 36584796 DOI: 10.1016/j.reprotox.2022.12.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2022] [Revised: 12/14/2022] [Accepted: 12/26/2022] [Indexed: 12/28/2022]
Abstract
Neural tube defects (NTDs) remain among the most common congenital anomalies. Contributing risk factors include genetics and nutrient deficiencies; however, a comprehensive assessment of nutrient-gene interactions in NTDs is lacking. We applied a nutrient-focused gene expression analysis pipeline to identify nutrient-sensitive gene regulatory networks in amniocyte gene expression data (GSE4182) from fetuses with NTDs (cases; n = 3) and fetuses with no congenital anomalies (controls; n = 5). Differentially expressed genes (DEGs) were screened for having nutrient cofactors. Nutrient-dependent transcriptional regulators (TRs) that regulated DEGs, and nutrient-sensitive miRNAs with a previous link to NTDs, were identified. Of the 880 DEGs in cases, 10% had at least one nutrient cofactor. DEG regulatory network analysis revealed that 39% and 52% of DEGs in cases were regulated by 22 nutrient-sensitive miRNAs and 10 nutrient-dependent TRs, respectively. Zinc- and B vitamin-dependent gene regulatory networks (Zinc: 10 TRs targeting 50.6% of DEGs; B vitamins: 4 TRs targeting 37.7% of DEGs, 9 miRNAs targeting 17.6% of DEGs) were dysregulated in cases. We identified novel, nutrient-sensitive gene regulatory networks not previously linked to NTDs, which may indicate new targets to explore for NTD prevention or to optimise fetal development.
Affiliation(s)
- Marina White
  - Health Sciences, Carleton University, 1125 Colonel By Dr, Ottawa K1S 5B6, ON, Canada
- Jayden Arif-Pardy
  - Health Sciences, Carleton University, 1125 Colonel By Dr, Ottawa K1S 5B6, ON, Canada
- Kristin L Connor
  - Health Sciences, Carleton University, 1125 Colonel By Dr, Ottawa K1S 5B6, ON, Canada.
3
Gillis RF, Palmour RM. mRNA expression analysis of the hippocampus in a vervet monkey model of fetal alcohol spectrum disorder. J Neurodev Disord 2022; 14:21. [PMID: 35305552 PMCID: PMC8934503 DOI: 10.1186/s11689-022-09427-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2021] [Accepted: 02/10/2022] [Indexed: 11/12/2022] Open
Abstract
Background Fetal alcohol spectrum disorders (FASD) are common yet preventable developmental disorders that stem from prenatal exposure to alcohol. This exposure leads to a wide array of behavioural and physical problems with a complex and poorly defined biological basis. Molecular investigations to date predominantly use rodent models, but because of genetic, developmental and social behavioural similarity to humans, primate models are more relevant. We previously reported reduced cortical and hippocampal neuron levels in an Old World monkey (Chlorocebus sabaeus) model with ethanol exposure targeted to the period of rapid synaptogenesis, and we report here an initial molecular study of this model. The goal of this study was to evaluate mRNA expression in the hippocampus at two different behavioural stages (5 months, 2 years) corresponding to human infancy and early childhood. Methods Dams of the offspring studied (alcohol-preferring or control) drank a maximum of 3.5 g ethanol per kg body weight or a calorically matched sucrose solution 4 days per week during the last 2 months of gestation. Total mRNA expression was measured with the Affymetrix GeneChip Rhesus Macaque Genome Array in a 2 × 2 study design that interrogated two independent variables: age at sacrifice and alcohol consumption during gestation. Results and discussion Statistical analysis identified a preferential downregulation of expression when interrogating the factor ‘alcohol’, with a balanced effect of upregulation vs. downregulation for the independent variable ‘age’. Functional exploration of both independent variables shows that the alcohol consumption factor generates broad functional annotation clusters that likely implicate a role for epigenetics in the observed differential expression, while the variable age reliably produced functional annotation clusters predominantly related to development. Furthermore, our data reveal a novel connection between EFNB1 and the FASDs; this is highly plausible due both to the role of EFNB1 in neuronal development and to its central role in craniofrontonasal syndrome (CFNS). Fold changes for key genes were subsequently confirmed via qRT-PCR. Conclusion Prenatal alcohol exposure leads to global downregulation in mRNA expression. The cellular interference model of EFNB1 provides a potential clue regarding how genetically susceptible individuals may develop the phenotypic triad generally associated with classic fetal alcohol syndrome. Supplementary Information The online version contains supplementary material available at 10.1186/s11689-022-09427-z.
4
Identification of shared molecular signatures between multiple sclerosis and Parkinson's disease using systems biology approach. GENE REPORTS 2022. [DOI: 10.1016/j.genrep.2022.101604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
5
Silveira DA, Ribeiro FM, Simão ÉM, Mattos VLD, Góes EG. Expression of genes and pathways associated with the B7-CD28 superfamily in response to irradiation of blood cells using 137Cs. Int J Radiat Biol 2020; 97:149-155. [PMID: 33253600 DOI: 10.1080/09553002.2021.1857454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]
Abstract
PURPOSE DNA damage is one of the main consequences of exposure to ionizing irradiation (IR). Recent studies indicate that IR can modulate the expression of immune system-related genes. However, the effects of IR on the expression of genes and pathways of the B7-CD28 superfamily remain poorly defined. The aim of this study was to evaluate the modulation of genes and pathways related to the B7-CD28 superfamily in response to IR. MATERIALS AND METHODS We used transcriptome data available from the Gene Expression Omnibus (GEO) database to investigate the response of genes and pathways in samples of human peripheral blood irradiated with doses of 150, 300, and 600 cGy. The data were obtained at 6 and 24 h after irradiation. The relationship between genes and pathways was established through the Reactome database. The behavior of these pathways was analyzed using mathematical methods based on relative activity and diversity. Analysis of variance (ANOVA) followed by multiple comparison tests (Bonferroni and Tamhane's) was used to identify differentially expressed genes. Transcriptome data were analyzed with ViaComplex V.1.0 and IBM SPSS Statistics 22. RESULTS The doses studied significantly modified the behavior of five pathways associated with the immune system. The dose of 300 cGy might trigger signaling for the activation of T cells through the negative regulation (p < .05) of the co-inhibitory PDCD1LG2 gene. We also observed positive regulation (p < .05) of the CD80 receptor at 300 cGy, which might be related to a stimulatory signal. According to our findings, this dose induced the production of cytokines and genes that are associated with the activation and differentiation of T cells. CONCLUSIONS Our findings indicate that the irradiation modulated the organization of the biological system, suggesting that 300 cGy is more efficient in activating the immune system.
Affiliation(s)
- Daner A Silveira
  - Institute of Mathematics, Statistics and Physics, Federal University of Rio Grande, Rio Grande, Brazil
- Fernanda M Ribeiro
  - Institute of Mathematics, Statistics and Physics, Federal University of Rio Grande, Rio Grande, Brazil
- Éder M Simão
  - Nanoscience Graduate Program, Franciscan University, Santa Maria, Brazil
- Viviane L D Mattos
  - Institute of Mathematics, Statistics and Physics, Federal University of Rio Grande, Rio Grande, Brazil
- Evamberto G Góes
  - Institute of Mathematics, Statistics and Physics, Federal University of Rio Grande, Rio Grande, Brazil
6
Shin J, Lee YM, Oh J, Jung S, Oh JW. Effects of gamma-aminobutyric acid and piperine on gene regulation in pig kidney epithelial cell lines. ASIAN-AUSTRALASIAN JOURNAL OF ANIMAL SCIENCES 2020; 33:1497-1506. [PMID: 32054169 PMCID: PMC7468175 DOI: 10.5713/ajas.19.0745] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/25/2019] [Accepted: 12/14/2019] [Indexed: 12/24/2022]
Abstract
Objective Gamma-aminobutyric acid (GABA) and piperine (PIP) are both nutritional supplements with potential use in animal diets. The purpose of this study was to investigate the effect of GABA and/or PIP treatment on the gene expression pattern of a pig kidney epithelial cell line. Methods LLCPK1 cells were treated with GABA, PIP, or both, and the gene expression pattern was then analyzed by microarray. Gene ontology analysis was done using GeneOntology (Geneontology.org), and validation was performed using quantitative real-time polymerase chain reaction. Results Gene ontology enrichment analysis was used to identify key pathway(s) of genes whose expression levels were regulated by these treatments. Microarray results showed that GABA had a positive effect on the transcription of genes related to regulation of erythrocyte differentiation and that GABA and PIP in combination had a synergistic effect on genes related to immune systems and processes. Furthermore, we found that the effects of GABA and/or PIP on these selected genes were controlled by the JNK/p38 MAPK pathway. Conclusion These results can improve our understanding of the mechanisms involved in the effect of GABA and/or PIP treatment on pig kidney epithelial cells. They can also help us evaluate the potential of these supplements for clinical diagnosis and treatment.
Affiliation(s)
- Juhyun Shin
  - Department of Stem Cell and Regenerative Biotechnology, KIT, Konkuk University, Seoul, 05029 Korea
- Yoon-Mi Lee
  - Department of Stem Cell and Regenerative Biotechnology, KIT, Konkuk University, Seoul, 05029 Korea
- Jeongheon Oh
  - Department of Stem Cell and Regenerative Biotechnology, KIT, Konkuk University, Seoul, 05029 Korea
- Seunghwa Jung
  - Department of Stem Cell and Regenerative Biotechnology, KIT, Konkuk University, Seoul, 05029 Korea
- Jae-Wook Oh
  - Department of Stem Cell and Regenerative Biotechnology, KIT, Konkuk University, Seoul, 05029 Korea
7
Agrahari R, Foroushani A, Docking TR, Chang L, Duns G, Hudoba M, Karsan A, Zare H. Applications of Bayesian network models in predicting types of hematological malignancies. Sci Rep 2018; 8:6951. [PMID: 29725024 PMCID: PMC5934387 DOI: 10.1038/s41598-018-24758-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 10/04/2017] [Accepted: 04/05/2018] [Indexed: 12/17/2022] Open
Abstract
Network analysis is the preferred approach for the detection of subtle but coordinated changes in expression of an interacting and related set of genes. We introduce a novel method based on the analyses of coexpression networks and Bayesian networks, and we use this new method to classify two types of hematological malignancies, namely acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Our classifier has an accuracy of 93%, a precision of 98%, and a recall of 90% on the training dataset (n = 366), which outperforms the results reported by other scholars on the same dataset. Although our training dataset consists of microarray data, our model has a remarkable performance on the RNA-Seq test dataset (n = 74, accuracy = 89%, precision = 88%, recall = 98%), which confirms that eigengenes are robust with respect to expression profiling technology. These signatures are useful in classification and in correctly predicting the diagnosis. They might also provide valuable information about the underlying biology of diseases. Our network analysis approach is generalizable and can be useful for classifying other diseases based on gene expression profiles. Our previously published Pigengene package is publicly available through Bioconductor and can be used to conveniently fit a Bayesian network to gene expression data.
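The abstract above reports accuracy, precision, and recall for a two-class (AML vs. MDS) classifier. As a reminder of how those three figures relate to a confusion matrix, here is a minimal sketch. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision and recall from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives for the class treated as "positive".
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts, NOT the study's data.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print(metrics)
```

Precision and recall can move in opposite directions between datasets, as they do between the training and test figures quoted in the abstract, because they divide the same true-positive count by different denominators.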
Affiliation(s)
- Rupesh Agrahari
  - Department of Computer Science, Texas State University, San Marcos, Texas, 78666, USA
- Amir Foroushani
  - Department of Computer Science, Texas State University, San Marcos, Texas, 78666, USA
- T Roderick Docking
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Linda Chang
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Gerben Duns
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Monika Hudoba
  - Department of Pathology and Laboratory Medicine, Vancouver General Hospital, Vancouver, British Columbia, V5Z 1M9, Canada
- Aly Karsan
  - Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Habil Zare
  - Department of Computer Science, Texas State University, San Marcos, Texas, 78666, USA.
  - Department of Cell Systems & Anatomy, The University of Texas Health Science Center, San Antonio, Texas, 78229, USA.
8
Heßelbach K, Kim GJ, Flemming S, Häupl T, Bonin M, Dornhof R, Günther S, Merfort I, Humar M. Disease relevant modifications of the methylome and transcriptome by particulate matter (PM 2.5) from biomass combustion. Epigenetics 2017; 12:779-792. [PMID: 28742980 PMCID: PMC5739103 DOI: 10.1080/15592294.2017.1356555] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Indexed: 12/19/2022] Open
Abstract
Exposure to particulate matter (PM) is recognized as a major health hazard, but molecular responses are still insufficiently described. We analyzed the epigenetic impact of ambient PM2.5 from biomass combustion on the methylome of primary human bronchial epithelial BEAS-2B cells using the Illumina HumanMethylation450 BeadChip. The transcriptome was determined by the Affymetrix HG-U133 Plus 2.0 Array. PM2.5 induced genome-wide alterations of the DNA methylation pattern, including differentially methylated CpGs in the promoter region associated with CpG islands. Gene ontology analysis revealed that differentially methylated genes were significantly clustered in pathways associated with the extracellular matrix, cellular adhesion, function of GTPases, and responses to extracellular stimuli, or were involved in ion binding and shuttling. Differential methylation also affected tandem repeats. Additionally, 45 different miRNA CpG loci showed differential DNA methylation, most of them proximal to their promoter. These miRNAs are functionally relevant for lung cancer, inflammation, asthma, and other PM-associated diseases. Correlation of the methylome and transcriptome demonstrated a clear bias toward transcriptional activation by hypomethylation. Genes that exhibited both differential methylation and expression were functionally linked to cytokine and immune responses, cellular motility, angiogenesis, inflammation, wound healing, cell growth, differentiation and development, or responses to exogenous matter. Disease ontology of differentially methylated and expressed genes indicated their prominent role in lung cancer and their participation in dominant cancer-related signaling pathways. Thus, in lung epithelial cells, PM2.5 alters the methylome of genes and noncoding transcripts or elements that might be relevant for PM- and lung-associated diseases.
Affiliation(s)
- Katharina Heßelbach
  - Pharmaceutical Biology and Biotechnology, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Gwang-Jin Kim
  - Pharmaceutical Bioinformatics, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Stephan Flemming
  - Pharmaceutical Bioinformatics, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Thomas Häupl
  - Department of Rheumatology and Clinical Immunology, Charité University Hospital Berlin, Germany
- Marc Bonin
  - Pharmaceutical Biology and Biotechnology, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Regina Dornhof
  - Pharmaceutical Biology and Biotechnology, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Stefan Günther
  - Pharmaceutical Bioinformatics and Freiburg Institute for Advanced Studies (FRIAS), Albert-Ludwigs University Freiburg, Freiburg, Germany
- Irmgard Merfort
  - Pharmaceutical Biology and Biotechnology, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Matjaz Humar
  - Pharmaceutical Biology and Biotechnology, Albert-Ludwigs-University Freiburg, Freiburg, Germany
9
Hira ZM, Gillies DF. Identifying Significant Features in Cancer Methylation Data Using Gene Pathway Segmentation. Cancer Inform 2016; 15:189-98. [PMID: 27688706 PMCID: PMC5030825 DOI: 10.4137/cin.s39859] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/04/2016] [Revised: 06/19/2016] [Accepted: 07/03/2016] [Indexed: 12/19/2022] Open
Abstract
To provide the most effective therapy for cancer, it is important to be able to predict whether a patient's cancer will respond to a proposed treatment. Methylation profiling could contain information from which such predictions could be made. Currently, hypothesis testing is used to determine whether possible biomarkers for cancer progression produce statistically significant results. However, this approach requires the identification of individual genes, or sets of genes, as candidate hypotheses, and with the increasing size of modern microarrays, this task is becoming progressively harder. Exhaustive testing of small sets of genes is computationally infeasible, and so hypothesis generation depends either on the use of established biological knowledge or on heuristic methods. As an alternative, machine learning methods can be used to identify groups of genes that are acting together within sets of cancer data and associate their behaviors with cancer progression. These methods have the advantage of being multivariate and unbiased but unfortunately also rapidly become computationally infeasible as the number of gene probes and datasets increases. To address this problem, we have investigated a way of utilizing prior knowledge to segment microarray datasets in such a way that machine learning can be used to identify candidate sets of genes for hypothesis testing. A methylation dataset is divided into subsets, where each subset contains only the probes that relate to a known gene pathway. Each of these pathway subsets is used independently for classification. The classification method is AdaBoost with decision trees as weak classifiers. Since each pathway subset contains a relatively small number of gene probes, it is possible to train and test its classification accuracy quickly and determine whether it has valuable diagnostic information. Finally, genes from successful pathway subsets can be combined to create a classifier of high accuracy.
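The segmentation step described in the abstract above, dividing a methylation dataset so that each subset holds only the probes tied to one known pathway, can be sketched in a few lines. This is an illustrative sketch only: the function name, the probe IDs, the probe-to-gene mapping, and the pathway memberships are all hypothetical toy values, and the classification step (AdaBoost with decision trees in the paper) is not shown.

```python
def segment_by_pathway(data, probe_to_gene, pathways):
    """Split probe-level data into one subset per pathway.

    data:          dict mapping probe ID -> list of per-sample values
    probe_to_gene: dict mapping probe ID -> gene symbol
    pathways:      dict mapping pathway name -> set of gene symbols
    Pathways with no matching probes are dropped.
    """
    subsets = {}
    for name, genes in pathways.items():
        # Keep only probes whose annotated gene belongs to this pathway.
        probes = sorted(p for p in data if probe_to_gene.get(p) in genes)
        if probes:
            subsets[name] = {p: data[p] for p in probes}
    return subsets


# Toy inputs (hypothetical, for illustration only).
data = {
    "cg01": [0.1, 0.9],
    "cg02": [0.2, 0.8],
    "cg03": [0.5, 0.5],
    "cg04": [0.7, 0.3],
}
probe_to_gene = {"cg01": "TP53", "cg02": "MDM2", "cg03": "BRCA1", "cg04": "TP53"}
pathways = {
    "p53_signaling": {"TP53", "MDM2"},
    "dna_repair": {"BRCA1", "TP53"},
    "empty_pathway": {"GENE_X"},
}

subsets = segment_by_pathway(data, probe_to_gene, pathways)
for name, subset in subsets.items():
    print(name, sorted(subset))
```

Because a gene can belong to several pathways, the same probe may appear in more than one subset; each subset would then be passed to a classifier independently.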
Affiliation(s)
- Zena M. Hira
- Department of Computing, Imperial College London, London, UK
10
Williams EG, Wu Y, Jha P, Dubuis S, Blattmann P, Argmann CA, Houten SM, Amariuta T, Wolski W, Zamboni N, Aebersold R, Auwerx J. Systems proteomics of liver mitochondria function. Science 2016; 352:aad0189. [PMID: 27284200 PMCID: PMC10859670 DOI: 10.1126/science.aad0189] [Citation(s) in RCA: 212] [Impact Index Per Article: 26.5] [Received: 07/15/2015] [Accepted: 04/15/2016] [Indexed: 12/14/2022]
Abstract
Recent improvements in quantitative proteomics approaches, including Sequential Window Acquisition of all Theoretical Mass Spectra (SWATH-MS), permit reproducible large-scale protein measurements across diverse cohorts. Together with genomics, transcriptomics, and other technologies, transomic data sets can be generated that permit detailed analyses across broad molecular interaction networks. Here, we examine mitochondrial links to liver metabolism through the genome, transcriptome, proteome, and metabolome of 386 individuals in the BXD mouse reference population. Several links were validated between genetic variants toward transcripts, proteins, metabolites, and phenotypes. Among these, sequence variants in Cox7a2l alter its protein's activity, which in turn leads to downstream differences in mitochondrial supercomplex formation. This data set demonstrates that the proteome can now be quantified comprehensively, serving as a key complement to transcriptomics, genomics, and metabolomics--a combination moving us forward in complex trait analysis.
Affiliation(s)
- Evan G Williams
  - Laboratory of Integrative and Systems Physiology, Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, CH-1015, Switzerland. These authors contributed equally to this work
- Yibo Wu
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Switzerland. These authors contributed equally to this work
- Pooja Jha
  - Laboratory of Integrative and Systems Physiology, Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, CH-1015, Switzerland
- Sébastien Dubuis
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Switzerland
- Peter Blattmann
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Switzerland
- Carmen A Argmann
  - Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Sander M Houten
  - Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, Box 1498, New York, NY 10029, USA
- Tiffany Amariuta
  - Laboratory of Integrative and Systems Physiology, Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, CH-1015, Switzerland
- Witold Wolski
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Switzerland
- Nicola Zamboni
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Switzerland
- Ruedi Aebersold
  - Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, CH-8093, Switzerland.
  - Faculty of Science, University of Zurich, CH-8057, Switzerland.
- Johan Auwerx
  - Laboratory of Integrative and Systems Physiology, Interfaculty Institute of Bioengineering, École Polytechnique Fédérale de Lausanne, CH-1015, Switzerland.
11
Bell EH, Chakraborty AR, Mo X, Liu Z, Shilo K, Kirste S, Stegmaier P, McNulty M, Karachaliou N, Rosell R, Bepler G, Carbone DP, Chakravarti A. SMARCA4/BRG1 Is a Novel Prognostic Biomarker Predictive of Cisplatin-Based Chemotherapy Outcomes in Resected Non-Small Cell Lung Cancer. Clin Cancer Res 2015; 22:2396-404. [PMID: 26671993 DOI: 10.1158/1078-0432.ccr-15-1468] [Citation(s) in RCA: 101] [Impact Index Per Article: 11.2] [Received: 06/21/2015] [Accepted: 12/06/2015] [Indexed: 01/18/2023]
Abstract
PURPOSE Identification of predictive biomarkers is critically needed to improve selection of patients who derive the most benefit from platinum-based chemotherapy. We hypothesized that decreased expression of SMARCA4/BRG1, a known regulator of transcription and DNA repair, is a novel predictive biomarker of increased sensitivity to adjuvant platinum-based therapies in non-small cell lung cancer (NSCLC). EXPERIMENTAL DESIGN The prognostic value was tested using a gene-expression microarray from the Director's Challenge Lung Study (n = 440). The predictive significance of SMARCA4 was determined using a gene-expression microarray (n = 133) from control and treatment arms of the JBR.10 trial of adjuvant cisplatin/vinorelbine. Kaplan-Meier method and log-rank tests were used to estimate and test the differences of probabilities in overall survival (OS) and disease-specific survival (DSS) between expression groups and treatment arms. Multivariate Cox regression models were used while adjusting for other clinical covariates. RESULTS In the Director's Challenge Study, reduced expression of SMARCA4 was associated with poor OS compared with high and intermediate expression (P < 0.001 and P = 0.009, respectively). In multivariate analysis, compared with low, high SMARCA4 expression predicted a decrease in risk of death [HR, 0.6; 95% confidence interval (CI), 0.4-0.8; P = 0.002]. In the JBR.10 trial, improved 5-year DSS was noted only in patients with low SMARCA4 expression when treated with adjuvant cisplatin/vinorelbine [HR, 0.1; 95% CI, 0.0-0.5, P = 0.002 (low); HR, 1.0; 95% CI, 0.5-2.3, P = 0.92 (high)]. An interaction test was highly significant (P = 0.01). CONCLUSIONS Low expression of SMARCA4/BRG1 is significantly associated with worse prognosis; however, it is a novel significant predictive biomarker for increased sensitivity to platinum-based chemotherapy in NSCLC. Clin Cancer Res; 22(10); 2396-404. ©2015 AACR.
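The abstract above relies on the Kaplan-Meier method to estimate survival probabilities per expression group. For readers unfamiliar with it, the product-limit estimate can be sketched as below. This is a generic textbook illustration under stated assumptions, not the authors' analysis code; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times:  follow-up time for each subject
    events: 1 if the event (e.g. death) was observed at that time,
            0 if the subject was censored
    Returns a list of (time, survival_probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths + censored at t
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (arbitrary time units), NOT from the study.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects lower the number at risk without lowering the survival estimate, which is why the curve only steps down at observed event times. Comparing curves between groups would additionally need a log-rank test, which is not shown here.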
Affiliation(s)
- Erica Hlavin Bell
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio
- Arup R Chakraborty
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio
- Xiaokui Mo
- Center for Biostatistics, The Ohio State University Wexner Medical Center, Columbus, Ohio
- Ziyan Liu
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio
- Konstantin Shilo
- Department of Pathology, The Ohio State University Wexner Medical Center, Columbus, Ohio
- Simon Kirste
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio. Department of Radiation Oncology, University Medical Center Freiburg, Freiburg, Germany
- Petra Stegmaier
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio. Department of Radiation Oncology, University Medical Center Freiburg, Freiburg, Germany
- Maureen McNulty
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio
- Niki Karachaliou
- Translational Research Unit, Dr. Rosell Oncology Institute, Quirón Dexeus University Hospital, Barcelona, Spain
- Rafael Rosell
- Translational Research Unit, Dr. Rosell Oncology Institute, Quirón Dexeus University Hospital, Barcelona, Spain. Catalan Institute of Oncology, Badalona, Barcelona, Spain
- Gerold Bepler
- Barbara Ann Karmanos Cancer Institute, Wayne State University, Detroit, Michigan
- David P Carbone
- Department of Internal Medicine, The Ohio State University Wexner Medical Center, Columbus, Ohio
- Arnab Chakravarti
- Department of Radiation Oncology, Arthur G. James Hospital/Ohio State Comprehensive Cancer Center, Columbus, Ohio
12
Marakhonov A, Sadovskaya N, Antonov I, Baranova A, Skoblov M. Analysis of discordant Affymetrix probesets casts serious doubt on idea of microarray data reutilization. BMC Genomics 2014; 15 Suppl 12:S8. [PMID: 25563078 PMCID: PMC4303952 DOI: 10.1186/1471-2164-15-s12-s8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022] Open
Abstract
Background Affymetrix microarray technology allows one to investigate expression of thousands of genes simultaneously under a variety of conditions. In the popular U133A microarray platform, the expression of 37% of genes is measured by more than one probeset. The discordant expression observed for two different probesets that match the same gene is a widespread phenomenon which is usually underestimated, ignored or disregarded. Results Here we evaluate the prevalence of discordant expression in data collected using the Affymetrix HG-U133A microarray platform. In U133A, about 30% of genes annotated by two different probesets demonstrate a substantial correlation between independently measured expression values. To our surprise, sorting the probesets according to the nature of the discrepancy in their expression levels allowed the classification of the respective genes according to their fundamental functional properties, including observed enrichment by tissue-specific transcripts and alternatively spliced variants. On the other hand, an absence of discrepancies in probesets that simultaneously match several different genes allowed us to pinpoint non-expressed pseudogenes and gene groups with highly correlated expression patterns. Nevertheless, in many cases, the nature of discordant expression of two probesets that match the same transcript remains unexplained. It is possible that these probesets report differently regulated sets of transcripts, or, in the best-case scenario, two different sets of transcripts that represent the same gene. Conclusion The majority of absolute gene expression values collected using Affymetrix microarrays may not be suitable for typical interpretative downstream analysis.
13
Johnston A, Guzman AM, Swindell WR, Wang F, Kang S, Gudjonsson JE. Early tissue responses in psoriasis to the antitumour necrosis factor-α biologic etanercept suggest reduced interleukin-17 receptor expression and signalling. Br J Dermatol 2014; 171:97-107. [PMID: 24601997 DOI: 10.1111/bjd.12937] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Accepted: 02/24/2014] [Indexed: 12/20/2022]
Abstract
BACKGROUND Antitumour necrosis factor (anti-TNF)-α therapy has made a significant impact on the treatment of psoriasis. Despite these agents being designed to neutralize TNF-α activity, their mechanism of action in the resolution of psoriasis remains unclear. OBJECTIVES To understand better the mechanism of action of etanercept by examining very early changes in the lesional skin of patients with psoriasis responding to etanercept. METHODS Twenty patients with chronic plaque psoriasis were enrolled and received etanercept 50 mg twice weekly. Skin biopsies were obtained before treatment and on days 1, 3, 7 and 14 post-treatment. Skin mRNA expression was analysed by quantitative reverse-transcription polymerase chain reaction and microarray; cytokine and phosphoprotein levels were assessed using multiplexed bead arrays. RESULTS In etanercept responders, we observed no significant changes in interleukin (IL)-17A, IL-22 or interferon-γ mRNA or protein in the first week of treatment; however, there was a 2·5-fold downregulation of IL-17 receptor C (IL-17RC) mRNA (P < 0·05) after day 1, accompanied by decreased extracellular signal-regulated kinase-1/2 phosphorylation. Transcriptional analysis revealed that genes suppressed by etanercept significantly overlapped with IL-17A-induced genes, and a marked overlap was also observed between the genes suppressed by etanercept and by the anti-IL-17A agent ixekizumab. Finally we show that TNF-α enhances the expression of IL-17RC, and short hairpin RNA inhibition of IL-17R expression abrogates synergistic gene induction by TNF and IL-17A. CONCLUSIONS These results suggest that the early responses of psoriasis plaques to etanercept may be due to decreased tissue responsiveness to IL-17A due to suppressed IL-17RC expression in keratinocytes, blunting the strong synergy between TNF-α and IL-17, which contributes to the maintenance of psoriasis lesions.
Affiliation(s)
- A Johnston
- Department of Dermatology, University of Michigan, Ann Arbor, MI, 48109, U.S.A
14
Swindell WR, Stuart PE, Sarkar MK, Voorhees JJ, Elder JT, Johnston A, Gudjonsson JE. Cellular dissection of psoriasis for transcriptome analyses and the post-GWAS era. BMC Med Genomics 2014; 7:27. [PMID: 24885462 PMCID: PMC4060870 DOI: 10.1186/1755-8794-7-27] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 11/03/2013] [Accepted: 05/16/2014] [Indexed: 12/20/2022] Open
Abstract
Background Genome-scale studies of psoriasis have been used to identify genes of potential relevance to disease mechanisms. For many identified genes, however, the cell type mediating disease activity is uncertain, which has limited our ability to design gene functional studies based on genomic findings. Methods We identified differentially expressed genes (DEGs) with altered expression in psoriasis lesions (n = 216 patients), as well as candidate genes near susceptibility loci from psoriasis GWAS studies. These gene sets were characterized based upon their expression across 10 cell types present in psoriasis lesions. Susceptibility-associated variation at intergenic (non-coding) loci was evaluated to identify sites of allele-specific transcription factor binding. Results Half of DEGs showed highest expression in skin cells, although the dominant cell type differed between psoriasis-increased DEGs (keratinocytes, 35%) and psoriasis-decreased DEGs (fibroblasts, 33%). In contrast, psoriasis GWAS candidates tended to have highest expression in immune cells (71%), with a significant fraction showing maximal expression in neutrophils (24%, P < 0.001). By identifying candidate cell types for genes near susceptibility loci, we could identify and prioritize SNPs at which susceptibility variants are predicted to influence transcription factor binding. This led to the identification of potentially causal (non-coding) SNPs for which susceptibility variants influence binding of AP-1, NF-κB, IRF1, STAT3 and STAT4. Conclusions These findings underscore the role of innate immunity in psoriasis and highlight neutrophils as a cell type linked with pathogenetic mechanisms. Assignment of candidate cell types to genes emerging from GWAS studies provides a first step towards functional analysis, and we have proposed an approach for generating hypotheses to explain GWAS hits at intergenic loci.
Affiliation(s)
- William R Swindell
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, MI 48109-2200, USA.
15
Swindell WR, Xing X, Voorhees JJ, Elder JT, Johnston A, Gudjonsson JE. Integrative RNA-seq and microarray data analysis reveals GC content and gene length biases in the psoriasis transcriptome. Physiol Genomics 2014; 46:533-46. [PMID: 24844236 DOI: 10.1152/physiolgenomics.00022.2014] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 01/20/2023] Open
Abstract
Gene expression profiling of psoriasis has driven research advances and may soon provide the basis for clinical applications. For expression profiling studies, RNA-seq is now a competitive technology, but RNA-seq results may differ from those obtained by microarray. We therefore compared findings obtained by RNA-seq with those from eight microarray studies of psoriasis. RNA-seq and microarray datasets identified similar numbers of differentially expressed genes (DEGs), with certain genes uniquely identified by each technology. Correspondence between platforms and the balance of increased to decreased DEGs was influenced by mRNA abundance, GC content, and gene length. Weakly expressed genes, genes with low GC content, and long genes were all biased toward decreased expression in psoriasis lesions. The strength of these trends differed among array datasets, most likely due to variations in RNA quality. Gene length bias was by far the strongest trend and was evident in all datasets regardless of the expression profiling technology. The effect was due to differences between lesional and uninvolved skin with respect to the genome-wide correlation between gene length and gene expression, which was consistently more negative in psoriasis lesions. These findings demonstrate the complementary nature of RNA-seq and microarray technology and show that integrative analysis of both data types can provide a richer view of the transcriptome than strict reliance on a single method alone. Our results also highlight factors affecting correspondence between technologies, and we have established that gene length is a major determinant of differential expression in psoriasis lesions.
Affiliation(s)
- William R Swindell
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan
- Xianying Xing
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan
- John J Voorhees
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan
- James T Elder
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan
- Andrew Johnston
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan
- Johann E Gudjonsson
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan
16
Okyere J, Oppon E, Dzidzienyo D, Sharma L, Ball G. Cross-species gene expression analysis of species specific differences in the preclinical assessment of pharmaceutical compounds. PLoS One 2014; 9:e96853. [PMID: 24823806 PMCID: PMC4019543 DOI: 10.1371/journal.pone.0096853] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 07/30/2013] [Accepted: 04/11/2014] [Indexed: 01/11/2023] Open
Abstract
Animals are frequently used as model systems for determination of safety and efficacy in pharmaceutical research and development. However, significant quantitative and qualitative differences exist between humans and the animal models used in research. This is a result of genetic variation between humans and laboratory animals. Therefore the development of a system that would allow the assessment of all molecular differences between species after drug exposure would have a significant impact on drug evaluation for toxicity and efficacy. Here we describe a cross-species microarray methodology that identifies and selects orthologous probes after cross-species sequence comparison to develop an orthologous cross-species gene expression analysis tool. The assumption made in using this orthologous gene expression strategy for cross-species extrapolation is that conserved changes in gene expression equate to conserved pharmacodynamic endpoints. This assumption is supported by the fact that evolution and selection have maintained the structure and function of many biochemical pathways over time, resulting in the conservation of many important processes. We demonstrate this cross-species methodology by investigating species-specific differences of the peroxisome proliferator-activated receptor (PPAR) α response in rat and human.
Affiliation(s)
- John Okyere
- CrossGen Limited, BioCity Nottingham, Pennyfoot Street, Nottingham, United Kingdom
- Ekow Oppon
- CrossGen Limited, BioCity Nottingham, Pennyfoot Street, Nottingham, United Kingdom
- Daniel Dzidzienyo
- CrossGen Limited, BioCity Nottingham, Pennyfoot Street, Nottingham, United Kingdom
- Lav Sharma
- CrossGen Limited, BioCity Nottingham, Pennyfoot Street, Nottingham, United Kingdom
- Graham Ball
- John Van Geest Cancer Research Centre, Nottingham Trent University, Clifton Campus, Clifton Lane, Nottingham, United Kingdom
17
Zhao S, Fung-Leung WP, Bittner A, Ngo K, Liu X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS One 2014; 9:e78644. [PMID: 24454679 PMCID: PMC3894192 DOI: 10.1371/journal.pone.0078644] [Citation(s) in RCA: 603] [Impact Index Per Article: 60.3] [Received: 05/24/2013] [Accepted: 09/13/2013] [Indexed: 12/18/2022] Open
Abstract
To demonstrate the benefits of RNA-Seq over microarray in transcriptome profiling, both RNA-Seq and microarray analyses were performed on RNA samples from a human T cell activation experiment. In contrast to other reports, our analyses focused on the difference, rather than similarity, between RNA-Seq and microarray technologies in transcriptome profiling. A comparison of data sets derived from RNA-Seq and Affymetrix platforms using the same set of samples showed a high correlation between gene expression profiles generated by the two platforms. However, it also demonstrated that RNA-Seq was superior in detecting low abundance transcripts, differentiating biologically critical isoforms, and allowing the identification of genetic variants. RNA-Seq also demonstrated a broader dynamic range than microarray, which allowed for the detection of more differentially expressed genes with higher fold-change. Analysis of the two datasets also showed the benefit derived from avoidance of technical issues inherent to microarray probe performance such as cross-hybridization, non-specific hybridization and limited detection range of individual probes. Because RNA-Seq does not rely on a pre-designed complement sequence detection probe, it is devoid of issues associated with probe redundancy and annotation, which simplified interpretation of the data. Despite the superior benefits of RNA-Seq, microarrays are still the more common choice of researchers when conducting transcriptional profiling experiments. This is likely because RNA-Seq sequencing technology is new to most researchers, more expensive than microarray, data storage is more challenging and analysis is more complex. We expect that once these barriers are overcome, the RNA-Seq platform will become the predominant tool for transcriptome analysis.
Affiliation(s)
- Shanrong Zhao
- Systems Pharmacology and Biomarkers, Janssen Research & Development, LLC, San Diego, California, United States of America
- Wai-Ping Fung-Leung
- Immunology, Janssen Research & Development, LLC, San Diego, California, United States of America
- Anton Bittner
- C.R.E.A.Te Integrative Systems Biology, Janssen Research & Development, LLC, San Diego, California, United States of America
- Karen Ngo
- Immunology, Janssen Research & Development, LLC, San Diego, California, United States of America
- Xuejun Liu
- Systems Pharmacology and Biomarkers, Janssen Research & Development, LLC, San Diego, California, United States of America
18
Saha A, Tan AC, Kang J. Automatic context-specific subnetwork discovery from large interaction networks. PLoS One 2014; 9:e84227. [PMID: 24392115 PMCID: PMC3877685 DOI: 10.1371/journal.pone.0084227] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 09/10/2013] [Accepted: 11/21/2013] [Indexed: 01/18/2023] Open
Abstract
Genes act in concert via specific networks to drive various biological processes, including progression of diseases such as cancer. Under different phenotypes, different subsets of the gene members of a network participate in a biological process. Single gene analyses are less effective in identifying such core gene members (subnetworks) within a gene set/network, as compared to gene set/network-based analyses. Hence, it is useful to identify a discriminative classifier by focusing on the subnetworks that correspond to different phenotypes. Here we present a novel algorithm to automatically discover the important subnetworks of closely interacting molecules to differentiate between two phenotypes (context) using gene expression profiles. We name it COSSY (COntext-Specific Subnetwork discoverY). It is a non-greedy algorithm and thus unlikely to have local optima problems. COSSY works for any interaction network regardless of the network topology. One added benefit of COSSY is that it can also be used as a highly accurate classification platform which can produce a set of interpretable features.
Affiliation(s)
- Ashis Saha
- Department of Computer Science and Engineering, Korea University, Seoul, Korea
- Aik Choon Tan
- Department of Medicine/Medical Oncology, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Jaewoo Kang
- Department of Computer Science and Engineering, Korea University, Seoul, Korea
- Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Korea
19
Bellis M. Estimating the similarity of alternative Affymetrix probe sets using transcriptional networks. BMC Res Notes 2013; 6:107. [PMID: 23517579 PMCID: PMC3630002 DOI: 10.1186/1756-0500-6-107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2013] [Accepted: 02/28/2013] [Indexed: 11/23/2022] Open
Abstract
Background The usefulness of the data from Affymetrix microarray analysis depends largely on the reliability of the files describing the correspondence between probe sets, genes and transcripts. Particularly, when a gene is targeted by several probe sets, these files should give information about the similarity of each alternative probe set pair. Transcriptional networks integrate the multiple correlations that exist between all probe sets and supply much more information than a simple correlation coefficient calculated for two series of signals. In this study, we used the PSAWN (Probe Set Assignment With Networks) programme we developed to investigate whether similarity of alternative probe sets resulted in some specific properties. Findings PSAWNpy delivered a full textual description of each probe set and information on the number and properties of secondary targets. PSAWNml calculated the similarity of each alternative probe set pair and allowed finding relationships between similarity and localisation of probes in common transcripts or exons. Similar alternative probe sets had very low negative correlation, high positive correlation and similar neighbourhood overlap. Using these properties, we devised a test that allowed grouping similar probe sets in a given network. By considering several networks, additional information concerning the similarity reproducibility was obtained, which allowed defining the actual similarity of alternative probe set pairs. In particular, we calculated the common localisation of probes in exons and in known transcripts and we showed that similarity was correctly correlated with them. The information collected on all pairs of alternative probe sets in the most popular 3’ IVT Affymetrix chips is available in tabular form at http://bns.crbm.cnrs.fr/download.html. Conclusions These processed data can be used to obtain a finer interpretation when comparing microarray data between biological conditions. They are particularly well adapted for searching 3’ alternative poly-adenylation events and can also be useful for studying the structure of transcriptional networks. The PSAWNpy (in Python) and PSAWNml (in Matlab) programmes are freely available and can be downloaded at http://code.google.com/p/arraymatic. Tutorials and reference manuals are available at BMC Research Notes online (Additional file 1) or from http://bns.crbm.cnrs.fr/softwares.html.
20
Swindell WR, Johnston A, Xing X, Voorhees JJ, Elder JT, Gudjonsson JE. Modulation of epidermal transcription circuits in psoriasis: new links between inflammation and hyperproliferation. PLoS One 2013; 8:e79253. [PMID: 24260178 PMCID: PMC3829857 DOI: 10.1371/journal.pone.0079253] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 07/23/2013] [Accepted: 09/19/2013] [Indexed: 12/16/2022] Open
Abstract
Background Whole-genome expression profiling has been used to characterize molecular-level differences between psoriasis lesions and normal skin. Pathway analysis, however, is complicated by the fact that expression profiles have been derived from bulk skin biopsies with RNA derived from multiple cell types. Results We analyzed gene expression across a large sample of psoriatic (PP) and uninvolved/normal (PN) skin biopsies (n = 215 patients). We identified 1975 differentially expressed genes, including 8 associated with psoriasis susceptibility loci. To facilitate pathway analysis, PP versus PN differences in gene expression were analyzed with respect to 235 gene modules, each containing genes with a similar expression pattern in keratinocytes and epidermis. We identified 30 differentially expressed modules (DEMs) biased towards PP-increased or PP-decreased expression. These DEMs were associated with regulatory axes involving cytokines (e.g., IFN-γ, IL-17A, TNF-α), transcription factors (e.g., STAT1, NF-κB, E2F, RUNX1) and chromatin modifiers (SETDB1). We identified an interferon-induced DEM with genes encoding anti-viral proteins (designated “STAT1-57”), which was activated in psoriatic epidermis but repressed following biologic therapy. Genes within this DEM shared a motif near the transcription start site resembling the interferon-stimulated response element (ISRE). Conclusions We analyzed a large patient cohort and developed a new approach for delineating epidermis-specific pathways and regulatory mechanisms that underlie altered gene expression in psoriasis. Our findings highlight previously unrecognized “transcription circuits” that can provide targets for development of non-systemic therapies.
Affiliation(s)
- William R. Swindell
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan, United States of America
- Andrew Johnston
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan, United States of America
- Xianying Xing
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan, United States of America
- John J. Voorhees
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan, United States of America
- James T. Elder
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan, United States of America
- Johann E. Gudjonsson
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, Michigan, United States of America
21
Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies. ISRN BIOINFORMATICS 2013; 2013:481545. [PMID: 25937948 PMCID: PMC4393068 DOI: 10.1155/2013/481545] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 07/08/2013] [Accepted: 08/07/2013] [Indexed: 01/31/2023]
Abstract
RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, and open-source tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of the box to process Illumina RNA-Seq datasets.
22
Seifuddin F, Pirooznia M, Judy JT, Goes FS, Potash JB, Zandi PP. Systematic review of genome-wide gene expression studies of bipolar disorder. BMC Psychiatry 2013; 13:213. [PMID: 23945090 PMCID: PMC3765828 DOI: 10.1186/1471-244x-13-213] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Received: 05/02/2013] [Accepted: 08/13/2013] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Numerous genome-wide gene expression studies of bipolar disorder (BP) have been carried out. These studies are heterogeneous, underpowered and use overlapping samples. We conducted a systematic review of these studies to synthesize the current findings. METHODS We identified all genome-wide gene expression studies on BP in humans. We then carried out a quantitative mega-analysis of studies done with post-mortem brain tissue. We obtained raw data from each study and used standardized procedures to process and analyze the data. We then combined the data and conducted three separate mega-analyses on samples from 1) any region of the brain (9 studies); 2) the prefrontal cortex (PFC) (6 studies); and 3) the hippocampus (2 studies). To minimize heterogeneity across studies, we focused primarily on the most numerous, recent and comprehensive studies. RESULTS A total of 30 genome-wide gene expression studies of BP done with blood or brain tissue were identified. We included 10 studies with data on 211 microarrays on 57 unique BP cases and 229 microarrays on 60 unique controls in the quantitative mega-analysis. A total of 382 genes were identified as significantly differentially expressed by the three analyses. Eleven genes survived correction for multiple testing with a q-value < 0.05 in the PFC. Among these were FKBP5 and WFS1, which have been previously implicated in mood disorders. Pathway analyses suggested a role for metallothionein proteins, MAP kinase phosphatases, and neuropeptides. CONCLUSION We provided an up-to-date summary of results from gene expression studies of the brain in BP. Our analyses focused on the highest quality data available and provided results by brain region so that similarities and differences can be examined relative to disease status. The results are available for closer inspection on-line at Metamoodics [http://metamoodics.igm.jhmi.edu/], where investigators can look up any genes of interest and view the current results in their genomic context and in relation to leading findings from other genomic experiments in bipolar disorder.
Affiliation(s)
- Fayaz Seifuddin
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Mehdi Pirooznia
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Jennifer T Judy
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Fernando S Goes
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
- James B Potash
- Department of Psychiatry, University of Iowa Carver College of Medicine, Iowa City, IA, USA
- Peter P Zandi
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA; Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
23
Swindell WR, Johnston A, Voorhees JJ, Elder JT, Gudjonsson JE. Dissecting the psoriasis transcriptome: inflammatory- and cytokine-driven gene expression in lesions from 163 patients. BMC Genomics 2013; 14:527. [PMID: 23915137 PMCID: PMC3751090 DOI: 10.1186/1471-2164-14-527] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 12/21/2012] [Accepted: 07/31/2013] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Psoriasis lesions are characterized by large-scale shifts in gene expression. Mechanisms that underlie differentially expressed genes (DEGs), however, are not completely understood. We analyzed existing datasets to evaluate genome-wide expression in lesions from 163 psoriasis patients. Our aims were to identify mechanisms that drive differential expression and to characterize heterogeneity among lesions in this large sample. RESULTS We identified 1233 psoriasis-increased DEGs and 977 psoriasis-decreased DEGs. Increased DEGs were attributed to keratinocyte activity (56%) and infiltration of lesions by T-cells (14%) and macrophages (11%). Decreased DEGs, in contrast, were associated with adipose tissue (63%), epidermis (14%) and dermis (4%). KC/epidermis DEGs were enriched for genes induced by IL-1, IL-17A and IL-20 family cytokines, and were also disproportionately associated with AP-1 binding sites. Among all patients, 50% exhibited a heightened inflammatory signature, with increased expression of genes expressed by T-cells, monocytes and dendritic cells. 66% of patients displayed an IFN-γ-strong signature, with increased expression of genes induced by IFN-γ in addition to several other cytokines (e.g., IL-1, IL-17A and TNF). We show that such differences in gene expression can be used to differentiate between etanercept responders and non-responders. CONCLUSIONS Psoriasis DEGs are partly explained by shifts in the cellular composition of psoriasis lesions. Epidermal DEGs, however, may be driven by the activity of AP-1 and cellular responses to IL-1, IL-17A and IL-20 family cytokines. Among patients, we uncovered a range of inflammatory- and cytokine-associated gene expression patterns. Such patterns may provide biomarkers for predicting individual responses to biologic therapy.
Affiliation(s)
- William R Swindell
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, MI 48109-2200, USA
| | - Andrew Johnston
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, MI 48109-2200, USA
| | - John J Voorhees
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, MI 48109-2200, USA
| | - James T Elder
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, MI 48109-2200, USA
| | - Johann E Gudjonsson
- Department of Dermatology, University of Michigan School of Medicine, Ann Arbor, MI 48109-2200, USA
|
24
|
Cohen D, Bogeat-Triboulot MB, Vialet-Chabrand S, Merret R, Courty PE, Moretti S, Bizet F, Guilliot A, Hummel I. Developmental and environmental regulation of Aquaporin gene expression across Populus species: divergence or redundancy? PLoS One 2013; 8:e55506. [PMID: 23393587 PMCID: PMC3564762 DOI: 10.1371/journal.pone.0055506] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 11/08/2012] [Accepted: 12/24/2012] [Indexed: 11/29/2022] Open
Abstract
Aquaporins (AQPs) are membrane channels belonging to the major intrinsic protein family and are known for their ability to facilitate water movement. Although AQP proteins in Populus trichocarpa form a large family encompassing fifty-five genes, most experimental work has focused on a few genes or subfamilies. The current work was undertaken to develop a comprehensive picture of the whole AQP gene family in Populus species by delineating gene expression domains and distinguishing responsiveness to developmental and environmental cues. Since duplication events amplified the poplar AQP family, we addressed the question of expression redundancy between gene duplicates. To this end, we carried out a meta-analysis of all publicly available Affymetrix experiments. Our in-silico strategy controlled for previously identified biases in cross-species transcriptomics, a necessary step for any comparative transcriptomics based on multispecies design chips. Three poplar AQPs were not supported by any expression data, even in a large collection of situations (abiotic and biotic constraints, temporal oscillations and mutants). The expression of 11 AQPs was never or poorly regulated, regardless of the breadth of their expression domain or their expression level. Our work highlighted that PtTIP1;4 was the most responsive gene of the AQP family. A high functional divergence between gene duplicates was detected across species and in response to tested cues, except for the root-expressed PtTIP2;3/PtTIP2;4 pair exhibiting 80% convergent responses. Our meta-analysis assessed key features of aquaporin expression which had remained hidden in single experiments, such as expression breadth, response specificity and genotype and environment interactions. By consolidating expression profiles using independent experimental series, we showed that the large expansion of the AQP family in poplar was accompanied by a strong divergence of gene expression, even if some cases of functional redundancy could be suspected.
Affiliation(s)
- David Cohen
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
| | - Marie-Béatrice Bogeat-Triboulot
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
- * E-mail:
| | - Silvère Vialet-Chabrand
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
| | - Rémy Merret
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
| | - Pierre-Emmanuel Courty
- Zürich-Basel Plant Science Center, Botanical Institute, University of Basel, Basel, Switzerland
| | - Sébastien Moretti
- Vital-IT, SIB Swiss Institute of Bioinformatics, Quartier Sorge, bâtiment Génopode, Lausanne, Switzerland
- Department of Ecology and Evolution, bâtiment Biophore, Lausanne University, Lausanne, Switzerland
| | - François Bizet
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
| | - Agnès Guilliot
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
| | - Irène Hummel
- INRA, UMR1137 Ecologie et Ecophysiologie Forestières, Champenoux, France
- Université de Lorraine, UMR1137 Ecologie et Ecophysiologie Forestières, Faculté des Sciences, Vandœuvre-lès-Nancy, France
|
25
|
Fasold M, Binder H. Estimating RNA-quality using GeneChip microarrays. BMC Genomics 2012; 13:186. [PMID: 22583818 PMCID: PMC3519671 DOI: 10.1186/1471-2164-13-186] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 11/24/2011] [Accepted: 04/13/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microarrays are a powerful tool for transcriptome analysis. Best results are obtained using high-quality RNA samples for preparation and hybridization. Issues with RNA integrity can lead to low data quality and failure of the microarray experiment. RESULTS Microarray intensity data contains information to estimate the RNA quality of the sample. We here study the interplay of the characteristics of RNA surface hybridization with the effects of partly truncated transcripts on probe intensity. The 3'/5' intensity gradient, the basis of microarray RNA quality measures, is shown to depend on the degree of competitive binding of specific and of non-specific targets to a particular probe, on the degree of saturation of the probes with bound transcripts and on the distance of the probe from the 3'-end of the transcript. Increasing degrees of non-specific hybridization or of saturation reduce the 3'/5' intensity gradient and, if not taken into account, this leads to biased results in common quality measures for GeneChip arrays such as affyslope or the control probe intensity ratio. We also found that short probe sets near the 3'-end of the transcripts are prone to non-specific hybridization, presumably because of inaccurate positional assignment and the existence of transcript isoforms with variable 3' UTRs. Poor RNA quality is associated with a decreased amount of RNA material hybridized on the array paralleled by a decreased total signal level. Additionally, it causes a gene-specific loss of signal due to the positional bias of transcript abundance which requires an individual, gene-specific correction. We propose a new RNA quality measure that considers the hybridization mode. Graphical characteristics are introduced allowing assessment of RNA quality of each single array ('tongs plot' and 'degradation hook'). Furthermore, we suggest a method to correct for effects of RNA degradation on microarray intensities. CONCLUSIONS The presented RNA degradation measure has the best correlation with the independent RNA integrity measure RIN, and therefore presents itself as a valuable tool for quality control and even for the study of RNA degradation. When RNA degradation effects are detected in microarray experiments, a correction of the induced bias in probe intensities is advised.
Affiliation(s)
- Mario Fasold
- Interdisciplinary Center for Bioinformatics, Universität Leipzig, Haertelstr 16-18, Leipzig, D-4107, Germany
- LIFE - Leipzig Research Center for Civilization Diseases, Universität Leipzig, Leipzig, Germany
| | - Hans Binder
- Interdisciplinary Center for Bioinformatics, Universität Leipzig, Haertelstr 16-18, Leipzig, D-4107, Germany
- LIFE - Leipzig Research Center for Civilization Diseases, Universität Leipzig, Leipzig, Germany
|
26
|
Yaghoobi H, Haghipour S, Hamzeiy H, Asadi-Khiavi M. A review of modeling techniques for genetic regulatory networks. Journal of Medical Signals & Sensors 2012; 2:61-70. [PMID: 23493097 PMCID: PMC3592506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2011] [Accepted: 01/15/2012] [Indexed: 12/04/2022]
Abstract
Understanding genetic regulatory networks, discovering interactions between genes and understanding regulatory processes in a cell at the gene level are the major goals of systems biology and computational biology. Modeling gene regulatory networks and describing the actions of cells at the molecular level are used in medicine and molecular biology applications such as metabolic pathways and drug discovery. Modeling these networks is also one of the important issues in genomic signal processing. Since the advent of microarray technology, it has been possible to model these networks using time-series data. In this paper, we provide an extensive review of methods that have been used on time-series data and present the features, advantages and disadvantages of each. Also, we classify these methods according to their nature. A parallel study of these methods can lead to the discovery of new synthetic methods or improve previous methods.
Affiliation(s)
- Hanif Yaghoobi
- Department of Biomedical Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
| | - Siyamak Haghipour
- Department of Biomedical Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
| | - Hossein Hamzeiy
- Department of Pharmacology and Toxicology, School of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Masoud Asadi-Khiavi
- School of Pharmacy, Zanjan University of Medical Sciences, Zanjan, Iran. Address for correspondence: Prof. Masoud Asadi-Khiavi, School of Pharmacy, Zanjan University of Medical Sciences, Zanjan, Iran. E-mail:
|
27
|
Shanahan HP, Memon FN, Upton GJG, Harrison AP. Normalized Affymetrix expression data are biased by G-quadruplex formation. Nucleic Acids Res 2011; 40:3307-15. [PMID: 22199258 PMCID: PMC3333884 DOI: 10.1093/nar/gkr1230] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022] Open
Abstract
Probes with runs of four or more guanines (G-stacks) in their sequences can exhibit a level of hybridization that is unrelated to the expression levels of the mRNA that they are intended to measure. This is most likely caused by the formation of G-quadruplexes, where inter-probe guanines form Hoogsteen hydrogen bonds, which probes with G-stacks are capable of forming. We demonstrate that for a specific microarray data set using the Human HG_U133A Affymetrix GeneChip and RMA normalization there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalization pipelines and two, FARMS and PLIER, minimized the bias to some extent. We estimate that ∼15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The elimination of these probes from any analysis in such affected data sets outweighs the increase of noise in the signal.
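The G-stack criterion described in this abstract (a run of four or more guanines in a probe sequence) can be checked with a simple pattern match. The following is a minimal sketch, not the authors' pipeline: the function names, the example sequences, and the data layout (a dictionary mapping a probe-set ID to its probe sequences) are all illustrative assumptions.

```python
import re

# A "G-stack" as defined in the abstract: four or more consecutive guanines.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(probe_seq: str) -> bool:
    """Return True if the probe sequence contains a run of >= 4 guanines."""
    return G_STACK.search(probe_seq.upper()) is not None


def count_g_stack_probes(probe_sets: dict[str, list[str]]) -> dict[str, int]:
    """Count the G-stack probes in each probe set (hypothetical data layout)."""
    return {
        ps_id: sum(has_g_stack(seq) for seq in seqs)
        for ps_id, seqs in probe_sets.items()
    }


# Made-up short sequences for illustration only; real GeneChip probes are 25-mers.
example_sets = {
    "set_A": ["ACGTGGGGTACGT", "ACGTACGTACGT"],
    "set_B": ["ggggACGT", "TTGGGACC"],
}
counts = count_g_stack_probes(example_sets)
```

A per-probe-set count like this would be one way to flag probe sets for closer inspection, since the abstract reports that the bias grows with the number of G-stack probes in a probe set.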
Affiliation(s)
- Hugh P Shanahan
- Department of Computer Science, Royal Holloway, University of London, Egham, Surrey, TW20 0EX, UK.
|
28
|
Liu Z, Niu Y, Li C, Yang Y, Gao C. Integrating multiple microarray datasets on oral squamous cell carcinoma to reveal dysregulated networks. Head Neck 2011; 34:1789-97. [PMID: 22179951 DOI: 10.1002/hed.22013] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Accepted: 09/29/2011] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Oral squamous cell carcinoma (OSCC) is the sixth most common type of carcinoma worldwide. The pathogenic pathways involved in this cancer are mostly unknown; therefore, a better characterization of the OSCC gene expression profile would represent a considerable advance. The public availability of gene expression datasets was meant to obtain new insights on biological processes. METHODS We integrated 4 public microarray datasets on OSCC to evaluate the degree of consistency among the biological results obtained in these different studies and to identify common regulatory pathways that could be responsible for tumor growth. RESULTS Twelve altered cellular pathways implicated in OSCC and 4 genes altered in the extracellular matrix (ECM) receptor pathway were validated by quantitative real-time polymerase chain reaction (qRT-PCR). CONCLUSION Using 4 expression array datasets, we have developed a robust method for analyzing pathways altered in OSCC.
Affiliation(s)
- Zhongyu Liu
- Anal-Colorectal Surgery Institute, No. 150 Central Hospital of PLA, Luoyang, China 471031
|
29
|
Howell GR, Walton DO, King BL, Libby RT, John SWM. Datgan, a reusable software system for facile interrogation and visualization of complex transcription profiling data. BMC Genomics 2011; 12:429. [PMID: 21864367 PMCID: PMC3171729 DOI: 10.1186/1471-2164-12-429] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 06/09/2011] [Accepted: 08/24/2011] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND We introduce Glaucoma Discovery Platform (GDP), an online environment for facile visualization and interrogation of complex transcription profiling datasets for glaucoma. We also report the availability of Datgan, the suite of scripts that was developed to construct GDP. This reusable software system complements existing repositories such as NCBI GEO or EBI ArrayExpress as it allows the construction of searchable databases to maximize understanding of user-selected transcription profiling datasets. DESCRIPTION Datgan scripts were used to construct both the underlying data tables and the web interface that form GDP. GDP is populated using data from a mouse model of glaucoma. The data was generated using the DBA/2J strain, a widely used mouse model of glaucoma. The DBA/2J-Gpnmb+ strain provided a genetically matched control strain that does not develop glaucoma. We separately assessed both the retina and the optic nerve head, important tissues in glaucoma. We used hierarchical clustering to identify early molecular stages of glaucoma that could not be identified using morphological assessment of disease. GDP has two components. First, an interactive search and retrieve component provides the ability to assess gene(s) of interest in all identified stages of disease in both the retina and optic nerve head. The output is returned in graphical and tabular format with statistically significant differences highlighted for easy visual analysis. Second, a bulk download component allows lists of differentially expressed genes to be retrieved as a series of files compatible with Excel. To facilitate access to additional information available for genes of interest, GDP is linked to selected external resources including Mouse Genome Informatics and Online Mendelian Inheritance in Man (OMIM). CONCLUSION Datgan-constructed databases allow user-friendly access to datasets that involve temporally ordered stages of disease or developmental stages. Datgan and GDP are available from http://glaucomadb.jax.org/glaucoma.
|
30
|
Su Z, Li Z, Chen T, Li QZ, Fang H, Ding D, Ge W, Ning B, Hong H, Perkins RG, Tong W, Shi L. Comparing next-generation sequencing and microarray technologies in a toxicological study of the effects of aristolochic acid on rat kidneys. Chem Res Toxicol 2011; 24:1486-93. [PMID: 21834575 DOI: 10.1021/tx200103b] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Indexed: 12/15/2022]
Abstract
RNA-Seq has been increasingly used for the quantification and characterization of transcriptomes. The ongoing development of the technology promises the more accurate measurement of gene expression. However, its benefits over widely accepted microarray technologies have not been adequately assessed, especially in toxicogenomics studies. The goal of this study is to enhance the scientific community's understanding of the advantages and challenges of RNA-Seq in the quantification of gene expression by comparing analysis results from RNA-Seq and microarray data on a toxicogenomics study. A typical toxicogenomics study design was used to compare the performance of an RNA-Seq approach (Illumina Genome Analyzer II) to a microarray-based approach (Affymetrix Rat Genome 230 2.0 arrays) for detecting differentially expressed genes (DEGs) in the kidneys of rats treated with aristolochic acid (AA), a carcinogenic and nephrotoxic chemical most notably used for weight loss. We studied the comparability of the RNA-Seq and microarray data in terms of absolute gene expression, gene expression patterns, differentially expressed genes, and biological interpretation. We found that RNA-Seq was more sensitive in detecting genes with low expression levels, while similar gene expression patterns were observed for both platforms. Moreover, although the overlap of the DEGs was only 40-50%, the biological interpretation was largely consistent between the RNA-Seq and microarray data. RNA-Seq maintained a consistent biological interpretation with time-tested microarray platforms while generating more sensitive results. However, there is clearly a need for future investigations to better understand the advantages and limitations of RNA-Seq in toxicogenomics studies and environmental health research.
Affiliation(s)
- Zhenqiang Su
- ICF International at FDA's National Center for Toxicological Research, 3900 NCTR Road, Jefferson, Arkansas 72079, USA.
|
31
|
Jjingo D, Huda A, Gundapuneni M, Mariño-Ramírez L, Jordan IK. Effect of the transposable element environment of human genes on gene length and expression. Genome Biol Evol 2011; 3:259-71. [PMID: 21362639 PMCID: PMC3070429 DOI: 10.1093/gbe/evr015] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 11/17/2022] Open
Abstract
Independent lines of investigation have documented effects of both transposable elements (TEs) and gene length (GL) on gene expression. However, TE gene fractions are highly correlated with GL, suggesting that they cannot be considered independently. We evaluated the TE environment of human genes and GL jointly in an attempt to tease apart their relative effects. TE gene fractions and GL were compared with the overall level of gene expression and the breadth of expression across tissues. GL is strongly correlated with overall expression level but weakly correlated with the breadth of expression, confirming the selection hypothesis that attributes the compactness of highly expressed genes to selection for economy of transcription. However, TE gene fractions overall, and for the L1 family in particular, show stronger anticorrelations with expression level than GL, indicating that GL may not be the most important target of selection for transcriptional economy. These results suggest a specific mechanism, removal of TEs, by which highly expressed genes are selectively tuned for efficiency. MIR elements are the only family of TEs with gene fractions that show a positive correlation with tissue-specific expression, suggesting that they may provide regulatory sequences that help to control human gene expression. Consistent with this notion, MIR fractions are relatively enriched close to transcription start sites and associated with coexpression in specific sets of related tissues. Our results confirm the overall relevance of the TE environment to gene expression and point to distinct mechanisms by which different TE families may contribute to gene regulation.
Affiliation(s)
- Daudi Jjingo
- School of Biology, Georgia Institute of Technology, GA, USA
|
32
|
Noriega NC, Kohama SG, Urbanski HF. Microarray analysis of relative gene expression stability for selection of internal reference genes in the rhesus macaque brain. BMC Mol Biol 2010; 11:47. [PMID: 20565976 PMCID: PMC2914640 DOI: 10.1186/1471-2199-11-47] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 03/09/2010] [Accepted: 06/21/2010] [Indexed: 12/18/2022] Open
Abstract
Background Normalization of gene expression data refers to the comparison of expression values using reference standards that are consistent across all conditions of an experiment. In PCR studies, genes designated as "housekeeping genes" have been used as internal reference genes under the assumption that their expression is stable and independent of experimental conditions. However, verification of this assumption is rarely performed. Here we assess the use of gene microarray analysis to facilitate selection of internal reference sequences with higher expression stability across experimental conditions than can be expected using traditional selection methods. We recently demonstrated that relative gene expression from qRT-PCR data normalized using GAPDH, ALG9 and RPL13A expression values mirrored relative expression using quantile normalization in Robust Multichip Analysis (RMA) on the Affymetrix® GeneChip® rhesus Macaque Genome Array. Having shown that qRT-PCR and Affymetrix® GeneChip® data from the same hormone replacement therapy (HRT) study yielded concordant results, we used quantile-normalized gene microarray data to identify the most stably expressed among probe sets for prospective internal reference genes across three brain regions from the HRT study and an additional study of normally menstruating rhesus macaques (cycle study). Gene selection was limited to 575 previously published human "housekeeping" genes. Twelve animals were used per study, and three brain regions were analyzed from each animal. Gene expression stabilities were determined using geNorm, NormFinder and BestKeeper software packages. Results Sequences co-annotated for ribosomal protein S27a (RPS27A), and ubiquitin were among the most stably expressed under all conditions and selection criteria used for both studies. Higher annotation quality on the human GeneChip® facilitated more targeted analysis than could be accomplished using the rhesus GeneChip®. 
In the cycle study, multiple probe sets annotated for actin, gamma 1 (ACTG1) showed high signal intensity and were among the most stably expressed. Conclusions Using gene microarray analysis, we identified genes showing high expression stability under various sex-steroid environments in different regions of the rhesus macaque brain. Use of quantile-normalized microarray gene expression values represents an improvement over traditional methods of selecting internal reference genes for PCR analysis.
Affiliation(s)
- Nigel C Noriega
- Division of Neuroscience, Oregon National Primate Research Center, 505 NW 185th Avenue, Beaverton, OR 97006, USA.
|
33
|
Ballester B, Johnson N, Proctor G, Flicek P. Consistent annotation of gene expression arrays. BMC Genomics 2010; 11:294. [PMID: 20459806 PMCID: PMC2894801 DOI: 10.1186/1471-2164-11-294] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 07/20/2009] [Accepted: 05/11/2010] [Indexed: 02/03/2023] Open
Abstract
Background Gene expression arrays are valuable and widely used tools for biomedical research. Today's commercial arrays attempt to measure the expression level of all of the genes in the genome. Effectively translating the results from the microarray into a biological interpretation requires an accurate mapping between the probesets on the array and the genes that they are targeting. Although major array manufacturers provide annotations of their gene expression arrays, the methods used by various manufacturers are different and the annotations are difficult to keep up to date in the rapidly changing world of biological sequence databases. Results We have created a consistent microarray annotation protocol applicable to all of the major array manufacturers. We constantly keep our annotations updated with the latest Ensembl Gene predictions, and thus cross-referenced with a large number of external biomedical sequence database identifiers. We show that these annotations are accurate and address in detail reasons for the minority of probesets that cannot be annotated. Annotations are publicly accessible through the Ensembl Genome Browser and programmatically through the Ensembl Application Programming Interface. They are also seamlessly integrated into the BioMart data-mining tool and the biomaRt package of BioConductor. Conclusions Consistent, accurate and updated gene expression array annotations remain critical for biological research. Our annotations facilitate accurate biological interpretation of gene expression profiles.
Affiliation(s)
- Benoît Ballester
- European Bioinformatics Institute EMBL, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
34
|
Rata IA, Li Y, Jakobsson E. Backbone statistical potential from local sequence-structure interactions in protein loops. J Phys Chem B 2010; 114:1859-69. [PMID: 20070091 DOI: 10.1021/jp909874g] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Native proteins have been optimized by evolution simultaneously for structure and sequence. Structural databases reflect this interdependency. In this paper, we present a new statistical potential for a reduced backbone representation that has both structure and sequence characteristics as variables. We use information from structural data available in the Protein Coil Library, selected on the basis of resolution and refinement factor. In these structures, the nonlocal interactions are randomly distributed and, thus, average out in statistics, so structural propensities due to local backbone-based interactions can be studied separately. We collect data in the form of local sequence-specific phi-psi backbone dihedral pairs. From these data, we construct dihedral probability density functions (DPDFs) that quantify any adjacent phi-psi pair distribution in the context of all possible combinations of local residue types. We use a probabilistic analysis to deduce how the correlations encoded in the various DPDFs as well as in residue frequencies propagate along the sequence and can be cumulated in a statistical potential capable of efficiently scoring a loop by its backbone conformation and sequence only. Our potential is able to identify with high accuracy the native structure of a loop with a given sequence among possible alternative conformations from sets of well-constructed decoys. Conversely, the potential can also be used for sequence prediction problems and is shown to score the native sequence of a given loop structure among the most fit of the possible sequence combinations. Applications for both structure prediction and sequence design are discussed.
Affiliation(s)
- Ionel A Rata
- Department of Molecular and Integrative Physiology, UIUC Program in Biophysics, National Center for Supercomputing Applications, and Beckman Institute, University of Illinois, Urbana, Illinois 61801, USA.
|
35
|
Thorrez L, Tranchevent LC, Chang HJ, Moreau Y, Schuit F. Detection of novel 3' untranslated region extensions with 3' expression microarrays. BMC Genomics 2010; 11:205. [PMID: 20346121 PMCID: PMC2858751 DOI: 10.1186/1471-2164-11-205] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/20/2009] [Accepted: 03/26/2010] [Indexed: 11/24/2022] Open
Abstract
Background The 3' untranslated regions (UTRs) of transcripts are not well characterized for many genes and often extend beyond the annotated regions. Since Affymetrix 3' expression arrays were designed based on expressed sequence tags, many probesets map to intergenic regions downstream of genes. We used expression information from these probesets to predict transcript extension beyond currently known boundaries. Results Based on our dataset encompassing expression in 22 different murine tissues, we identified 845 genes with predicted 3'UTR extensions. These extensions show conservation similar to that of known 3'UTRs, and distinctly higher than that of intergenic regions. We verified 8 of the predictions by PCR and found all of the predicted regions to be expressed. The method can be extended to other 3' expression microarray platforms, as we demonstrate with human data. Additional confirming evidence was obtained from public paired end read data. Conclusions We show that many genes have 3'UTR regions extending beyond currently known gene regions and provide a method to identify such regions based on microarray expression data. Since 3' UTRs contain microRNA binding sites and other stability determining regions, identification of the full length 3' UTR is important to elucidate posttranscriptional regulation.
Collapse
Affiliation(s)
- Lieven Thorrez
- Gene Expression Unit, Department of Molecular Cell Biology, Katholieke Universiteit Leuven, Leuven, Belgium
|
36
|
Robinson TJ, Dinan MA, Dewhirst M, Garcia-Blanco MA, Pearson JL. SplicerAV: a tool for mining microarray expression data for changes in RNA processing. BMC Bioinformatics 2010; 11:108. [PMID: 20184770 PMCID: PMC2838864 DOI: 10.1186/1471-2105-11-108] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 09/23/2009] [Accepted: 02/25/2010] [Indexed: 12/22/2022] Open
Abstract
Background Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets. Results Data from these and other gene expression microarrays can now be mined for changes in transcript isoform abundance using a program described here, SplicerAV. Using in vivo and in vitro breast cancer microarray datasets, SplicerAV was able to perform both gene and isoform specific expression profiling within the same microarray dataset. Our reanalysis of Affymetrix U133 plus 2.0 data generated by in vitro over-expression of HRAS, E2F3, beta-catenin (CTNNB1), SRC, and MYC identified several hundred oncogene-induced mRNA isoform changes, one of which recognized a previously unknown mechanism of EGFR family activation. Using clinical data, SplicerAV predicted 241 isoform changes between low and high grade breast tumors; with changes enriched among genes coding for guanyl-nucleotide exchange factors, metalloprotease inhibitors, and mRNA processing factors. Isoform changes in 15 genes were associated with aggressive cancer across the three breast cancer datasets. Conclusions Using SplicerAV, we identified several hundred previously uncharacterized isoform changes induced by in vitro oncogene over-expression and revealed a previously unknown mechanism of EGFR activation in human mammary epithelial cells. We analyzed Affymetrix GeneChip data from over 400 human breast tumors in three independent studies, making this the largest clinical dataset analyzed for en masse changes in alternative mRNA processing. 
The capacity to detect RNA isoform changes in archival microarray data using SplicerAV allowed us to carry out the first analysis of isoform specific mRNA changes directly associated with cancer survival.
Affiliation(s)
- Timothy J Robinson
- Molecular Cancer Biology Program, Duke University Medical Center, Durham, USA
|
37
|
Becanovic K, Pouladi MA, Lim RS, Kuhn A, Pavlidis P, Luthi-Carter R, Hayden MR, Leavitt BR. Transcriptional changes in Huntington disease identified using genome-wide expression profiling and cross-platform analysis. Hum Mol Genet 2010; 19:1438-52. [PMID: 20089533 DOI: 10.1093/hmg/ddq018] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Indexed: 01/10/2023]
Abstract
Evaluation of transcriptional changes in the striatum may be an effective approach to understanding the natural history of changes in expression contributing to the pathogenesis of Huntington disease (HD). We have performed genome-wide expression profiling of the YAC128 transgenic mouse model of HD at 12 and 24 months of age using two platforms in parallel: Affymetrix and Illumina. The data from these two powerful platforms were integrated to create a combined rank list, thereby revealing the identity of additional genes that proved to be differentially expressed between YAC128 and control mice. Using this approach, we identified 13 genes to be differentially expressed between YAC128 and controls which were validated by quantitative real-time PCR in independent cohorts of animals. In addition, we analyzed additional time points relevant to disease pathology: 3, 6 and 9 months of age. Here we present data showing the evolution of changes in the expression of selected genes: Wt1, Pcdh20 and Actn2 RNA levels change as early as 3 months of age, whereas Gsg1l, Sfmbt2, Acy3, Polr2a and Ppp1r9a RNA expression levels are affected later, at 12 and 24 months of age. We also analyzed the expression of these 13 genes in human HD and control brain, thereby revealing changes in SLC45A3, PCDH20, ACTN2, DDAH1 and PPP1R9A RNA expression. Further study of these genes may unravel novel pathways contributing to HD pathogenesis. DDBJ/EMBL/GenBank accession no: GSE19677.
Affiliation(s)
- Kristina Becanovic
- Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada V5Z 4H4
|
38
|
Upton GJG, Sanchez-Graillet O, Rowsell J, Arteaga-Salas JM, Graham NS, Stalteri MA, Memon FN, May ST, Harrison AP. On the causes of outliers in Affymetrix GeneChip data. Briefings in Functional Genomics and Proteomics 2009; 8:199-212. [PMID: 19734302 DOI: 10.1093/bfgp/elp027] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 02/06/2023]
Abstract
We describe various types of outliers seen in Affymetrix GeneChip data. We have been able to utilise the data in the Gene Expression Omnibus to screen GeneChips across a range of scales, from single probes, to spatially adjacent fractions of arrays, to whole arrays, to whole experiments. In this review we describe a number of causes for why some reported intensities might be misleading on GeneChips.
Affiliation(s)
- Graham J G Upton
- University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK
|
39
|
Li J, Fan Y, Chen J, Yao KT, Huang ZX. Microarray analysis of differentially expressed genes between nasopharyngeal carcinoma cell lines 5-8F and 6-10B. Cancer Genet Cytogenet 2009; 196:23-30. [PMID: 19963132 DOI: 10.1016/j.cancergencyto.2009.08.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 05/15/2009] [Revised: 07/14/2009] [Accepted: 08/02/2009] [Indexed: 12/13/2022]
Abstract
Nasopharyngeal carcinoma (NPC) cell lines 5-8F (high tumorigenic and metastatic) and 6-10B (low tumorigenic and metastatic) are subclones of SUNE1. To address their biological differences, three biologic repeats of expression microarray analysis were performed. Only 60 differently expressed genes were identified between the two cell lines. These genes were randomly distributed on all the chromosomes. Gene ontology analysis showed that most of these genes participated in cellular and metabolic processes, and the primary molecular functions of each were catalytic activity, ion binding, and protein binding. Literature mining revealed that these genes were specifically related to apoptosis, cell cycle, metastasis, chemokines, and immunoediting, but not cancer, NPC, stem cells, lymphangiogenesis, angiogenesis, inflammation, nor proliferation. In particular, 42/60 genes have established metastatic functions (P < 0.00001), while 11 out of those 42 genes formed gene networks related to metastasis (P = 0.013). Thus, the 60 genes identified by this microarray experiment most likely represent a core set of genes that comprise shared metastatic gene networks between the two cell lines and mediate their differential metastatic characteristics. Among the gene networks identified, the PTHLH gene was of particular interest. Predicted to regulate the WNT pathway through the DKK1 gene, the PTHLH gene may affect metastasis and apoptosis of NPC and merits further study.
Affiliation(s)
- Jia Li
- Cancer Institute, Southern Medical University, Guangzhou, 510515, China
|
40
|
Fulci V, Colombo T, Chiaretti S, Messina M, Citarella F, Tavolaro S, Guarini A, Foà R, Macino G. Characterization of B- and T-lineage acute lymphoblastic leukemia by integrated analysis of MicroRNA and mRNA expression profiles. Genes Chromosomes Cancer 2009; 48:1069-82. [DOI: 10.1002/gcc.20709] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Indexed: 12/13/2022]
|
41
|
Wang D, Wang C, Zhang L, Xiao H, Shen X, Ren L, Zhao W, Hong G, Zhang Y, Zhu J, Zhang M, Yang D, Ma W, Guo Z. Evaluation of cDNA microarray data by multiple clones mapping to the same transcript. OMICS: A Journal of Integrative Biology 2009; 13:493-9. [PMID: 19715395 DOI: 10.1089/omi.2009.0077] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022]
Abstract
Although novel technologies are rapidly emerging, accumulated cDNA microarray data are, and will remain, an important source for bioinformatics and biological studies. Thus, the reliability and applicability of the cDNA microarray data warrant further evaluation. In cDNA microarrays, multiple clones are measured for a transcript, which can be exploited to evaluate the consistency of microarray data. We show that even for pairs of RCs, the average Pearson correlation coefficient of their measurements is not high. However, this low consistency could largely be explained by random noise signals for a fraction of unexpressed genes and/or low signal-to-noise ratios for low abundance transcripts. Encouragingly, a large fraction of inconsistent data will be filtered out in the procedure of selecting differentially expressed genes (DEGs). Therefore, although cDNA microarray data are of low consistency, applications based on DEG selection could still reach correct biological results, especially at the functional module level.
Affiliation(s)
- Dong Wang
- School of Life Science and Bioinformatics Centre, University of Electronic Science and Technology of China, Chengdu, 610054, People's Republic of China
|
42
|
Noriega NC, Kohama SG, Urbanski HF. Gene expression profiling in the rhesus macaque: methodology, annotation and data interpretation. Methods 2009; 49:42-9. [PMID: 19467334 PMCID: PMC2739830 DOI: 10.1016/j.ymeth.2009.05.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 05/14/2009] [Accepted: 05/18/2009] [Indexed: 12/12/2022]
Abstract
Gene microarray analyses represent potentially effective means for high-throughput gene expression profiling in non-human primates. In the companion article, we emphasize effective experimental design based on the in vivo physiology of the rhesus macaque, whereas this article emphasizes considerations for gene annotation and data interpretation using gene microarray platforms from Affymetrix. Initial annotation of the rhesus genome array was based on Affymetrix human GeneChips. However, annotation revisions improve the precision with which rhesus transcripts are identified. Annotation of the rhesus GeneChip is under continuous revision with large percentages of probesets under multiple annotation systems having undergone multiple reassignments between March 2007 and November 2008. It is also important to consider that quantitation and comparison of gene expression levels across multiple chips requires appropriate normalization. External corroboration of microarray results using PCR-based methodology also requires validation of appropriate internal reference genes for normalization of expression values. Many tools are now freely available to aid investigators with microarray normalization and selection of internal reference genes to be used for independent corroboration of microarray results.
Affiliation(s)
- Nigel C Noriega
- Division of Neuroscience, Oregon National Primate Research Center, 505 NW 185th Avenue, Beaverton, OR 97006, USA.
|
43
|
Langdon WB, Upton GJG, Harrison AP. Probes containing runs of guanines provide insights into the biophysics and bioinformatics of Affymetrix GeneChips. Brief Bioinform 2009; 10:259-77. [PMID: 19359259 DOI: 10.1093/bib/bbp018] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Abstract
The reliable interpretation of Affymetrix GeneChip data is a multi-faceted problem. The interplay between biophysics, bioinformatics and mining of GeneChip surveys is leading to new insights into how best to analyse the data. Many of the molecular processes occurring on the surfaces of GeneChips result from the high surface density of probes. Interactions between neighbouring adjacent probes affect their rate and strength of hybridization to targets. Competing targets may hybridize to the same probe, and targets may partially bind to more than one probe. The formation of these partial hybrids results in a number of probes not reaching thermodynamic equilibrium during hybridization. Moreover, some targets fold up, or cross-hybridize to other targets. Furthermore, probes may fold and can undergo chemical saturation. There are also sequence-dependent differences in the rates of target desorption during the washing stage. Improvements in the mappings between probe sequence and biological databases are leading to more accurate gene expression profiles. Moreover, algorithms that combine the intensities of multiple probes into single measures of expression are increasingly dependent upon models of the hybridization processes occurring on GeneChips. The large repositories of GeneChip data can be searched for systematic effects across many experiments. This data mining has led to the discovery of a family of thousands of probes, which show correlated expression across thousands of GeneChip experiments. These probes contain runs of guanines, suggesting that G-quadruplexes are able to form on GeneChips. We discuss the impact of these structures on the interpretation of data from GeneChip experiments.
Affiliation(s)
- William B Langdon
- Department of Mathematical Sciences and Department of Biological Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK
|
44
|
Baralla A, Mentzen WI, De La Fuente A. Inferring Gene Networks: Dream or Nightmare? Ann N Y Acad Sci 2009; 1158:246-56. [DOI: 10.1111/j.1749-6632.2008.04099.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]
|
45
|
Langdon WB, Harrison AP. Evolving DNA motifs to predict GeneChip probe performance. Algorithms Mol Biol 2009; 4:6. [PMID: 19298675 PMCID: PMC2679018 DOI: 10.1186/1748-7188-4-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/20/2008] [Accepted: 03/19/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Affymetrix High Density Oligonucleotide Arrays (HDONA) simultaneously measure expression of thousands of genes using millions of probes. We use correlations between measurements for the same gene across 6685 human tissue samples from NCBI's GEO database to indicate the quality of individual HG-U133A probes. Low correlation indicates a poor probe. RESULTS Regular expressions can be automatically created from a Backus-Naur form (BNF) context-free grammar using strongly typed genetic programming. CONCLUSION The automatically produced motif is better at predicting poor DNA sequences than an existing human generated RE, suggesting runs of Cytosine and Guanine and mixtures should all be avoided.
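The probe-quality idea in this abstract (a probe that correlates poorly with the other probes for the same gene is suspect) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy intensities, and the choice of "mean Pearson correlation with the other probes in the probe set" as the score are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes neither is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_correlation_with_peers(probeset):
    """Score each probe by its average correlation with the other probes of the same gene."""
    scores = {}
    for name, values in probeset.items():
        peers = [pearson(values, other)
                 for peer, other in probeset.items() if peer != name]
        scores[name] = sum(peers) / len(peers)
    return scores


# Made-up intensities for three hypothetical probes across five samples.
toy_probeset = {
    "probe_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "probe_2": [1.1, 2.1, 2.9, 4.2, 5.1],
    "probe_3": [3.0, 1.0, 4.0, 1.0, 3.5],  # does not track the other two
}

scores = mean_correlation_with_peers(toy_probeset)
worst = min(scores, key=scores.get)  # the probe with the lowest mean correlation
```

On real data the same score would be computed per probe set across thousands of arrays; any threshold for calling a probe "poor" would have to be chosen separately.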
Affiliation(s)
- WB Langdon
- Department of Computer Science, King's College London, Strand, London, WC2R 2LS, UK
- AP Harrison
- Biological Sciences, University of Essex, Wivenhoe Park, Colchester, CO4 3SQ, UK
|
46
|
Cui X, Loraine AE. Consistency analysis of redundant probe sets on affymetrix three-prime expression arrays and applications to differential mRNA processing. PLoS One 2009; 4:e4229. [PMID: 19165320 PMCID: PMC2621337 DOI: 10.1371/journal.pone.0004229] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 06/23/2008] [Accepted: 11/11/2008] [Indexed: 11/19/2022]
Abstract
Affymetrix three-prime expression microarrays contain thousands of redundant probe sets that interrogate different regions of the same gene. Differential expression analysis methods rarely consider probe redundancy, which can lead to inaccurate inference about overall gene expression or cause investigators to overlook potentially valuable information about differential regulation of variant mRNA products. We investigated the behaviour and consistency of redundant probe sets in a publicly-available data set containing samples from mouse brain amygdala and hippocampus and asked how applying filtering methods to the data affected consistency of results obtained from redundant probe sets. A genome-based filter that screens and groups probe sets according to their overlapping genomic alignments significantly improved redundant probe set consistency. Screening based on qualitative Present-Absent calls from MAS5 also improved consistency. However, even after applying these filters, many redundant probe sets showed significant fold-change differences relative to each other, suggesting differential regulation of alternative transcript production. Visual inspection of these loci using an interactive genome visualization tool (igb.bioviz.org) exposed thirty putative examples of differential regulation of alternative splicing or polyadenylation across brain regions in mouse. This work demonstrates how P/A-call and genome-based filtering can improve consistency among redundant probe sets while at the same time exposing possible differential regulation of RNA processing pathways across sample types.
Affiliation(s)
- Xiangqin Cui
- Section on Statistical Genetics, Department of Biostatistics, University of Alabama, Birmingham, Alabama, United States of America
- Ann E. Loraine
- Department of Bioinformatics and Genomics, North Carolina Research Campus, University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
|
47
|
Vanhoutte K, de Asmundis C, Francesconi A, Figysl J, Steurs G, Boussy T, Roos M, Mueller A, Massimo L, Paparella G, Van Caelenberg K, Chierchia GB, Sarkozy A, Terradellas PBY, Zizi M. Leaving out control groups: an internal contrast analysis of gene expression profiles in atrial fibrillation patients--a systems biology approach to clinical categorization. Bioinformation 2009; 3:275-8. [PMID: 19255648 PMCID: PMC2649423 DOI: 10.6026/97320630003275] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/05/2008] [Accepted: 09/14/2008] [Indexed: 12/25/2022]
Abstract
Atrial fibrillation (AF) is a frequent chronic dysrhythmia with an incidence that increases with age (>40). Because of its medical and socio-economic impacts it is expected to become an increasing burden on most health care systems. AF is a multi-factorial disease for which the identification of subtypes is warranted. Novel approaches based on the broad concepts of systems biology may overcome the blurred notion of normal and pathological phenotype, which is inherent to high throughput molecular array analysis. Here we apply an internal contrast algorithm on AF patient data with an analytical focus on potential entry pathways into the disease. We used an RMA (Robust Multichip Average) normalized Affymetrix micro-array data set from 10 AF patients (geo_accession #GSE2240). Four series of probes were selected based on physiopathogenic links with AF entryways: apoptosis (remodeling), MAP kinase (cell remodeling), OXPHOS (ability to sustain hemodynamic workload) and glycolysis (ischemia). Annotated probe lists were polled with Bioconductor packages in R (version 2.7.1). Genetic profile contrasts were analysed with hierarchical clustering and principal component analysis. The analysis revealed distinct patient groups for all probe sets. A substantial part (54% to 67%) of the variance is explained by the first 2 principal components. Genes in PC1/2 with high discriminatory value were selected and analyzed in detail. We aim for reliable molecular stratification of AF. We show that stratification is possible based on physiologically relevant gene sets. Genes with high contrast value are likely to give pathophysiological insight into permanent AF subtypes.
Affiliation(s)
- Kurt Vanhoutte
- Faculty of Medicine and Pharmacy, Dept of Physiology, Vrije Universiteit Brussel.
|
48
|
Upton GJ, Langdon WB, Harrison AP. G-spots cause incorrect expression measurement in Affymetrix microarrays. BMC Genomics 2008; 9:613. [PMID: 19094220 PMCID: PMC2628396 DOI: 10.1186/1471-2164-9-613] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 06/19/2008] [Accepted: 12/18/2008] [Indexed: 02/05/2023]
Abstract
Background High Density Oligonucleotide arrays (HDONAs), such as the Affymetrix HG-U133A GeneChip, use sets of probes chosen to match specified genes, with the expectation that if a particular gene is highly expressed then all the probes in that gene's probe set will provide a consistent message signifying the gene's presence. However, probes that contain a G-spot (a sequence of four or more guanines) behave abnormally and it has been suggested that these probes are responding to some biochemical effect such as the formation of G-quadruplexes. Results We have tested this expectation by examining the correlation coefficients between pairs of probes using the data on thousands of arrays that are available in the NCBI Gene Expression Omnibus (GEO) repository. We confirm the finding that G-spot probes are poorly correlated with others in their probesets and reveal that, by contrast, they are highly correlated with one another. We demonstrate that the correlation is most marked when the G-spot is at the 5' end of the probe. Conclusion Since these G-spot probes generally show little correlation with the other members of their probesets they are not fit for purpose and their values should be excluded when calculating gene expression values. This has serious implications, since more than 40% of the probesets in the HG-U133A GeneChip contain at least one such probe. Future array designs should avoid these untrustworthy probes.
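The abstract defines a G-spot as a sequence of four or more guanines within a probe. A minimal sketch of how such probes could be flagged follows; the helper name and the example probe sequences are invented for illustration and are not taken from the paper or from any Affymetrix annotation file.

```python
import re

# A "G-spot" is taken here, per the abstract, to be a run of four or more guanines.
G_SPOT = re.compile(r"G{4,}")


def find_g_spot(probe_seq):
    """Return the 0-based start of the first run of >= 4 guanines, or None if absent."""
    match = G_SPOT.search(probe_seq.upper())
    return match.start() if match else None


# Made-up 25-mer sequences standing in for real probes (hypothetical identifiers).
probes = {
    "probe_a": "GGGGTACGATCGATCGTAGCTAGCT",  # G-spot at the 5' end
    "probe_b": "ACGTACGGGTACGTTAGCATCGATC",  # longest G run is 3, so no G-spot
    "probe_c": "ACGTACGTACGGGGGTAGCATCGAT",  # G-spot in the interior
}

flagged = {name: pos for name, seq in probes.items()
           if (pos := find_g_spot(seq)) is not None}
```

Because the abstract reports that the effect is most marked when the G-spot sits at the 5' end of the probe, returning the start position (rather than a plain yes/no) lets a caller treat position 0 separately if desired.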
Affiliation(s)
- Graham JG Upton
- Departments of Mathematical and Biological Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK.
|
49
|
Arakaki AK, Mezencev R, Bowen NJ, Huang Y, McDonald JF, Skolnick J. Identification of metabolites with anticancer properties by computational metabolomics. Mol Cancer 2008; 7:57. [PMID: 18559081 PMCID: PMC2453147 DOI: 10.1186/1476-4598-7-57] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 01/02/2008] [Accepted: 06/17/2008] [Indexed: 01/27/2023]
Abstract
Background Certain endogenous metabolites can influence the rate of cancer cell growth. For example, diacylglycerol, ceramides and sphingosine, NAD+ and arginine exert this effect by acting as signaling molecules, while carrying out other important cellular functions. Metabolites can also be involved in the control of cell proliferation by directly regulating gene expression in ways that are signaling pathway-independent, e.g. by direct activation of transcription factors or by inducing epigenetic processes. The fact that metabolites can affect the cancer process on so many levels suggests that the change in concentration of some metabolites that occurs in cancer cells could have an active role in the progress of the disease. Results CoMet, a fully automated Computational Metabolomics method to predict changes in metabolite levels in cancer cells compared to normal references has been developed and applied to Jurkat T leukemia cells with the goal of testing the following hypothesis: Up or down regulation in cancer cells of the expression of genes encoding for metabolic enzymes leads to changes in intracellular metabolite concentrations that contribute to disease progression. All nine metabolites predicted to be lowered in Jurkat cells with respect to lymphoblasts that were examined (riboflavin, tryptamine, 3-sulfino-L-alanine, menaquinone, dehydroepiandrosterone, α-hydroxystearic acid, hydroxyacetone, seleno-L-methionine and 5,6-dimethylbenzimidazole), exhibited antiproliferative activity that has not been reported before, while only two (bilirubin and androsterone) of the eleven tested metabolites predicted to be increased or unchanged in Jurkat cells displayed significant antiproliferative activity. 
Conclusion These results: a) demonstrate that CoMet is a valuable method to identify potential compounds for experimental validation, b) indicate that cancer cell metabolism may be regulated to reduce the intracellular concentration of certain antiproliferative metabolites, leading to uninhibited cellular growth and c) suggest that many other endogenous metabolites with important roles in carcinogenesis are awaiting discovery.
Affiliation(s)
- Adrian K Arakaki
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, Georgia, USA.
|
50
|
The use of Affymetrix GeneChips as a tool for studying alternative forms of RNA. Biochem Soc Trans 2008; 36:511-3. [DOI: 10.1042/bst0360511] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
Abstract
We are developing a computational pipeline to use surveys of Affymetrix GeneChips as a discovery tool for unravelling some of the biology associated with post-transcriptional processing of RNA. This work involves the integration of a number of bioinformatics resources, from comparing annotations to processing images to determining the structure of transcripts. The rapidly growing datasets of GeneChips available to the community puts us in a strong position to discover novel biology about post-transcriptional processing, and should enable us to determine the mechanisms by which some groups of genes make co-ordinated changes in their production of isoforms.
|