1
Zhang L, Martini GD, Rube HT, Kribelbauer JF, Rastogi C, FitzPatrick VD, Houtman JC, Bussemaker HJ, Pufall MA. SelexGLM differentiates androgen and glucocorticoid receptor DNA-binding preference over an extended binding site. Genome Res 2017; 28:111-121. [PMID: 29196557 PMCID: PMC5749176 DOI: 10.1101/gr.222844.117] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 03/14/2017] [Accepted: 11/22/2017] [Indexed: 11/28/2022]
Abstract
The DNA-binding interfaces of the androgen (AR) and glucocorticoid (GR) receptors are virtually identical, yet these transcription factors share only about a third of their genomic binding sites and regulate similarly distinct sets of target genes. To address this paradox, we determined the intrinsic specificities of the AR and GR DNA-binding domains using a refined version of SELEX-seq. We developed an algorithm, SelexGLM, that quantifies binding specificity over a large (31-bp) binding site by iteratively fitting a feature-based generalized linear model to SELEX probe counts. This analysis revealed that the DNA-binding preferences of AR and GR homodimers differ significantly, both within and outside the 15-bp core binding site. The relative preference between the two factors can be tuned over a wide range by changing the DNA sequence, with AR more sensitive to sequence changes than GR. The specificity of AR extends to the regions flanking the core 15-bp site, where isothermal calorimetry measurements reveal that affinity is augmented by enthalpy-driven readout of poly(A) sequences associated with narrowed minor groove width. We conclude that the increased specificity of AR is correlated with more enthalpy-driven binding than GR. The binding models help explain differences in AR and GR genomic binding and provide a biophysical rationale for how promiscuous binding by GR allows functional substitution for AR in some castration-resistant prostate cancers.
Affiliation(s)
- Liyang Zhang
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242, USA
- Gabriella D Martini
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA; Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA
- H Tomas Rube
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA; Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA
- Judith F Kribelbauer
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA; Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA
- Chaitanya Rastogi
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA; Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA
- Vincent D FitzPatrick
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA; Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA
- Jon C Houtman
- Department of Immunology, Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242, USA
- Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA; Department of Systems Biology, Columbia University Medical Center, New York, New York 10032, USA
- Miles A Pufall
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242, USA
2
Ballén-Taborda C, Plata G, Ayling S, Rodríguez-Zapata F, Becerra Lopez-Lavalle LA, Duitama J, Tohme J. Identification of Cassava MicroRNAs under Abiotic Stress. Int J Genomics 2013; 2013:857986. [PMID: 24328029 PMCID: PMC3845235 DOI: 10.1155/2013/857986] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 08/02/2013] [Accepted: 10/11/2013] [Indexed: 11/18/2022]
Abstract
The study of microRNAs (miRNAs) in plants has gained significant attention in recent years due to their regulatory role during development and in response to biotic and abiotic stresses. Although cassava (Manihot esculenta Crantz) is tolerant to drought and other adverse conditions, most cassava miRNAs have been predicted using bioinformatics alone or through sequencing of plants challenged by biotic stress. Here, we use high-throughput sequencing and different bioinformatics methods to identify potential cassava miRNAs expressed in different tissues subject to heat and drought conditions. We identified 60 miRNAs conserved in other plant species and 821 potential cassava-specific miRNAs. We also predicted 134 and 1002 potential target genes for these two sets of sequences. Using real-time PCR, we verified the condition-specific expression of 5 cassava small RNAs relative to a non-stress control. We also found, using publicly available expression data, a significantly lower expression of the predicted target genes of conserved and nonconserved miRNAs under drought stress compared to other cassava genes. Gene Ontology enrichment analysis, along with condition-specific expression of predicted miRNA targets, allowed us to identify several interesting miRNAs that may play a role in stress-induced posttranscriptional regulation in cassava and other plants.
Affiliation(s)
- Carolina Ballén-Taborda
- Agrobiodiversity and Biotechnology Project, International Center for Tropical Agriculture (CIAT), A.A. 6713, Cali, Colombia
- Germán Plata
- Department of Systems Biology, Columbia University, 1130 Saint Nicholas Avenue, New York, NY 10032, USA
- Sarah Ayling
- The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Fausto Rodríguez-Zapata
- Agrobiodiversity and Biotechnology Project, International Center for Tropical Agriculture (CIAT), A.A. 6713, Cali, Colombia
- Jorge Duitama
- Agrobiodiversity and Biotechnology Project, International Center for Tropical Agriculture (CIAT), A.A. 6713, Cali, Colombia
- Joe Tohme
- Agrobiodiversity and Biotechnology Project, International Center for Tropical Agriculture (CIAT), A.A. 6713, Cali, Colombia
3
Identification of macrophage genes responsive to extracellular acidification. Inflamm Res 2013; 62:399-406. [PMID: 23417272 DOI: 10.1007/s00011-013-0591-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 09/18/2012] [Revised: 12/06/2012] [Accepted: 01/02/2013] [Indexed: 01/05/2023]
Abstract
OBJECTIVE: A low pH microenvironment is a characteristic feature of inflammation loci and affects the functions of immune cells. In this study, we investigated the effect of extracellular acidification on macrophage gene expression. METHODS: RAW264.7 macrophages were incubated in neutral (pH 7.4) or acidic (pH 6.8) medium for 4 h. Global mRNA expression levels were determined using Affymetrix genechips. RESULTS: The mRNA expression levels of 353 macrophage genes were significantly modified after incubation in acidic medium; 193 were up-regulated and 160 down-regulated. Differentially regulated genes were grouped into 13 classes based on the functions of the corresponding protein products. Pathway analysis revealed that differentially expressed genes are enriched in pathways related to inflammation and immune responses. Quantitative real-time PCR analysis confirmed that the expression levels of CXCL10, CXCL14, IL-18, IL-4RA, ABCA1, CCL4, IL-7R, CXCR4, TLR7, and CCL3 mRNAs were regulated by extracellular acidification. CONCLUSION: The results of this study provide insights into the effects of acidic extracellular environments on macrophage gene expression.
4
Hettne KM, Boorsma A, van Dartel DAM, Goeman JJ, de Jong E, Piersma AH, Stierum RH, Kleinjans JC, Kors JA. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data. BMC Med Genomics 2013; 6:2. [PMID: 23356878 PMCID: PMC3572439 DOI: 10.1186/1755-8794-6-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 05/18/2012] [Accepted: 01/25/2013] [Indexed: 11/10/2022]
Abstract
Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods: We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate-corrected p-values < 0.05) of the next-gen TM-derived gene sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Results: Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions: Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.
Affiliation(s)
- Kristina M Hettne
- Department of Toxicogenomics, Maastricht University, Maastricht, The Netherlands.
5
Yu T, Bai Y. Analyzing LC/MS metabolic profiling data in the context of existing metabolic networks. ACTA ACUST UNITED AC 2012; 1:83-91. [PMID: 24010053 DOI: 10.2174/2213235x11301010084] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Metabolic profiling is the unbiased detection and quantification of low molecular-weight metabolites in a living system. It is rapidly developing in biological and translational research, contributing to disease mechanism elucidation, environmental chemical surveillance, biomarker detection, and health outcome prediction. Recent developments in experimental and computational technology allow more and more known metabolites to be detected and quantified from complex samples. As the coverage of the metabolic network improves, it has become feasible to examine metabolic profiling data from a systems perspective, i.e., interpreting the data and performing statistical inference in the context of pathways and genome-scale metabolic networks. A number of methods have recently been developed in this area, but much improvement in algorithms and databases is still needed. In this review, we survey some methods for the analysis of metabolic profiling data based on metabolic networks.
Affiliation(s)
- Tianwei Yu
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA
6
Gene expression profile reveals that STAT2 is involved in the immunosuppressive function of human bone marrow-derived mesenchymal stem cells. Gene 2012; 497:131-9. [PMID: 22523757 DOI: 10.1016/j.gene.2012.01.073] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022]
Abstract
Emerging evidence of the potent immunosuppressive activity of mesenchymal stem cells (MSCs) by modulation of both innate and adaptive immune responses enables MSCs to be developed as a promising therapeutic modality for immune-related or inflammatory diseases. However, it is not clearly understood how MSCs exert their immunosuppressive effects on immune cells under inflammatory conditions. Using human bone marrow (BM)-derived clonal MSCs (hcMSCs), we obtained and analyzed a differentially expressed gene profile when stimulated with the inflammatory cytokines interferon-γ (IFN-γ) and tumor necrosis factor-α (TNF-α) to find novel candidate factors responsible for MSC immunomodulation. Microarray analysis showed that 5650 genes were upregulated and 5862 genes were downregulated with the cutoff of 2-fold expression change. Among these, the ICOSLG and STAT2 genes were drastically upregulated 173-fold and 154-fold, respectively. Reverse transcription-polymerase chain reaction analysis confirmed the microarray data. To evaluate whether their increased expression is related to MSC-mediated immunosuppression, siRNA-induced ICOSLG- or STAT2-knockdown hcMSCs were assessed for their T cell suppressive activity. We demonstrated that STAT2 but not ICOSLG is functionally involved in the immunosuppressive activity of hcMSCs as a novel regulator under inflammatory conditions. Gene ontology and pathway analyses further support the immunomodulatory function of hcMSCs when inflammatory stimulation was provided. Taken together, this study provides an informative genome-wide gene expression profile and molecular evidence for understanding the mechanisms underlying the modulation of immune cells by human BM-derived MSCs under inflammatory conditions.
7
Capturing changes in gene expression dynamics by gene set differential coordination analysis. Genomics 2011; 98:469-77. [PMID: 21971296 DOI: 10.1016/j.ygeno.2011.09.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/10/2011] [Revised: 09/01/2011] [Accepted: 09/16/2011] [Indexed: 12/31/2022]
Abstract
Analyzing gene expression data at the gene set level greatly improves feature extraction and data interpretation. Currently most efforts in gene set analysis are focused on differential expression analysis, that is, finding gene sets whose genes show a first-order relationship with the clinical outcome. However, the regulation of the biological system is complex, and much of the change in gene expression dynamics does not manifest in the form of differential expression. At the gene set level, capturing the change in expression dynamics is difficult due to the complexity and heterogeneity of the gene sets. Here we report a systematic approach to detect gene sets that show differential coordination patterns with the rest of the transcriptome, as well as pairs of gene sets that are differentially coordinated with each other. We demonstrate that the method can identify biologically relevant gene sets, many of which do not show a first-order relationship with the clinical outcome.
8
Kerwin RE, Jimenez-Gomez JM, Fulop D, Harmer SL, Maloof JN, Kliebenstein DJ. Network quantitative trait loci mapping of circadian clock outputs identifies metabolic pathway-to-clock linkages in Arabidopsis. The Plant Cell 2011; 23:471-85. [PMID: 21343415 PMCID: PMC3077772 DOI: 10.1105/tpc.110.082065] [Citation(s) in RCA: 107] [Impact Index Per Article: 7.6] [Received: 12/08/2010] [Revised: 01/19/2011] [Accepted: 01/30/2011] [Indexed: 05/18/2023]
Abstract
Modern systems biology permits the study of complex networks, such as circadian clocks, and the use of complex methodologies, such as quantitative genetics. However, it is difficult to combine these approaches due to factorial expansion in experiments when networks are examined using complex methods. We developed a genomic quantitative genetic approach to overcome this problem, allowing us to examine the function(s) of the plant circadian clock in different populations derived from natural accessions. Using existing microarray data, we defined 24 circadian time phase groups (i.e., groups of genes with peak phases of expression at particular times of day). These groups were used to examine natural variation in circadian clock function using existing single time point microarray experiments from a recombinant inbred line population. We identified naturally variable loci that altered circadian clock outputs and linked these circadian quantitative trait loci to preexisting metabolomics quantitative trait loci, thereby identifying possible links between clock function and metabolism. Using single-gene isogenic lines, we found that circadian clock output was altered by natural variation in Arabidopsis thaliana secondary metabolism. Specifically, genetic manipulation of a secondary metabolic enzyme led to altered free-running rhythms. This represents a unique and valuable approach to the study of complex networks using quantitative genetics.
Affiliation(s)
- Rachel E. Kerwin
- Department of Plant Sciences, University of California, Davis, California 95616
- Jose M. Jimenez-Gomez
- Department of Plant Biology, University of California, Davis, California 95616
- Max Planck Institute for Plant Breeding Research, Plant Breeding and Genetics Department, 50829 Cologne, Germany
- Daniel Fulop
- Department of Plant Biology, University of California, Davis, California 95616
- Stacey L. Harmer
- Department of Plant Biology, University of California, Davis, California 95616
- Julin N. Maloof
- Department of Plant Biology, University of California, Davis, California 95616
- Daniel J. Kliebenstein
- Department of Plant Sciences, University of California, Davis, California 95616
9
Lelandais G, Devaux F. Comparative Functional Genomics of Stress Responses in Yeasts. OMICS: A Journal of Integrative Biology 2010; 14:501-15. [DOI: 10.1089/omi.2010.0029] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023]
Affiliation(s)
- Gaëlle Lelandais
- Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), INSERM UMR-S 665, Université Paris Diderot, Paris, France
- Frédéric Devaux
- Laboratoire de génomique des microorganismes, CNRS FRE3214, Université Pierre et Marie Curie, Institut des Cordeliers, Paris, France
10
Kelder T, Conklin BR, Evelo CT, Pico AR. Finding the right questions: exploratory pathway analysis to enhance biological discovery in large datasets. PLoS Biol 2010; 8. [PMID: 20824171 PMCID: PMC2930872 DOI: 10.1371/journal.pbio.1000472] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 01/08/2023]
Abstract
This Essay discusses the role of pathways for exploratory data analysis in present-day biology.
Affiliation(s)
- Thomas Kelder
- Department of Bioinformatics, BiGCaT, Maastricht University, Maastricht, The Netherlands.
11
Rabin SJ, Kim JMH, Baughn M, Libby RT, Kim YJ, Fan Y, Libby RT, La Spada A, Stone B, Ravits J. Sporadic ALS has compartment-specific aberrant exon splicing and altered cell-matrix adhesion biology. Hum Mol Genet 2009; 19:313-28. [PMID: 19864493 PMCID: PMC2796893 DOI: 10.1093/hmg/ddp498] [Citation(s) in RCA: 101] [Impact Index Per Article: 6.3] [Indexed: 12/12/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease characterized by progressive weakness from loss of motor neurons. The fundamental pathogenic mechanisms are unknown and recent evidence is implicating a significant role for abnormal exon splicing and RNA processing. Using new comprehensive genomic technologies, we studied exon splicing directly in 12 sporadic ALS and 10 control lumbar spinal cords acquired by a rapid autopsy system that processed nervous systems specifically for genomic studies. ALS patients had rostral onset and caudally advancing disease and abundant residual motor neurons in this region. We created two RNA pools, one from motor neurons collected by laser capture microdissection and one from the surrounding anterior horns. From each, we isolated RNA, amplified mRNA, profiled whole-genome exon splicing, and applied advanced bioinformatics. We employed rigorous quality control measures at all steps and validated findings by qPCR. In the motor neuron enriched mRNA pool, we found two distinct cohorts of mRNA signals, most of which were up-regulated: 148 differentially expressed genes (P ≤ 10^-3) and 411 aberrantly spliced genes (P ≤ 10^-5). The aberrantly spliced genes were highly enriched in cell adhesion (P ≤ 10^-57), especially cell–matrix as opposed to cell–cell adhesion. Most of the enriching genes encode transmembrane or secreted as opposed to nuclear or cytoplasmic proteins. The differentially expressed genes were not biologically enriched. In the anterior horn enriched mRNA pool, we could not clearly identify mRNA signals or biological enrichment. These findings, perturbed and up-regulated cell–matrix adhesion, suggest possible mechanisms for the contiguously progressive nature of motor neuron degeneration. Data deposition: GeneChip raw data (CEL-files) have been deposited for public access in the Gene Expression Omnibus (GEO), www.ncbi.nlm.nih.gov/geo, accession number GSE18920.
Affiliation(s)
- Stuart J Rabin
- Benaroya Research Institute at Virginia Mason, Seattle, WA 98101, USA
12
Microarray analysis of gene expression profile by treatment of Cinnamomi Ramulus in lipopolysaccharide-stimulated BV-2 cells. Gene 2009; 443:83-90. [DOI: 10.1016/j.gene.2009.04.024] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.6] [Received: 02/04/2009] [Revised: 04/23/2009] [Accepted: 04/28/2009] [Indexed: 01/18/2023]
13
Luo W, Friedman MS, Shedden K, Hankenson KD, Woolf PJ. GAGE: generally applicable gene set enrichment for pathway analysis. BMC Bioinformatics 2009; 10:161. [PMID: 19473525 PMCID: PMC2696452 DOI: 10.1186/1471-2105-10-161] [Citation(s) in RCA: 970] [Impact Index Per Article: 60.6] [Received: 06/26/2008] [Accepted: 05/27/2009] [Indexed: 11/11/2022]
Abstract
Background: Gene set analysis (GSA) is a widely used strategy for gene expression data analysis based on pathway knowledge. GSA focuses on sets of related genes and has established major advantages over individual gene analyses, including greater robustness, sensitivity and biological relevance. However, previous GSA methods have limited usage as they cannot handle datasets of different sample sizes or experimental designs. Results: To address these limitations, we present a new GSA method called Generally Applicable Gene-set Enrichment (GAGE). We successfully apply GAGE to multiple microarray datasets with different sample sizes, experimental designs and profiling techniques. GAGE shows significantly better results when compared to two other commonly used GSA methods, GSEA and PAGE. We demonstrate this improvement in the following three aspects: (1) consistency across repeated studies/experiments; (2) sensitivity and specificity; (3) biological relevance of the regulatory mechanisms inferred. GAGE reveals novel and relevant regulatory mechanisms from both published and previously unpublished microarray studies. From two published lung cancer data sets, GAGE derived a more cohesive and predictive mechanistic scheme underlying lung cancer progress and metastasis. For a previously unpublished BMP6 study, GAGE predicted novel regulatory mechanisms for BMP6 induced osteoblast differentiation, including the canonical BMP-TGF beta signaling, JAK-STAT signaling, Wnt signaling, and estrogen signaling pathways, all of which are supported by the experimental literature. Conclusion: GAGE is generally applicable to gene expression datasets with different sample sizes and experimental designs. GAGE consistently outperformed the two most frequently used GSA methods and inferred statistically and biologically more relevant regulatory pathways. The GAGE method is implemented in R in the "gage" package, available under the GNU GPL from .
Affiliation(s)
- Weijun Luo
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109, USA.
14
Beltrame L, Rizzetto L, Paola R, Rocca-Serra P, Gambineri L, Battaglia C, Cavalieri D. Using pathway signatures as means of identifying similarities among microarray experiments. PLoS One 2009; 4:e4128. [PMID: 19125200 PMCID: PMC2610483 DOI: 10.1371/journal.pone.0004128] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Received: 08/07/2008] [Accepted: 12/04/2008] [Indexed: 01/31/2023]
Abstract
Widespread use of microarrays has generated large amounts of data, and interrogating the public microarray repositories to identify similarities between microarray experiments is now one of the major challenges. Approaches using defined groups of genes, such as pathways and cellular networks (pathway analysis), have been proposed to improve the interpretation of microarray experiments. We propose a novel method to compare microarray experiments at the pathway level. The method consists of two steps: first, generate pathway signatures, a set of descriptors recapitulating the biologically meaningful pathways related to some clinical/biological variable of interest; second, use these signatures to interrogate microarray databases. We demonstrate that our approach provides more reliable results than gene-based approaches. While gene-based approaches tend to suffer from bias generated by the analytical procedures employed, our pathway-based method successfully groups together similar samples, independently of the experimental design. The results presented are potentially of great interest for improving the ability to query and compare experiments in public repositories of microarray data. In particular, this method can be used to retrieve data from public microarray databases and perform comparisons at the pathway level.
Affiliation(s)
- Luca Beltrame
- Department of Pharmacology, University of Firenze, Firenze, Italy
- Institute for Biomedical Technologies, National Research Council, Milano, Italy
- Lisa Rizzetto
- Department of Pharmacology, University of Firenze, Firenze, Italy
- Raffaele Paola
- Department of Pharmacology, University of Firenze, Firenze, Italy
- Cristina Battaglia
- Department of Science and Biomedical Technologies, University of Milano, Milano, Italy
- Duccio Cavalieri
- Department of Pharmacology, University of Firenze, Firenze, Italy
15
Boorsma A, Lu XJ, Zakrzewska A, Klis FM, Bussemaker HJ. Inferring condition-specific modulation of transcription factor activity in yeast through regulon-based analysis of genomewide expression. PLoS One 2008; 3:e3112. [PMID: 18769540 PMCID: PMC2518834 DOI: 10.1371/journal.pone.0003112] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Received: 04/29/2008] [Accepted: 08/07/2008] [Indexed: 11/18/2022]
Abstract
BACKGROUND: A key goal of systems biology is to understand how genomewide mRNA expression levels are controlled by transcription factors (TFs) in a condition-specific fashion. TF activity is frequently modulated at the post-translational level through ligand binding, covalent modification, or changes in sub-cellular localization. In this paper, we demonstrate how prior information about regulatory network connectivity can be exploited to infer condition-specific TF activity as a hidden variable from the genomewide mRNA expression pattern in the yeast Saccharomyces cerevisiae. METHODOLOGY/PRINCIPAL FINDINGS: We first validate experimentally that by scoring differential expression at the level of gene sets or "regulons" comprised of the putative targets of a TF, we can accurately predict modulation of TF activity at the post-translational level. Next, we create an interactive database of inferred activities for a large number of TFs across a large number of experimental conditions in S. cerevisiae. This allows us to perform TF-centric analysis of the yeast regulatory network. CONCLUSIONS/SIGNIFICANCE: We analyze the degree to which the mRNA expression level of each TF is predictive of its regulatory activity. We also organize TFs into "co-modulation networks" based on their inferred activity profile across conditions, and find that this reveals functional and mechanistic relationships. Finally, we present evidence that the PAC and rRPE motifs antagonize TBP-dependent regulation, and function as core promoter elements governed by the transcription regulator NC2. Regulon-based monitoring of TF activity modulation is a powerful tool for analyzing regulatory network function that should be applicable in other organisms. Tools and results are available online at http://bussemakerlab.org/RegulonProfiler/.
Affiliation(s)
- André Boorsma
- Swammerdam Institute for Life Sciences, University of Amsterdam, BioCentrum Amsterdam, Amsterdam, The Netherlands
- Xiang-Jun Lu
- Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Anna Zakrzewska
- Swammerdam Institute for Life Sciences, University of Amsterdam, BioCentrum Amsterdam, Amsterdam, The Netherlands
- Frans M. Klis
- Swammerdam Institute for Life Sciences, University of Amsterdam, BioCentrum Amsterdam, Amsterdam, The Netherlands
- Harmen J. Bussemaker
- Department of Biological Sciences, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
16
Ward LD, Bussemaker HJ. Predicting functional transcription factor binding through alignment-free and affinity-based analysis of orthologous promoter sequences. Bioinformatics 2008; 24:i165-71. [PMID: 18586710 PMCID: PMC2718632 DOI: 10.1093/bioinformatics/btn154] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Indexed: 11/25/2022]
Abstract
Motivation: The identification of transcription factor (TF) binding sites and the regulatory circuitry that they define is currently an area of intense research. Data from whole-genome chromatin immunoprecipitation (ChIP–chip), whole-genome expression microarrays, and sequencing of multiple closely related genomes have all proven useful. By and large, existing methods treat the interpretation of functional data as a classification problem (between bound and unbound DNA), and the analysis of comparative data as a problem of local alignment (to recover phylogenetic footprints of presumably functional elements). Both of these approaches suffer from the inability to model and detect low-affinity binding sites, which have recently been shown to be abundant and functional. Results: We have developed a method that discovers functional regulatory targets of TFs by predicting the total affinity of each promoter for those factors and then comparing that affinity across orthologous promoters in closely related species. At each promoter, we consider the minimum affinity among orthologs to be the fraction of the affinity that is functional. Because we calculate the affinity of the entire promoter, our method is independent of local alignment. By comparing with functional annotation information and gene expression data in Saccharomyces cerevisiae, we have validated that this biophysically motivated use of evolutionary conservation gives rise to dramatic improvement in prediction of regulatory connectivity and factor–factor interactions compared to the use of a single genome. We propose novel biological functions for several yeast TFs, including the factors Snt2 and Stb4, for which no function has been reported. Our affinity-based approach towards comparative genomics may allow a more quantitative analysis of the principles governing the evolution of non-coding DNA. 
Availability: The MatrixREDUCE software package is available from http://www.bussemakerlab.org/software/MatrixREDUCE Contact: Harmen.Bussemaker@columbia.edu Supplementary information: Supplementary data are available at Bioinformatics online.
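The conservation rule described in this abstract (score the whole promoter, then take the minimum across orthologs) can be illustrated with a small sketch. This is not the MatrixREDUCE implementation: the matrix format, the forward-strand-only scan, and the names `total_affinity`, `functional_affinity`, and `toy_psam` are assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical position-specific affinity matrix (PSAM): one dict per motif
# position, mapping each base to a relative affinity in (0, 1].
Psam = List[Dict[str, float]]


def total_affinity(promoter: str, psam: Psam) -> float:
    """Sum the relative affinity of every window in the promoter.

    Assumption: only the forward strand is scanned, and a window's affinity
    is the product of its per-position relative affinities.
    """
    width = len(psam)
    total = 0.0
    for start in range(len(promoter) - width + 1):
        window_affinity = 1.0
        for offset, weights in enumerate(psam):
            # Bases missing from the matrix (e.g. N) contribute zero affinity.
            window_affinity *= weights.get(promoter[start + offset], 0.0)
        total += window_affinity
    return total


def functional_affinity(orthologs: List[str], psam: Psam) -> float:
    """Take the minimum total affinity among orthologous promoters."""
    return min(total_affinity(seq, psam) for seq in orthologs)


# Toy two-position motif that prefers "AC".
toy_psam = [
    {"A": 1.0, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 1.0, "G": 0.1, "T": 0.1},
]
orthologs = ["ACAC", "ACGT", "TTTT"]
conserved = functional_affinity(orthologs, toy_psam)
```

Because each promoter is scored as a whole before the comparison, no alignment between orthologs is required; a strong site that is absent from any one ortholog lowers the minimum and is therefore not counted as functional.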
Affiliation(s)
- Lucas D Ward
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
17
Arndt PF, Vingron M. The Otto Warburg International Summer School and Workshop on Networks and Regulation. BMC Bioinformatics 2007. [PMCID: PMC1995547 DOI: 10.1186/1471-2105-8-s6-s1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]