201. Narlikar L, Gordân R, Hartemink AJ. A nucleosome-guided map of transcription factor binding sites in yeast. PLoS Comput Biol 2007; 3:e215. [PMID: 17997593 PMCID: PMC2065891 DOI: 10.1371/journal.pcbi.0030215] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.6] [Received: 06/25/2007] [Accepted: 09/20/2007] [Indexed: 11/18/2022]
Abstract
Finding functional DNA binding sites of transcription factors (TFs) throughout the genome is a crucial step in understanding transcriptional regulation. Unfortunately, these binding sites are typically short and degenerate, posing a significant statistical challenge: many more matches to known TF motifs occur in the genome than are actually functional. However, information about chromatin structure may help to identify the functional sites. In particular, it has been shown that active regulatory regions are usually depleted of nucleosomes, thereby enabling TFs to bind DNA in those regions. Here, we describe a novel motif discovery algorithm that employs an informative prior over DNA sequence positions based on a discriminative view of nucleosome occupancy. When a Gibbs sampling algorithm is applied to yeast sequence-sets identified by ChIP-chip, the correct motif is found in 52% more cases with our informative prior than with the commonly used uniform prior. This is the first demonstration that nucleosome occupancy information can be used to improve motif discovery. The improvement is dramatic, even though we are using only a statistical model to predict nucleosome occupancy; we expect our results to improve further as high-resolution genome-wide experimental nucleosome occupancy data becomes increasingly available.
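The role of a positional prior in a Gibbs-style sampler can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the likelihood ratios, and the prior values below are all hypothetical, and the sketch shows only a single sampling step in which each candidate start position is weighted by its motif-versus-background likelihood ratio multiplied by its prior.

```python
import random


def site_posterior(likelihood_ratios, prior):
    """Normalize (likelihood ratio x prior) over candidate site positions.

    likelihood_ratios[i]: hypothetical motif-vs-background ratio for a site at i.
    prior[i]: hypothetical prior weight for position i (for example, higher
    where nucleosome occupancy is predicted to be low).
    """
    weights = [lr * p for lr, p in zip(likelihood_ratios, prior)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("all positions have zero weight")
    return [w / total for w in weights]


def sample_site(likelihood_ratios, prior, rng=random):
    """Draw one site position from the posterior (one step of a sampler)."""
    post = site_posterior(likelihood_ratios, prior)
    return rng.choices(range(len(post)), weights=post, k=1)[0]


# Two equally good sequence matches (positions 0 and 1) and one poor match.
ratios = [4.0, 4.0, 1.0]
uniform = site_posterior(ratios, [1 / 3, 1 / 3, 1 / 3])
informed = site_posterior(ratios, [0.1, 0.8, 0.1])
```

With the uniform prior the two equally good matches are indistinguishable, whereas the informative prior concentrates most of the posterior mass on position 1; this is the general mechanism by which a positional prior can separate functional from non-functional matches.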
Affiliation(s)
- Leelavati Narlikar
- Department of Computer Science, Duke University, Durham, North Carolina, United States of America
- Raluca Gordân
- Department of Computer Science, Duke University, Durham, North Carolina, United States of America
- Alexander J Hartemink
- Department of Computer Science, Duke University, Durham, North Carolina, United States of America
- * To whom correspondence should be addressed. E-mail:
202. Ng JK, Ajikumar PK, Stephanopoulos G, Too HP. Profiling RNA polymerase-promoter interaction by using ssDNA-dsDNA probe on a surface addressable microarray. Chembiochem 2007; 8:1667-70. [PMID: 17705343 DOI: 10.1002/cbic.200700340] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Affiliation(s)
- Jin Kiat Ng
- MEBCS Program, Singapore-MIT Alliance, 4 Engineering Drive 3, Singapore 117576, Singapore
203. Mao G, Brody JP. Dynamic SPR monitoring of yeast nuclear protein binding to a cis-regulatory element. Biochem Biophys Res Commun 2007; 363:153-8. [PMID: 17850763 PMCID: PMC2699948 DOI: 10.1016/j.bbrc.2007.08.130] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/21/2007] [Accepted: 08/22/2007] [Indexed: 11/19/2022]
Abstract
Gene expression is controlled by protein complexes binding to short specific sequences of DNA, called cis-regulatory elements. Expression of most eukaryotic genes is controlled by dozens of these elements. Comprehensive identification and monitoring of these elements is a major goal of genomics. In pursuit of this goal, we are developing a surface plasmon resonance (SPR) based assay to identify and monitor cis-regulatory elements. To test whether we could reliably monitor protein binding to a regulatory element, we immobilized a 16 bp region of Saccharomyces cerevisiae chromosome 5 onto a gold surface. This 16 bp region of DNA is known to bind several proteins and thought to control expression of the gene RNR1, which varies through the cell cycle. We synchronized yeast cell cultures, and then sampled these cultures at a regular interval. These samples were processed to purify nuclear lysate, which was then exposed to the sensor. We found that nuclear protein binds this particular element of DNA at a significantly higher rate (as compared to unsynchronized cells) during G1 phase. Other time points show levels of DNA-nuclear protein binding similar to the unsynchronized control. We also measured the apparent association complex of the binding to be 0.014 s⁻¹. We conclude that (1) SPR-based assays can monitor DNA-nuclear protein binding and that (2) for this particular cis-regulatory element, maximum DNA-nuclear protein binding occurs during G1 phase.
Affiliation(s)
- Grace Mao
- Department of Biomedical Engineering, University of California--Irvine, Irvine, CA 92697-2715, USA
204. Chen X, Hughes TR, Morris Q. RankMotif++: a motif-search algorithm that accounts for relative ranks of K-mers in binding transcription factors. Bioinformatics 2007; 23:i72-9. [PMID: 17646348 DOI: 10.1093/bioinformatics/btm224] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022]
Abstract
MOTIVATION: The sequence specificity of DNA-binding proteins is typically represented as a position weight matrix in which each base position contributes independently to relative affinity. Assessment of the accuracy and broad applicability of this representation has been limited by the lack of extensive DNA-binding data. However, new microarray techniques, in which preferences for all possible K-mers are measured, enable a broad comparison of both motif representation and methods for motif discovery. Here, we consider the problem of accounting for all of the binding data in such experiments, rather than the highest affinity binding data. We introduce RankMotif++, an algorithm designed for finding motifs whenever sequences are associated with a semi-quantitative measure of protein-DNA-binding affinity. RankMotif++ learns motif models by maximizing the likelihood of a set of binding preferences under a probabilistic model of how sequence binding affinity translates into binding preference observations. Because RankMotif++ makes few assumptions about the relationship between binding affinity and the semi-quantitative readout, it is applicable to a wide variety of experimental assays of DNA-binding preference. RESULTS: By several criteria, RankMotif++ predicts binding affinity better than two widely used motif finding algorithms (MDScan, MatrixREDUCE) or more recently developed algorithms (PREGO, Seed and Wobble), and its performance is comparable to a motif model that separately assigns affinities to 8-mers. Our results validate the PWM model and provide an approximation of the precision and recall that can be expected in a genomic scan. AVAILABILITY: RankMotif++ is available upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
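The position weight matrix (PWM) representation mentioned in this abstract, in which each base position contributes independently, can be sketched as follows. This is a generic log-odds PWM scorer, not RankMotif++ itself; the matrix values, the uniform background, and the function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 3-position motif: per-position base probabilities
# (illustrative values only, not taken from any measured factor).
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def pwm_log_odds(kmer, pwm=PWM, background=BACKGROUND):
    """Sum independent per-position log2 odds for a K-mer of matching length."""
    if len(kmer) != len(pwm):
        raise ValueError("K-mer length must equal the PWM width")
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, kmer))


def rank_kmers(kmers):
    """Order K-mers from highest to lowest PWM score."""
    return sorted(kmers, key=pwm_log_odds, reverse=True)


ranking = rank_kmers(["TTT", "AGC", "AGT"])
```

Because the score is a sum over positions, a single mismatch lowers the score by a fixed amount regardless of the other positions; that independence assumption is what a comparison against full K-mer measurements puts to the test.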
Affiliation(s)
- Xiaoyu Chen
- Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada
205. Cho BK, Charusanti P, Herrgård MJ, Palsson BO. Microbial regulatory and metabolic networks. Curr Opin Biotechnol 2007; 18:360-4. [PMID: 17719767 DOI: 10.1016/j.copbio.2007.07.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 05/08/2007] [Accepted: 07/12/2007] [Indexed: 11/18/2022]
Abstract
Reconstruction of transcriptional regulatory and metabolic networks is the foundation of large-scale microbial systems and synthetic biology. An enormous amount of information including the annotated genomic sequences and the genomic locations of DNA-binding regulatory proteins can be used to define metabolic and regulatory networks in cells. In particular, advances in experimental methods to map regulatory networks in microbial cells have allowed reliable data-driven reconstruction of these networks. Recent work on metabolic engineering and experimental evolution of microbes highlights the key role of global regulatory networks in controlling specific metabolic processes and the need to consider the integrated function of multiple types of networks for both scientific and engineering purposes.
Affiliation(s)
- Byung-Kwan Cho
- Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093-0412, USA
206. Bussemaker HJ, Foat BC, Ward LD. Predictive modeling of genome-wide mRNA expression: from modules to molecules. Annu Rev Biophys Biomol Struct 2007; 36:329-47. [PMID: 17311525 DOI: 10.1146/annurev.biophys.36.040306.132725] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Indexed: 11/09/2022]
Abstract
Various algorithms are available for predicting mRNA expression and modeling gene regulatory processes. They differ in whether they rely on the existence of modules of coregulated genes or build a model that applies to all genes, whether they represent regulatory activities as hidden variables or as mRNA levels, and whether they implicitly or explicitly model the complex cis-regulatory logic of multiple interacting transcription factors binding the same DNA. The fact that functional genomics data of different types reflect the same molecular processes provides a natural strategy for integrative computational analysis. One promising avenue toward an accurate and comprehensive model of gene regulation combines biophysical modeling of the interactions among proteins, DNA, and RNA with the use of large-scale functional genomics data to estimate regulatory network connectivity and activity parameters. As the ability of these models to represent complex cis-regulatory logic increases, the need for approaches based on cross-species conservation may diminish.
Affiliation(s)
- Harmen J Bussemaker
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA.
207. Aurell E, d'Hérouël AF, Malmnäs C, Vergassola M. Transcription factor concentrations versus binding site affinities in the yeast S. cerevisiae. Phys Biol 2007; 4:134-43. [PMID: 17664657 DOI: 10.1088/1478-3975/4/2/006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
Transcription regulation is largely governed by the profile and the dynamics of transcription factors' binding to DNA. Stochastic effects are intrinsic to this dynamics, and the binding to functional sites must be controlled with a certain specificity for living organisms to be able to elicit specific cellular responses. Specificity stems here from the interplay between binding affinity and cellular abundance of transcription factor proteins, and the binding of such proteins to DNA is thus controlled by their chemical potential. We combine large-scale protein abundance data in the budding yeast with binding affinities for all transcription factors with known DNA binding site sequences to assess the behavior of their chemical potentials in an exponential growth phase. A sizable fraction of transcription factors is apparently bound non-specifically to DNA, and the observed abundances are marginally sufficient to ensure high occupations of the functional sites. We argue that a biological cause of this feature is related to its noise-filtering consequences: abundances below physiological levels do not yield significant binding of functional targets and mis-expressions of regulated genes may thus be tamed.
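The dependence of site occupancy on chemical potential can be illustrated with the standard two-state equilibrium form. Whether the paper uses exactly this expression is an assumption; the function name, the sign convention (lower energy means tighter binding), and the energy units are hypothetical.

```python
import math


def site_occupancy(binding_energy, chemical_potential, kT=1.0):
    """Equilibrium probability that a site is bound in a two-state model.

    Energies are in units of kT by default; lower binding_energy means
    tighter binding. Occupancy is 0.5 when the binding energy equals the
    chemical potential of the factor.
    """
    return 1.0 / (1.0 + math.exp((binding_energy - chemical_potential) / kT))


half = site_occupancy(0.0, 0.0)
strong_site = site_occupancy(-2.0, 0.0)
weak_site = site_occupancy(2.0, 0.0)
```

In this form, raising the abundance of a factor raises its chemical potential and shifts every site toward higher occupancy, so abundance and affinity jointly determine which sites are bound.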
Affiliation(s)
- Erik Aurell
- Department of Computational Biology, KTH-Royal Institute of Technology, AlbaNova University Center, Stockholm, Sweden.
208. Hallikas O, Taipale J. High-throughput assay for determining specificity and affinity of protein-DNA binding interactions. Nat Protoc 2007; 1:215-22. [PMID: 17406235 DOI: 10.1038/nprot.2006.33] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Abstract
Limited information exists for the binding specificities of many important transcription factors. To address this, we have previously developed a microwell-based assay for directly measuring the affinity of DNA-protein binding interactions. We describe here the detailed protocol for determining sequence specificities of DNA-binding proteins using this assay. The described method is rapid; after preparation of the reagents, the assay can be run in a single day, and its throughput can be increased further by automation. The method is quantitative but requires prior knowledge of one high-affinity binding site for the protein of interest. The protocol can be adapted for determining the effect of protein modifications and protein-protein interactions on DNA-binding specificity, and for engineering proteins with new DNA-binding specificities. In addition, the method is suitable for high-throughput screening to identify proteins or small molecules that modulate protein-DNA binding interactions.
Affiliation(s)
- Outi Hallikas
- Molecular/Cancer Biology Program, Institute of Biomedicine, University of Helsinki, Finland
209. Choi Y, Qin Y, Berger MF, Ballow DJ, Bulyk ML, Rajkovic A. Microarray analyses of newborn mouse ovaries lacking Nobox. Biol Reprod 2007; 77:312-9. [PMID: 17494914 DOI: 10.1095/biolreprod.107.060459] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Indexed: 12/27/2022]
Abstract
Nobox is a homeobox gene expressed in oocytes and critical in oogenesis. Nobox deficiency leads to rapid loss of postnatal oocytes. Early oocyte differentiation is poorly understood. We hypothesized that lack of Nobox perturbs global expression of genes preferentially expressed in oocytes as well as microRNAs. We compared Nobox knockout and wild-type ovaries using Affymetrix 430 2.0 microarray platform. We discovered that 28 (74%) of 38 of the genes downregulated more than 5-fold in the absence of Nobox were preferentially expressed in oocytes, whereas only 5 (15%) of 33 genes upregulated more than 5-fold in the absence of Nobox were preferentially expressed in oocytes. Protein-binding microarray helped identify nucleotide motifs that NOBOX binds and that several downregulated genes contain within putative promoter regions. MicroRNA population in newborn ovaries deficient of Nobox was largely unaffected. Genes whose proteins are predicted to be secreted but were previously unknown to be significantly expressed in early oogenesis were downregulated in Nobox knockouts and included astacin-like metalloendopeptidase (Astl), Jagged 1 (Jag1), oocyte-secreted protein 1 (Oosp1), fetuin beta (Fetub), and R-spondin 2 (Rspo2). In addition, pluripotency-associated genes Pou5f1 and Sall4 are drastically downregulated in Nobox-deficient ovaries, whereas testes-determining gene Dmrt1 is overexpressed. Our findings indicate that Nobox is likely an activator of oocyte-specific gene expression and suggest that the oocyte plays an important role in suppressing expression of male-determining genes, such as Dmrt1.
Affiliation(s)
- Youngsok Choi
- Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas 77030, USA
210. Smith JC, Lambert JP, Elisma F, Figeys D. Proteomics in 2005/2006: developments, applications and challenges. Anal Chem 2007; 79:4325-43. [PMID: 17477510 DOI: 10.1021/ac070741j] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Indexed: 12/15/2022]
Affiliation(s)
- Jeffrey C Smith
- Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ontario, Canada K1H 8M5
211. McCord RP, Berger MF, Philippakis AA, Bulyk ML. Inferring condition-specific transcription factor function from DNA binding and gene expression data. Mol Syst Biol 2007; 3:100. [PMID: 17437025 PMCID: PMC1865582 DOI: 10.1038/msb4100140] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 01/16/2007] [Accepted: 02/11/2007] [Indexed: 11/26/2022]
Abstract
Numerous genomic and proteomic datasets are permitting the elucidation of transcriptional regulatory networks in the yeast Saccharomyces cerevisiae. However, predicting the condition dependence of regulatory network interactions has been challenging, because most protein–DNA interactions identified in vivo are from assays performed in one or a few cellular states. Here, we present a novel method to predict the condition-specific functions of S. cerevisiae transcription factors (TFs) by integrating 1327 microarray gene expression data sets and either comprehensive TF binding site data from protein binding microarrays (PBMs) or in silico motif data. Importantly, our method does not impose arbitrary thresholds for calling target regions ‘bound’ or genes ‘differentially expressed’, but rather allows all the information derived from a TF binding or gene expression experiment to be considered. We show that this method can identify environmental, physical, and genetic interactions, as well as distinct sets of genes that might be activated or repressed by a single TF under particular conditions. This approach can be used to suggest conditions for directed in vivo experimentation and to predict TF function.
Affiliation(s)
- Rachel Patton McCord
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard University Graduate Biophysics Program, Cambridge, MA, USA
- Michael F Berger
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard University Graduate Biophysics Program, Cambridge, MA, USA
- Anthony A Philippakis
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard University Graduate Biophysics Program, Cambridge, MA, USA
- Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, MA, USA
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Harvard University Graduate Biophysics Program, Cambridge, MA, USA
- Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, MA, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Harvard Medical School New Research Building, Room 466D, 77 Avenue Louis Pasteur, Boston, MA 02115, USA. Tel.: +1 617 525 4725; Fax: +1 617 525 4705;
212. Morozov AV, Siggia ED. Connecting protein structure with predictions of regulatory sites. Proc Natl Acad Sci U S A 2007; 104:7068-73. [PMID: 17438293 PMCID: PMC1855371 DOI: 10.1073/pnas.0701356104] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Indexed: 11/18/2022]
Abstract
A common task posed by microarray experiments is to infer the binding site preferences for a known transcription factor from a collection of genes that it regulates and to ascertain whether the factor acts alone or in a complex. The converse problem can also be posed: Given a collection of binding sites, can the regulatory factor or complex of factors be inferred? Both tasks are substantially facilitated by using relatively simple homology models for protein-DNA interactions, as well as the rapidly expanding protein structure database. For budding yeast, we are able to construct reliable structural models for 67 transcription factors and with them redetermine factor binding sites by using a Bayesian Gibbs sampling algorithm and an extensive protein localization data set. For 49 factors in common with a prior analysis of this data set (based largely on phylogenetic conservation), we find that half of the previously predicted binding motifs are in need of some revision. We also solve the inverse problem of ascertaining the factors from the binding sites by assigning a correct protein fold to 25 of the 49 cases from a previous study. Our approach is easily extended to other organisms, including higher eukaryotes. Our study highlights the utility of enlarging current structural genomics projects that exhaustively sample fold structure space to include all factors with significantly different DNA-binding specificities.
Affiliation(s)
- Alexandre V Morozov
- Center for Studies in Physics and Biology, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA.
213. Hu Y, Rolfs A, Bhullar B, Murthy TVS, Zhu C, Berger MF, Camargo AA, Kelley F, McCarron S, Jepson D, Richardson A, Raphael J, Moreira D, Taycher E, Zuo D, Mohr S, Kane MF, Williamson J, Simpson A, Bulyk ML, Harlow E, Marsischky G, Kolodner RD, LaBaer J. Approaching a complete repository of sequence-verified protein-encoding clones for Saccharomyces cerevisiae. Genome Res 2007; 17:536-43. [PMID: 17322287 PMCID: PMC1832101 DOI: 10.1101/gr.6037607] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.0] [Received: 10/13/2006] [Accepted: 01/03/2007] [Indexed: 01/21/2023]
Abstract
The availability of an annotated genome sequence for the yeast Saccharomyces cerevisiae has made possible the proteome-scale study of protein function and protein-protein interactions. These studies rely on availability of cloned open reading frame (ORF) collections that can be used for cell-free or cell-based protein expression. Several yeast ORF collections are available, but their use and data interpretation can be hindered by reliance on now out-of-date annotations, the inflexible presence of N- or C-terminal tags, and/or the unknown presence of mutations introduced during the cloning process. High-throughput biochemical and genetic analyses would benefit from a "gold standard" (fully sequence-verified, high-quality) ORF collection, which allows for high confidence in and reproducibility of experimental results. Here, we describe Yeast FLEXGene, a S. cerevisiae protein-coding clone collection that covers over 5000 predicted protein-coding sequences. The clone set covers 87% of the current S. cerevisiae genome annotation and includes full sequencing of each ORF insert. Availability of this collection makes possible a wide variety of studies from purified proteins to mutation suppression analysis, which should contribute to a global understanding of yeast protein function.
Affiliation(s)
- Yanhui Hu
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Andreas Rolfs
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Bhupinder Bhullar
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Tellamraju V. S. Murthy
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Cong Zhu
- Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
- Michael F. Berger
- Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
- Harvard University Graduate Biophysics Program, Cambridge, Massachusetts 02138, USA
- Fontina Kelley
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Seamus McCarron
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Daniel Jepson
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Aaron Richardson
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Jacob Raphael
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Donna Moreira
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Elena Taycher
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Dongmei Zuo
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Stephanie Mohr
- DF/HCC DNA Resource Core, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Michael F. Kane
- Ludwig Institute for Cancer Research, University of California San Diego, School of Medicine, La Jolla, California 92093, USA
- Janice Williamson
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Andrew Simpson
- Ludwig Institute for Cancer Research, New York, New York 10158, USA
- Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
- Harvard University Graduate Biophysics Program, Cambridge, Massachusetts 02138, USA
- Department of Pathology, Brigham & Women’s Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
- Harvard-MIT Division of Health Sciences & Technology (HST), Harvard Medical School, Boston, Massachusetts 02115, USA
- Edward Harlow
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Gerald Marsischky
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- Richard D. Kolodner
- Ludwig Institute for Cancer Research, University of California San Diego, School of Medicine, La Jolla, California 92093, USA
- Joshua LaBaer
- Harvard Institute of Proteomics, Harvard Medical School, Cambridge, Massachusetts 02141, USA
- DF/HCC DNA Resource Core, Harvard Medical School, Cambridge, Massachusetts 02141, USA
214. Hou P, Chen Z, Ji M, He N, Lu Z. Real-time PCR Assay for Ultrasensitive Quantification of DNA-Binding Proteins. Clin Chem 2007; 53:581-6. [PMID: 17289804 DOI: 10.1373/clinchem.2006.077503] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Abstract
Background: The specific binding of proteins to DNA is a key step for many cellular activities, such as transcription regulation, DNA replication, recombination, repair, and restriction. The detection of DNA-binding proteins, as well as the identification of specific binding sites, is therefore important to understand gene expression mechanisms and cellular function. We describe an ultrasensitive method for quantification of DNA-binding proteins.
Methods: We combined the common exonuclease III (ExoIII) footprinting assay and real-time PCR for quantification of DNA-binding proteins, for an assay that does not require antibodies against the target proteins. Double-strand DNA probes were designed to monitor the activities of DNA-binding protein. The protein-binding site is at the 5′ end of the forward primer. When a target protein is present, it will specifically bind to the protein-binding site and produce a physical hindrance to ExoIII, which protects the reverse DNA strand from digestion by ExoIII. The remaining single-strand DNA template can be quantitatively detected by real-time PCR. Conversely, in the absence of the target protein, the naked primer regions will be degraded by ExoIII, which then cannot be amplified by real-time PCR.
Results: We detected the binding of 10 different transcription factors in crude cell extracts. The assay quantitatively detected binding at femtomolar concentrations of protein.
Conclusions: This technique is customizable and easy to establish. It has potential applications in research, medical diagnosis, and drug discovery.
Affiliation(s)
- Peng Hou
- Chien-Shiung Wu Laboratory, Department of Biological Science and Medical Engineering, Southeast University, Nanjing, China
215. Zhang L, Kasif S, Cantor CR. Quantifying DNA-protein binding specificities by using oligonucleotide mass tags and mass spectroscopy. Proc Natl Acad Sci U S A 2007; 104:3061-6. [PMID: 17360609 PMCID: PMC1805538 DOI: 10.1073/pnas.0611075104] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 02/08/2023]
Abstract
The ability to determine the relative binding affinity of different transcription factors (TFs) to their DNA binding sites is fundamentally important for a comprehensive understanding of gene regulation. Here we present a general approach for multiplex quantification of DNA-TF binding specificities in vitro using oligonucleotide mass tag (OMT) labeling and mass spectroscopic quantification. An OMT is a short nucleic acid sequence with a distinct mass that can be resolved by a mass spectrometer. Each putative binding sequence is labeled with a unique OMT, and PCR amplification of OMTs is performed after removing nonbound DNA. Subsequently, a primer extension reaction is carried out, and the extension products are quantified by MALDI-TOF mass spectroscopy. Using the TF NF-kappaB P50, we have quantified the binding specificities of up to 15 binding sequences in a single assay. The results from the multiplex assay are consistent with data from the traditional gel shift assay. The approach allows the competitive binding of multiple DNA sequences to the given protein in a homogeneous reaction. By using the commercially available homogeneous MassEXTEND platform (SEQUENOM), it is scalable for high-throughput DNA-TF binding applications, including genome-wide TF binding site mapping and analyses of SNPs in promoter regions.
Affiliation(s)
- Lingang Zhang
- Center for Advanced Biotechnology
- Department of Biomedical Engineering
- Simon Kasif
- Department of Biomedical Engineering
- Bioinformatics Program
- Center for Advanced Genomic Technology, Boston University, Boston, MA 02215
- Charles R. Cantor
- Center for Advanced Biotechnology
- Department of Biomedical Engineering
- SEQUENOM, Inc., 3595 John Hopkins Court, San Diego, CA 92121
- To whom correspondence should be addressed. E-mail:
216. Gustafsdottir SM, Schlingemann J, Rada-Iglesias A, Schallmeiner E, Kamali-Moghaddam M, Wadelius C, Landegren U. In vitro analysis of DNA-protein interactions by proximity ligation. Proc Natl Acad Sci U S A 2007; 104:3067-72. [PMID: 17360610 PMCID: PMC1805562 DOI: 10.1073/pnas.0611229104] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022]
Abstract
Protein-binding DNA sequence elements encode a variety of regulated functions of genomes. Information about such elements is currently in a state of rapid growth, but improved methods are required to characterize the sequence specificity of DNA-binding proteins. We have established an in vitro method for specific and sensitive solution-phase analysis of interactions between proteins and nucleic acids in nuclear extracts, based on the proximity ligation assay. The reagent consumption is very low, and the excellent sensitivity of the assay enables analysis of as few as 1-10 cells. We show that our results are highly reproducible, quantitative, and in good agreement with both EMSA and predictions obtained by using motif-finding software. This assay can be a valuable tool to characterize in depth the sequence specificity of DNA-binding proteins and to evaluate effects of polymorphisms in known transcription factor binding sites.
Affiliation(s)
- Sigrun M Gustafsdottir
- Rudbeck Laboratory, Department of Genetics and Pathology, Uppsala University, Dag Hammarskjöldsväg 20, SE-75185 Uppsala, Sweden.
|
217
|
Sasaki D, Kondo S, Maeda N, Gingeras TR, Hasegawa Y, Hayashizaki Y. Characteristics of oligonucleotide tiling arrays measured by hybridizing full-length cDNA clones: causes of signal variation and false positive signals. Genomics 2007; 89:541-51. [PMID: 17292583 DOI: 10.1016/j.ygeno.2006.12.013] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 07/11/2006] [Revised: 11/14/2006] [Accepted: 12/29/2006] [Indexed: 10/23/2022]
Abstract
An assessment of the hybridization characteristics of oligonucleotide tiling arrays was carried out using 162 full-length sequenced cDNA clones in spike-in experiments. The properties of array probes that influence signal intensity were investigated, and their capability in the detection of the cDNA exons was evaluated. The signal intensities detected in exonic and nonexonic genomic regions were examined by focusing on the features of probe sequences that raise or lower the level of intensity and on the causes of false positive signals found in nonexonic regions. The effectiveness of measures used in published protocols to improve the separation between signal and background intensity distributions, including the use of replicates and threshold parameterization of signal intensity, was assessed. Sensitivity and specificity in the detection of exons were measured using various sets of threshold parameters, and the effects of each parameter on the detection efficiency and the rate of false positives were evaluated. It was also demonstrated that hybridization of full-length cDNA clones is an excellent method to investigate the characteristics of oligonucleotide tiling arrays.
Affiliation(s)
- Daisuke Sasaki
- Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama, Japan
|
218
|
Maerkl SJ, Quake SR. A systems approach to measuring the binding energy landscapes of transcription factors. Science 2007; 315:233-7. [PMID: 17218526 DOI: 10.1126/science.1131007] [Citation(s) in RCA: 399] [Impact Index Per Article: 23.5] [Indexed: 01/13/2023]
Abstract
A major goal of systems biology is to predict the function of biological networks. Although network topologies have been successfully determined in many cases, the quantitative parameters governing these networks generally have not. Measuring affinities of molecular interactions in high-throughput format remains problematic, especially for transient and low-affinity interactions. We describe a high-throughput microfluidic platform that measures such properties on the basis of mechanical trapping of molecular interactions. With this platform we characterized DNA binding energy landscapes for four eukaryotic transcription factors; these landscapes were used to test basic assumptions about transcription factor binding and to predict their in vivo function.
Affiliation(s)
- Sebastian J Maerkl
- Biochemistry and Molecular Biophysics Option, California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125, USA
|
219
|
Lee I, Narayanaswamy R, Marcotte EM. 24 Bioinformatic Prediction of Yeast Gene Function. J Microbiol Methods 2007. [DOI: 10.1016/s0580-9517(06)36024-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
|
220
|
Bulyk ML. Protein binding microarrays for the characterization of DNA-protein interactions. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2007; 104:65-85. [PMID: 17290819 PMCID: PMC2727742 DOI: 10.1007/10_025] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
Abstract
A number of important cellular processes, such as transcriptional regulation, recombination, replication, repair, and DNA modification, are performed by DNA binding proteins. Of particular interest are transcription factors (TFs) which, through their sequence-specific interactions with DNA binding sites, modulate gene expression in a manner required for normal cellular growth and differentiation, and also for response to environmental stimuli. Despite their importance, the DNA binding specificities of most DNA binding proteins still remain unknown, since prior technologies aimed at identifying DNA-protein interactions have been laborious, not highly scalable, or have required limiting biological reagents. Recently a new DNA microarray-based technology, termed protein binding microarrays (PBMs), has been developed that allows rapid, high-throughput characterization of the in vitro DNA binding site sequence specificities of TFs, other DNA binding proteins, or synthetic compounds. DNA binding site data from PBMs combined with gene annotation data, comparative sequence analysis, and gene expression profiling, can be used to predict what genes are regulated by a given TF, what the functions are of a given TF and its predicted target genes, and how that TF may fit into the cell's transcriptional regulatory network.
Affiliation(s)
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham & Women's Hospital and Harvard Medical School, Harvard Medical School New Research Bldg., Room 466D, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
|
221
|
Zhu Q, Hong A, Sheng N, Zhang X, Matejko A, Jun KY, Srivannavit O, Gulari E, Gao X, Zhou X. microParaflo biochip for nucleic acid and protein analysis. Methods Mol Biol 2007; 382:287-312. [PMID: 18220239 DOI: 10.1007/978-1-59745-304-2_19] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Indexed: 05/25/2023]
Abstract
We describe in this chapter the use of oligonucleotide or peptide microarrays (arrays) based on microfluidic chips. Specifically, three major applications are presented: (1) microRNA/small RNA detection using a microRNA detection chip, (2) protein binding and function analysis using epitope, kinase substrate, or phosphopeptide chips, and (3) protein-binding analysis using oligonucleotide chips. These diverse categories of customizable arrays are based on the same biochip platform, which offers considerable flexibility in sequence design to suit a wide range of research needs. The protocols of the array applications play a critical role in obtaining high-quality and reliable results. Given the comprehensive and complex nature of the array experiments, the details presented in this chapter are intended merely as a useful reference or a starting point for researchers who are interested in genome- or proteome-scale studies of proteins and nucleic acids and their interactions.
Affiliation(s)
- Qi Zhu
- Department of Biology and Biochemistry, University of Houston, TX, USA
|
222
|
Bulyk ML. Protein binding microarrays for the characterization of DNA-protein interactions. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2007. [PMID: 17290819 DOI: 10.1007/10-025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/15/2023]
|
223
|
Kinney JB, Tkačik G, Callan CG. Precise physical models of protein-DNA interaction from high-throughput data. Proc Natl Acad Sci U S A 2006; 104:501-6. [PMID: 17197415 PMCID: PMC1766414 DOI: 10.1073/pnas.0609908104] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022] Open
Abstract
A cell's ability to regulate gene transcription depends in large part on the energy with which transcription factors (TFs) bind their DNA regulatory sites. Obtaining accurate models of this binding energy is therefore an important goal for quantitative biology. In this article, we present a principled likelihood-based approach for inferring physical models of TF-DNA binding energy from the data produced by modern high-throughput binding assays. Central to our analysis is the ability to assess the relative likelihood of different model parameters given experimental observations. We take a unique approach to this problem and show how to compute likelihood without any explicit assumptions about the noise that inevitably corrupts such measurements. Sampling possible choices for model parameters according to this likelihood function, we can then make probabilistic predictions for the identities of binding sites and their physical binding energies. Applying this procedure to previously published data on the Saccharomyces cerevisiae TF Abf1p, we find models of TF binding whose parameters are determined with remarkable precision. Evidence for the accuracy of these models is provided by an astonishing level of phylogenetic conservation in the predicted energies of putative binding sites. Results from in vivo and in vitro experiments also provide highly consistent characterizations of Abf1p, a result that contrasts with a previous analysis of the same data.
Affiliation(s)
- Justin B. Kinney
- Physics Department and Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
| | - Gašper Tkačik
- Physics Department and Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
| | - Curtis G. Callan
- Physics Department and Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544
|
224
|
Yarragudi A, Parfrey LW, Morse RH. Genome-wide analysis of transcriptional dependence and probable target sites for Abf1 and Rap1 in Saccharomyces cerevisiae. Nucleic Acids Res 2006; 35:193-202. [PMID: 17158163 PMCID: PMC1802568 DOI: 10.1093/nar/gkl1059] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Indexed: 01/24/2023] Open
Abstract
Abf1 and Rap1 are general regulatory factors (GRFs) that contribute to transcriptional activation of a large number of genes, as well as to replication, silencing and telomere structure in yeast. In spite of their widespread roles in transcription, the scope of their functional targets genome-wide has not been previously determined. Here, we use microarrays to examine the contribution of these essential GRFs to transcription genome-wide, by using ts mutants that dissociate from their binding sites at 37°C. We then combine this data with published ChIP-chip studies and motif analysis to identify probable direct targets for Abf1 and Rap1. We also identify a substantial number of genes likely to bind Rap1 or Abf1, but not affected by loss of GRF binding. Interestingly, the results strongly suggest that Rap1 can contribute to gene activation from farther upstream than can Abf1. Also, consistent with previous work, more genes that bind Abf1 are unaffected by loss of binding than those that bind Rap1. Finally, we show for several such genes that the Abf1 C-terminal region, which contains the putative activation domain, is not needed to confer this peculiar ‘memory effect’ that allows continued transcription after loss of Abf1 binding.
Affiliation(s)
- Arunadevi Yarragudi
- Laboratory of Developmental Genetics, Wadsworth Center, New York State Department of Health, Albany, NY 12201-2002, USA
| | - Laura Wegener Parfrey
- Laboratory of Developmental Genetics, Wadsworth Center, New York State Department of Health, Albany, NY 12201-2002, USA
| | - Randall H. Morse
- Laboratory of Developmental Genetics, Wadsworth Center, New York State Department of Health, Albany, NY 12201-2002, USA
- Department of Biomedical Sciences, State University of New York at Albany School of Public Health, Albany, NY 12201-2002, USA
- To whom correspondence should be addressed. Tel: +1 518 486 3116; Fax: +1 518 474 3181
|
225
|
Liu LA, Bader JS. Decoding transcriptional regulatory interactions. PHYSICA D. NONLINEAR PHENOMENA 2006; 224:174-181. [PMID: 17364011 PMCID: PMC1827156 DOI: 10.1016/j.physd.2006.09.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2023]
Abstract
Transcription factor proteins control the temporal and spatial expression of genes by binding specific regulatory elements, or motifs, in DNA. Mapping a transcription factor to its motif is an important step towards defining the structure of transcriptional regulatory networks and understanding their dynamics. The information to map a transcription factor to its DNA binding specificity is in principle contained in the protein sequence. Nevertheless, methods that map directly from protein sequence to target DNA sequence have been lacking, and generation of regulatory maps has required experimental data. Here we describe a purely computational method for predicting transcription factor binding. The method calculates the free energy of binding between a transcription factor and possible target DNA sequences using thermodynamic integration. Approximations of additivity (each DNA basepair contributes independently to the binding energy) and linear response (the DNA-protein and DNA-solvent couplings are linear in an effective reaction coordinate representing the basepair character at a specific position) make the computations feasible and can be verified by more detailed simulations. Results obtained for MAT-alpha2, a yeast homeodomain transcription factor, are in good agreement with known results. This method promises to provide a general, computationally feasible route from a genome sequence to a gene regulatory network.
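The additivity approximation mentioned in this abstract can be illustrated with a minimal Python sketch. It only shows the bookkeeping: the binding energy of a site is taken as the sum of independent per-position contributions. The matrix values, the 4-bp site width, and the function names `binding_energy` and `best_site` are hypothetical and chosen for illustration; they are not taken from Liu and Bader, whose per-base contributions come from thermodynamic integration rather than a hand-written table.

```python
# Hypothetical per-position energy contributions (arbitrary units) for a
# 4-bp site. Each dict is one position; the numbers are made up.
ENERGY_MATRIX = [
    {"A": 0.0, "C": 1.2, "G": 0.8, "T": 1.5},
    {"A": 1.0, "C": 0.0, "G": 1.4, "T": 0.9},
    {"A": 0.7, "C": 1.1, "G": 0.0, "T": 1.3},
    {"A": 1.6, "C": 0.5, "G": 1.0, "T": 0.0},
]


def binding_energy(site, matrix=ENERGY_MATRIX):
    """Sum independent per-position contributions (additivity approximation)."""
    if len(site) != len(matrix):
        raise ValueError("site length must match matrix length")
    return sum(column[base] for column, base in zip(matrix, site.upper()))


def best_site(sequence, matrix=ENERGY_MATRIX):
    """Return (energy, offset, site) for the lowest-energy window in sequence."""
    width = len(matrix)
    windows = (
        (binding_energy(sequence[i:i + width], matrix), i, sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )
    return min(windows)
```

With the illustrative matrix, `binding_energy("ACGT")` is 0.0 because each base is the lowest-energy choice at its position, and `best_site("TTACGTAA")` selects the window at offset 2.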
Affiliation(s)
| | - Joel S. Bader
- URL: www.jhubiomed.org (L. Angela Liu and Joel S. Bader)
|
226
|
Buck MJ, Lieb JD. A chromatin-mediated mechanism for specification of conditional transcription factor targets. Nat Genet 2006; 38:1446-51. [PMID: 17099712 PMCID: PMC2756100 DOI: 10.1038/ng1917] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.3] [Received: 09/11/2006] [Accepted: 10/04/2006] [Indexed: 01/17/2023]
Abstract
Organisms respond to changes in their environment, and many such responses are initiated at the level of gene transcription. Here, we provide evidence for a previously undiscovered mechanism for directing transcriptional regulators to new binding targets in response to an environmental change. We show that repressor-activator protein 1 (Rap1), a master regulator of yeast metabolism, binds to an expanded target set after glucose depletion despite decreasing protein levels and no evidence of posttranslational modification. Computational analysis predicts that proteins capable of recruiting the chromatin regulator Tup1 act to restrict the binding distribution of Rap1 in the presence of glucose. Deletion of the gene(s) encoding Tup1, recruiters of Tup1 or chromatin regulators recruited by Tup1 causes Rap1 to bind specifically and inappropriately to low-glucose targets. These data, combined with whole-genome measurements of nucleosome occupancy and Tup1 distribution, provide evidence for a mechanism of dynamic target specification that coordinates the genome-wide distribution of intermediate-affinity DNA sequence motifs with chromatin-mediated regulation of accessibility to those sites.
Affiliation(s)
- Michael J Buck
- Department of Biology and the Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
|
227
|
Roider HG, Kanhere A, Manke T, Vingron M. Predicting transcription factor affinities to DNA from a biophysical model. Bioinformatics 2006; 23:134-41. [PMID: 17098775 DOI: 10.1093/bioinformatics/btl565] [Citation(s) in RCA: 152] [Impact Index Per Article: 8.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Theoretical efforts to understand the regulation of gene expression are traditionally centered around the identification of transcription factor binding sites at specific DNA positions. More recently these efforts have been supplemented by experimental data for relative binding affinities of proteins to longer intergenic sequences. The question arises to what extent these two approaches converge. In this paper, we adopt a physical binding model to predict the relative binding affinity of a transcription factor for a given sequence. RESULTS: We find that a significant fraction of genome-wide binding data in yeast can be accounted for by simple count matrices and a physical model with only two parameters. We demonstrate that our approach is both conceptually and practically more powerful than traditional methods, which require selection of a cutoff. Our analysis yields biologically meaningful parameters, suitable for predicting relative binding affinities in the absence of experimental binding data. AVAILABILITY: The C source code for our TRAP program is freely available for non-commercial use at http://www.molgen.mpg.de/~manke/papers/TFaffinities/
Affiliation(s)
- Helge G Roider
- Max-Planck-Institute for Molecular Genetics Ihnestrasse 73, 14195 Berlin, Germany
|
228
|
Field S, Udalova I, Ragoussis J. Accuracy and reproducibility of protein-DNA microarray technology. ADVANCES IN BIOCHEMICAL ENGINEERING/BIOTECHNOLOGY 2006; 104:87-110. [PMID: 17290820 DOI: 10.1007/10_2006_035] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 01/08/2023]
Abstract
Microarray-based methods for understanding protein-DNA interactions have been developed in the last 6 years due to the need to introduce high-throughput technologies in this field. Protein-DNA microarrays utilise chips upon which a large number of DNA sequences may be printed or synthesised. Any DNA-binding protein may then be interrogated by applying either purified sample or cellular/nuclear extracts, subject to availability of a suitable detection system. Protein is simply added to the microarray slide surface, which is then washed and subjected to at least one further incubation with a labelled molecule which binds specifically to the protein of interest. The signal obtained is proportional to the level of DNA-binding protein bound to each DNA feature, enabling relative affinities to be calculated. Key factors for reproducible and accurate quantification of protein binding are: microarray surface chemistry; length of oligonucleotides; position of the binding site sequence; quality of the protein and antibodies; and hybridisation conditions.
Affiliation(s)
- Simon Field
- Wellcome Trust Centre for Human Genetics, University of Oxford, 7 Roosevelt Drive, Oxford OX3 7BN, UK
|
229
|
Walhout AJM. Unraveling transcription regulatory networks by protein-DNA and protein-protein interaction mapping. Genome Res 2006; 16:1445-54. [PMID: 17053092 DOI: 10.1101/gr.5321506] [Citation(s) in RCA: 113] [Impact Index Per Article: 6.3] [Indexed: 11/25/2022]
Abstract
Metazoan genomes contain thousands of protein-coding and noncoding RNA genes, most of which are differentially expressed, i.e., at different locations or at different times during development, function, or pathology of the organism. Differential gene expression is achieved in part by the action of regulatory transcription factors (TFs) that bind to cis-regulatory elements that are often located in or near their target genes. Each TF likely regulates many targets in the context of intricate transcription regulatory networks. Up to 10% of a genome may encode TFs, but only a handful of these have been studied in detail. Here, I will discuss the different steps involved in the mapping and analysis of transcription regulatory networks, including the identification of network nodes (TFs and their target sequences) and edges (TF-TF dimers and TF-DNA target interactions), integration with other data types, and network properties and emerging principles that provide insights into differential gene expression.
Affiliation(s)
- Albertha J M Walhout
- Program in Gene Function and Expression, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA.
|
230
|
Elnitski L, Jin VX, Farnham PJ, Jones SJM. Locating mammalian transcription factor binding sites: a survey of computational and experimental techniques. Genome Res 2006; 16:1455-64. [PMID: 17053094 DOI: 10.1101/gr.4140006] [Citation(s) in RCA: 168] [Impact Index Per Article: 9.3] [Indexed: 11/24/2022]
Abstract
Fields such as genomics and systems biology are built on the synergism between computational and experimental techniques. This type of synergism is especially important in accomplishing goals like identifying all functional transcription factor binding sites in vertebrate genomes. Precise detection of these elements is a prerequisite to deciphering the complex regulatory networks that direct tissue specific and lineage specific patterns of gene expression. This review summarizes approaches for in silico, in vitro, and in vivo identification of transcription factor binding sites. A variety of techniques useful for localized- and high-throughput analyses are discussed here, with emphasis on aspects of data generation and verification.
Affiliation(s)
- Laura Elnitski
- Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, Maryland 20878, USA.
|
231
|
Mintseris J, Eisen MB. Design of a combinatorial DNA microarray for protein-DNA interaction studies. BMC Bioinformatics 2006; 7:429. [PMID: 17018151 PMCID: PMC1635571 DOI: 10.1186/1471-2105-7-429] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 07/07/2006] [Accepted: 10/03/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Discovery of precise specificity of transcription factors is an important step on the way to understanding the complex mechanisms of gene regulation in eukaryotes. Recently, double-stranded protein-binding microarrays were developed as a potentially scalable approach to tackle transcription factor binding site identification. RESULTS: Here we present an algorithmic approach to experimental design of a microarray that allows for testing full specificity of a transcription factor binding to all possible DNA binding sites of a given length, with optimally efficient use of the array. This design is universal, works for any factor that binds a sequence motif and is not species-specific. Furthermore, simulation results show that data produced with the designed arrays is easier to analyze and would result in more precise identification of binding sites. CONCLUSION: In this study, we present a design of a double-stranded DNA microarray for protein-DNA interaction studies and show that our algorithm allows optimally efficient use of the arrays for this purpose. We believe such a design will prove useful for transcription factor binding site identification and other biological problems.
Affiliation(s)
| | - Michael B Eisen
- Department of Genome Sciences, Lawrence Berkeley National Lab, Berkeley, CA, USA
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
|
232
|
Berger MF, Philippakis AA, Qureshi AM, He FS, Estep PW, Bulyk ML. Compact, universal DNA microarrays to comprehensively determine transcription-factor binding site specificities. Nat Biotechnol 2006; 24:1429-35. [PMID: 16998473 PMCID: PMC4419707 DOI: 10.1038/nbt1246] [Citation(s) in RCA: 536] [Impact Index Per Article: 29.8] [Received: 06/06/2006] [Accepted: 07/28/2006] [Indexed: 01/09/2023]
Abstract
Transcription factors (TFs) interact with specific DNA regulatory sequences to control gene expression throughout myriad cellular processes. However, the DNA binding specificities of only a small fraction of TFs are sufficiently characterized to predict the sequences that they can and cannot bind. We present a maximally compact, synthetic DNA sequence design for protein binding microarray (PBM) experiments that represents all possible DNA sequence variants of a given length k (that is, all 'k-mers') on a single, universal microarray. We constructed such all k-mer microarrays covering all 10-base pair (bp) binding sites by converting high-density single-stranded oligonucleotide arrays to double-stranded (ds) DNA arrays. Using these microarrays we comprehensively determined the binding specificities over a full range of affinities for five TFs of different structural classes from yeast, worm, mouse and human. The unbiased coverage of all k-mers permits high-throughput interrogation of binding site preferences, including nucleotide interdependencies, at unprecedented resolution.
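One standard construction with the property described in this abstract, a sequence in which every k-mer occurs exactly once, is a de Bruijn sequence. The abstract itself does not name the construction, so the Python sketch below is an illustration under that assumption rather than the authors' design procedure; the function names `de_bruijn` and `linear_kmer_cover` are hypothetical. It uses the textbook Lyndon-word recursion and then unrolls the cyclic sequence by repeating its first k-1 letters.

```python
def de_bruijn(alphabet, k):
    """Return a cyclic de Bruijn sequence over `alphabet` for words of length k."""
    n = len(alphabet)
    a = [0] * (n * k)      # scratch buffer for the current candidate word
    sequence = []          # indices into `alphabet`, in output order

    def db(t, p):
        if t > k:
            # Emit the prefix only when its period divides k (Lyndon-word test).
            if k % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)


def linear_kmer_cover(k, alphabet="ACGT"):
    """Unroll the cyclic sequence so every k-mer appears as a linear substring."""
    cyclic = de_bruijn(alphabet, k)
    return cyclic + cyclic[:k - 1]


if __name__ == "__main__":
    seq = linear_kmer_cover(3)
    kmers = [seq[i:i + 3] for i in range(len(seq) - 2)]
    print(len(seq), len(set(kmers)))
```

For k = 3 the linear sequence has 4^3 + 2 = 66 bases and contains each of the 64 possible 3-mers exactly once. For the 10-bp sites mentioned in the abstract, the same count gives 4^10 = 1,048,576 distinct 10-mers, so a sequence of this kind would in practice have to be split across many array features.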
Affiliation(s)
- Michael F Berger
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
|
233
|
de Haan G, Gerrits A, Bystrykh L. Modern genome-wide genetic approaches to reveal intrinsic properties of stem cells. Curr Opin Hematol 2006; 13:249-53. [PMID: 16755221 DOI: 10.1097/01.moh.0000231422.00407.be] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 01/27/2023]
Abstract
PURPOSE OF REVIEW: The clinical use of hematopoietic stem cells, which produce all mature blood cell lineages in the circulation, is continuously increasing. Identification of genes and gene networks specifying either stemness or commitment will not only be of major relevance for a fundamental understanding of developmental biology, but also for the emerging fields of tissue engineering and regenerative medicine. Our appreciation of the transcriptional machinery that distinguishes stem cells from their nonstem cell progeny is, however, rudimentary. State-of-the-art genome-wide tools are now becoming available to elucidate intrinsic properties of stem cells. Here, we review recent progress that has been made in this field. RECENT FINDINGS: Approaches to study stem cell-specific genes and gene networks include genetical genomics, mRNA and microRNA expression profiling of carefully selected cells, proteomics, chromatin studies using 'ChIP-on-chip' tools, genome-wide binding site analyses for transcription factors and chromatin-remodeling proteins, and tools to study the three-dimensional organization of gene loci. It is promising to see that the combined application of these tools has resulted in the identification of multiple novel genes that regulate stem cell self-renewal. SUMMARY: Exploiting the available technology and integrating the data by translation into a dynamic model of networks, operating in all four dimensions, will be essential to fully comprehend the elusive concept of 'stemness'. It is time to harvest.
Affiliation(s)
- Gerald de Haan
- Department of Cell Biology, Stem Cell Biology, University Medical Center Groningen, Groningen, The Netherlands.
|
234
|
Hahn MW. Detecting natural selection on cis-regulatory DNA. Genetica 2006; 129:7-18. [PMID: 16955334 DOI: 10.1007/s10709-006-0029-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 07/06/2004] [Accepted: 06/25/2005] [Indexed: 10/24/2022]
Abstract
Changes in transcriptional regulation play an important role in the genetic basis for evolutionary change. Here I review a growing body of literature that seeks to determine the forces governing the non-coding regulatory sequences underlying these changes. I address the challenges present in studying natural selection without the familiar structure and regularity of protein-coding sequences, but show that most tests of neutrality that have been used for coding regions are applicable to non-coding regions, albeit with some caveats. While some experimental investment is necessary to identify heritable regulatory variation, the most basic inferences about selection require very little functional information. A growing body of research on cis-regulatory variation has uncovered all the forms of selection common to coding regions, in addition to novel forms of selection. An emerging pattern seems to be the ubiquity of local adaptation and balancing selection, possibly due to the greater freedom organisms have to fine-tune gene expression without changing protein function. It is clear from multiple single locus and whole genome studies of non-coding regulatory DNA that the effects of natural selection reach far beyond the start and stop codons.
Affiliation(s)
- Matthew W Hahn
  - Department of Biology and School of Informatics, Indiana University, Bloomington, IN, 47405, USA.
|
235
|
Zaslavsky E, Singh M. A combinatorial optimization approach for diverse motif finding applications. Algorithms Mol Biol 2006; 1:13. [PMID: 16916460 PMCID: PMC1570465 DOI: 10.1186/1748-7188-1-13] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 07/02/2006] [Accepted: 08/17/2006] [Indexed: 11/13/2022] Open
Abstract
Background: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely-studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied. Results: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances. Additionally, we give an approach for determining statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically significant motifs corresponding to either known motifs or other motifs of high conservation. Moreover, in most cases, our approach finds provably optimal solutions to the underlying optimization problem. Conclusion: Our results demonstrate that a combined graph theoretic and mathematical programming approach can be the basis for effective and powerful techniques for diverse motif finding applications.
Affiliation(s)
- Elena Zaslavsky
  - Department of Computer Science & Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Mona Singh
  - Department of Computer Science & Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
|
236
|
Ponjavic J, Lenhard B, Kai C, Kawai J, Carninci P, Hayashizaki Y, Sandelin A. Transcriptional and structural impact of TATA-initiation site spacing in mammalian core promoters. Genome Biol 2006; 7:R78. [PMID: 16916456 PMCID: PMC1779604 DOI: 10.1186/gb-2006-7-8-r78] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.8] [Received: 05/03/2006] [Revised: 06/19/2006] [Accepted: 08/17/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The TATA box, one of the most well studied core promoter elements, is associated with induced, context-specific expression. The lack of precise transcription start site (TSS) locations linked with expression information has impeded genome-wide characterization of the interaction between TATA and the pre-initiation complex. RESULTS: Using a comprehensive set of 5.66 × 10^6 sequenced 5' cDNA ends from diverse tissues mapped to the mouse genome, we found that the TATA-TSS distance is correlated with the tissue specificity of the downstream transcript. To achieve tissue-specific regulation, the TATA box position relative to the TSS is constrained to a narrow window (-32 to -29), where positions -31 and -30 are the optimal positions for achieving high tissue specificity. Slightly larger spacings can be accommodated only when there is no optimally spaced initiation signal; in contrast, the TATA box-like motifs found downstream of position -28 are generally nonfunctional. The strength of the TATA binding protein-DNA interaction plays a subordinate role to spacing in terms of tissue specificity. Furthermore, promoters with different TATA-TSS spacings have distinct features in terms of consensus sequence around the initiation site and distribution of alternative TSSs. Unexpectedly, promoters that have two dominant, consecutive TSSs are TATA depleted and have a novel GGG initiation site consensus. CONCLUSION: In this report we present the most comprehensive characterization of TATA-TSS spacing and functionality to date. The coupling of spacing to tissue specificity at the transcriptome level provides important clues as to the function of core promoters and the choice of TSS by the pre-initiation complex.
Affiliation(s)
- Jasmina Ponjavic
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
  - MRC Functional Genetics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, UK
- Boris Lenhard
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
  - Computational Biology Unit, Bergen Center for Computational Science, University of Bergen, HIB, Thormøhlensgate 55, N-5008 Bergen, Norway
- Chikatoshi Kai
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
- Jun Kawai
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
  - Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, Wako, Saitama, 351-0198, Japan
- Piero Carninci
  - Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, Wako, Saitama, 351-0198, Japan
- Yoshihide Hayashizaki
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
  - Genome Science Laboratory, Discovery and Research Institute, RIKEN Wako Institute, Wako, Saitama, 351-0198, Japan
- Albin Sandelin
  - Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, Yokohama, Kanagawa, 230-0045, Japan
|
237
|
Berger MF, Bulyk ML. Protein binding microarrays (PBMs) for rapid, high-throughput characterization of the sequence specificities of DNA binding proteins. Methods Mol Biol 2006; 338:245-60. [PMID: 16888363 PMCID: PMC2690637 DOI: 10.1385/1-59745-097-9:245] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 12/03/2022]
Abstract
DNA binding proteins play a number of key roles in cells, in processes including transcriptional regulation, recombination, genome rearrangements, and DNA replication, repair, and modification. Of particular interest are the interactions between transcription factors and their DNA binding sites, as they are an integral part of the transcriptional regulatory networks that control gene expression. Despite their importance, the DNA binding specificities of most DNA binding proteins remain unknown, as earlier technologies aimed at characterizing DNA-protein interactions have been time consuming and not highly scalable. We have developed a new DNA microarray-based technology, termed protein binding microarrays (PBMs), that allows rapid, high-throughput characterization of the in vitro DNA binding site sequence specificities of transcription factors in a single day. The resulting DNA binding site data can be used in a number of ways, including for the prediction of the genes regulated by a given transcription factor, annotation of transcription factor function, and functional annotation of the predicted target genes.
Affiliation(s)
- Michael F. Berger
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
  - Harvard University Graduate Biophysics Program, Harvard Medical School, Boston, MA 02115
- Martha L. Bulyk
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
  - Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
  - Harvard/MIT Division of Health Sciences and Technology (HST), Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115
  - Harvard University Graduate Biophysics Program, Harvard Medical School, Boston, MA 02115
  - Correspondence should be addressed to: Martha L. Bulyk, Ph.D., Harvard Medical School New Research Building, Room 466D, 77 Avenue Louis Pasteur, Boston, MA, 02115. Phone: (617) 525-4725. Fax: (617) 525-4705.
|
238
|
Deplancke B, Mukhopadhyay A, Ao W, Elewa AM, Grove CA, Martinez NJ, Sequerra R, Doucette-Stamm L, Reece-Hoyes JS, Hope IA, Tissenbaum HA, Mango SE, Walhout AJM. A gene-centered C. elegans protein-DNA interaction network. Cell 2006; 125:1193-205. [PMID: 16777607 DOI: 10.1016/j.cell.2006.04.038] [Citation(s) in RCA: 192] [Impact Index Per Article: 10.7] [Received: 11/08/2005] [Revised: 02/27/2006] [Accepted: 04/12/2006] [Indexed: 11/16/2022]
Abstract
Transcription regulatory networks consist of physical and functional interactions between transcription factors (TFs) and their target genes. The systematic mapping of TF-target gene interactions has been pioneered in unicellular systems, using "TF-centered" methods (e.g., chromatin immunoprecipitation). However, metazoan systems are less amenable to such methods. Here, we used "gene-centered" high-throughput yeast one-hybrid (Y1H) assays to identify 283 interactions between 72 C. elegans digestive tract gene promoters and 117 proteins. The resulting protein-DNA interaction (PDI) network is highly connected and enriched for TFs that are expressed in the digestive tract. We provide functional annotations for approximately 10% of all worm TFs, many of which were previously uncharacterized, and find ten novel putative TFs, illustrating the power of a gene-centered approach. We provide additional in vivo evidence for multiple PDIs and illustrate how the PDI network provides insights into metazoan differential gene expression at a systems level.
Affiliation(s)
- Bart Deplancke
  - Program in Gene Function and Expression and Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, 01605, USA
|
239
|
Chua G, Morris QD, Sopko R, Robinson MD, Ryan O, Chan ET, Frey BJ, Andrews BJ, Boone C, Hughes TR. Identifying transcription factor functions and targets by phenotypic activation. Proc Natl Acad Sci U S A 2006; 103:12045-50. [PMID: 16880382 PMCID: PMC1567694 DOI: 10.1073/pnas.0605140103] [Citation(s) in RCA: 147] [Impact Index Per Article: 8.2] [Indexed: 12/31/2022] Open
Abstract
Mapping transcriptional regulatory networks is difficult because many transcription factors (TFs) are activated only under specific conditions. We describe a generic strategy for identifying genes and pathways induced by individual TFs that does not require knowledge of their normal activation cues. Microarray analysis of 55 yeast TFs that caused a growth phenotype when overexpressed showed that the majority caused increased transcript levels of genes in specific physiological categories, suggesting a mechanism for growth inhibition. Induced genes typically included established targets and genes with consensus promoter motifs, if known, indicating that these data are useful for identifying potential new target genes and binding sites. We identified the sequence 5'-TCACGCAA as a binding sequence for Hms1p, a TF that positively regulates pseudohyphal growth and previously had no known motif. The general strategy outlined here presents a straightforward approach to discovery of TF activities and mapping targets that could be adapted to any organism with transgenic technology.
Affiliation(s)
- Gordon Chua
  - *Banting and Best Department of Medical Research, and Departments of
- Quaid D. Morris
  - *Banting and Best Department of Medical Research, and Departments of
  - Computer Science
  - Electrical and Computer Engineering, and
- Richelle Sopko
  - Medical Genetics and Microbiology, University of Toronto, 160 College Street, Toronto, ON, Canada M5S 1A8
- Mark D. Robinson
  - *Banting and Best Department of Medical Research, and Departments of
  - Electrical and Computer Engineering, and
- Owen Ryan
  - Medical Genetics and Microbiology, University of Toronto, 160 College Street, Toronto, ON, Canada M5S 1A8
- Esther T. Chan
  - Medical Genetics and Microbiology, University of Toronto, 160 College Street, Toronto, ON, Canada M5S 1A8
- Brendan J. Frey
  - *Banting and Best Department of Medical Research, and Departments of
  - Electrical and Computer Engineering, and
- Brenda J. Andrews
  - *Banting and Best Department of Medical Research, and Departments of
  - Medical Genetics and Microbiology, University of Toronto, 160 College Street, Toronto, ON, Canada M5S 1A8
- Charles Boone
  - *Banting and Best Department of Medical Research, and Departments of
  - Medical Genetics and Microbiology, University of Toronto, 160 College Street, Toronto, ON, Canada M5S 1A8
- Timothy R. Hughes
  - *Banting and Best Department of Medical Research, and Departments of
  - Medical Genetics and Microbiology, University of Toronto, 160 College Street, Toronto, ON, Canada M5S 1A8
  - To whom correspondence should be addressed. E-mail:
|
240
|
Abstract
In the post-genome era, attention has focused on the functions of genome sequences and how they are regulated. The emerging epigenomic changes and the interactions between cis-acting elements and protein factors may play a central role in gene regulation. To understand the crosstalk between DNA and protein on a genome-wide scale, one emerging technique, called ChIP-chip, takes the strategy of combining chromatin immunoprecipitation with microarray. This new high-throughput strategy helps screen the targets of critical transcription factors and profile the genome-wide distribution of histone modifications, which will enable the feasibility of conducting a large-scale study, such as the Human Epigenome Project.
Affiliation(s)
- Jiejun Wu
  - Departments of Molecular Genetics and Molecular Virology, Immunology, and Medical Genetics, Division of Human Cancer Genetics, Comprehensive Cancer Center, Ohio State University, Columbus, Ohio 43210, USA
|
241
|
Bulyk ML. DNA microarray technologies for measuring protein-DNA interactions. Curr Opin Biotechnol 2006; 17:422-30. [PMID: 16839757 PMCID: PMC2727741 DOI: 10.1016/j.copbio.2006.06.015] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.5] [Received: 04/12/2006] [Revised: 06/02/2006] [Accepted: 06/30/2006] [Indexed: 10/24/2022]
Abstract
DNA-binding proteins have key roles in many cellular processes, including transcriptional regulation and replication. Microarray-based technologies permit the high-throughput identification of binding sites and enable the functional roles of these binding proteins to be elucidated. In particular, microarray readout either of chromatin immunoprecipitated DNA-bound proteins (ChIP-chip) or of DNA adenine methyltransferase fusion proteins (DamID) enables the identification of in vivo genomic target sites of proteins. A complementary approach to analyse the in vitro binding of proteins directly to double-stranded DNA microarrays (protein binding microarrays; PBMs), permits rapid characterization of their DNA binding site sequence specificities. Recent advances in DNA microarray synthesis technologies have facilitated the definition of DNA-binding sites at much higher resolution and coverage, and advances in these and emerging technologies will further increase the efficiencies of these exciting new approaches.
Affiliation(s)
- Martha L Bulyk
  - Division of Genetics, Department of Medicine, Harvard/MIT Division of Health Sciences and Technology (HST), Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.
|
242
|
White RB, Ziman MR. A comparative analysis of shotgun-cloning and tagged-random amplification-cloning of chromatin immunoprecipitation-isolated genome fragments. Biochem Biophys Res Commun 2006; 346:479-83. [PMID: 16762317 DOI: 10.1016/j.bbrc.2006.05.145] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/18/2006] [Accepted: 05/19/2006] [Indexed: 10/24/2022]
Abstract
The cloning of transcription factor antibody-immunoprecipitated genomic fragments from chromatin immunoprecipitation (ChIP) experiments is a technically challenging procedure, especially when the input genomic DNA is isolated from whole tissues (in vivo) rather than cultured cells. Here we adapt a technique known as Tagged-Random PCR (T-PCR) to amplify ChIP-immunoprecipitated DNA from mouse embryonic tissue prior to cloning. Importantly, we then compare this technique with tandem shotgun-cloning experiments in terms of its capacity to identify target genes. We find that T-PCR dramatically increases the efficiency of cloning ChIP fragments without distortion of the relative location of cloned fragments to putative target genes. Thus, T-PCR is a simple procedure which greatly enhances the efficiency of cloning tissue-derived ChIP fragments.
Affiliation(s)
- Robert B White
  - School of Exercise, Biomedical and Health Science, Edith Cowan University, Joondalup Drive, WA, Australia
|
243
|
Abstract
Functional knowledge of individual genes encoding components of the cell signaling, metabolic and regulatory pathways is crucial to our understanding of physiology and pathophysiology. A central challenge in functional genomics is the creation of a working map delineating how eukaryotic cells coordinate and govern patterns of gene expression. This coordination is often depicted as an intertwined network or circuit of genes that alternately activate and repress each other. Multiple bioinformatic and high-throughput experimental approaches exist to aid in the reconstruction of gene networks. Albeit far from being complete, the ability to recreate gene networks from experimental data facilitates the systematic dissection of cell function at the molecular and genetic level. In this review, several different genomic technologies are discussed, and example studies that are promoting new discoveries and hypotheses are detailed.
Affiliation(s)
- Norman H Lee
  - The Institute for Genomic Research, Department of Functional Genomics, Rockville, MD 20850, USA.
|
244
|
Ho SW, Jona G, Chen CTL, Johnston M, Snyder M. Linking DNA-binding proteins to their recognition sequences by using protein microarrays. Proc Natl Acad Sci U S A 2006; 103:9940-5. [PMID: 16785442 PMCID: PMC1502558 DOI: 10.1073/pnas.0509185103] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Indexed: 11/18/2022] Open
Abstract
Analyses of whole-genome sequences and experimental data sets have revealed a large number of DNA sequence motifs that are conserved in many species and may be functional. However, methods of sufficient scale to explore the roles of these elements are lacking. We describe the use of protein arrays to identify proteins that bind to DNA sequences of interest. A microarray of 282 known and potential yeast transcription factors was produced and probed with oligonucleotides of evolutionarily conserved sequences that are potentially functional. Transcription factors that bound to specific DNA sequences were identified. One previously uncharacterized DNA-binding protein, Yjl103, was characterized in detail. We defined the binding site for this protein and identified a number of its target genes, many of which are involved in stress response and oxidative phosphorylation. Protein microarrays offer a high-throughput method for determining DNA-protein interactions.
Affiliation(s)
- Su-Wen Ho
  - *Washington University School of Medicine, 4444 Forest Park Boulevard, St. Louis, MO 63108; and
- Ghil Jona
  - Department of Molecular, Cellular, and Developmental Biology, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8103
- Christina T. L. Chen
  - *Washington University School of Medicine, 4444 Forest Park Boulevard, St. Louis, MO 63108; and
- Mark Johnston
  - *Washington University School of Medicine, 4444 Forest Park Boulevard, St. Louis, MO 63108; and
  - To whom correspondence may be addressed. E-mail:
- Michael Snyder
  - Department of Molecular, Cellular, and Developmental Biology, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8103
  - To whom correspondence may be addressed. E-mail:
|
245
|
Lieb JD, Beck S, Bulyk ML, Farnham P, Hattori N, Henikoff S, Liu XS, Okumura K, Shiota K, Ushijima T, Greally JM. Applying whole-genome studies of epigenetic regulation to study human disease. Cytogenet Genome Res 2006; 114:1-15. [PMID: 16717444 PMCID: PMC2734277 DOI: 10.1159/000091922] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.7] [Received: 09/26/2005] [Accepted: 10/06/2005] [Indexed: 12/15/2022] Open
Affiliation(s)
- J D Lieb
  - Department of Biology, Carolina Center for Genome Sciences, The University of North Carolina, Chapel Hill, NC, USA
|
246
|
Chen M, Hancock LC, Lopes JM. Transcriptional regulation of yeast phospholipid biosynthetic genes. Biochim Biophys Acta Mol Cell Biol Lipids 2006; 1771:310-21. [PMID: 16854618 DOI: 10.1016/j.bbalip.2006.05.017] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Received: 03/22/2006] [Revised: 05/30/2006] [Accepted: 05/31/2006] [Indexed: 12/26/2022]
Abstract
The last several years have been witness to significant developments in understanding transcriptional regulation of the yeast phospholipid structural genes. The response of most phospholipid structural genes to inositol is now understood on a mechanistic level. The roles of specific activators and repressors are also well established. The knowledge of specific regulatory factors that bind the promoters of phospholipid structural genes serves as a foundation for understanding the role of chromatin modification complexes. Collectively, these findings present a complex picture for transcriptional regulation of the phospholipid biosynthetic genes. The INO1 gene is an ideal example of the complexity of transcriptional control and continues to serve as a model for studying transcription in general. Furthermore, transcription of the regulatory genes is also subject to complex and essential regulation. In addition, databases resulting from a plethora of genome-wide studies have identified regulatory signals that control one of the essential phospholipid biosynthetic genes, PIS1. These databases also provide significant clues for other regulatory signals that may affect phospholipid biosynthesis. Here, we have tried to present a complete summary of the transcription factors and mechanisms that regulate the phospholipid biosynthetic genes.
Affiliation(s)
- Meng Chen
  - Department of Biological Sciences, Wayne State University, 5047 Gullen Mall, Detroit, MI 48202, USA
|
247
|
Tang D, Yuan R, Chai Y. Electrochemical Immunosensing Strategies Based on Immobilization of Anti-IgC on Mixed Self-Assembly Monolayers Carrying Surface Amide or Carboxyl Groups. Anal Lett 2006. [DOI: 10.1080/00032710600721332] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
|
248
|
Philippakis AA, Busser BW, Gisselbrecht SS, He FS, Estrada B, Michelson AM, Bulyk ML. Expression-guided in silico evaluation of candidate cis regulatory codes for Drosophila muscle founder cells. PLoS Comput Biol 2006; 2:e53. [PMID: 16733548 PMCID: PMC1464814 DOI: 10.1371/journal.pcbi.0020053] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.5] [Received: 06/23/2005] [Accepted: 04/05/2006] [Indexed: 01/20/2023] Open
Abstract
While combinatorial models of transcriptional regulation can be inferred for metazoan systems from a priori biological knowledge, validation requires extensive and time-consuming experimental work. Thus, there is a need for computational methods that can evaluate hypothesized cis regulatory codes before the difficult task of experimental verification is undertaken. We have developed a novel computational framework (termed “CodeFinder”) that integrates transcription factor binding site and gene expression information to evaluate whether a hypothesized transcriptional regulatory model (TRM; i.e., a set of co-regulating transcription factors) is likely to target a given set of co-expressed genes. Our basic approach is to simultaneously predict cis regulatory modules (CRMs) associated with a given gene set and quantify the enrichment for combinatorial subsets of transcription factor binding site motifs comprising the hypothesized TRM within these predicted CRMs. As a model system, we have examined a TRM experimentally demonstrated to drive the expression of two genes in a sub-population of cells in the developing Drosophila mesoderm, the somatic muscle founder cells. This TRM was previously hypothesized to be a general mode of regulation for genes expressed in this cell population. In contrast, the present analyses suggest that a modified form of this cis regulatory code applies to only a subset of founder cell genes, those whose gene expression responds to specific genetic perturbations in a similar manner to the gene on which the original model was based. We have confirmed this hypothesis by experimentally discovering six (out of 12 tested) new CRMs driving expression in the embryonic mesoderm, four of which drive expression in founder cells. Although genome sequences and much gene expression data are readily available, the determination of sets of transcription factors regulating particular gene expression patterns remains a problem of fundamental importance. 
Tissue-specific gene expression in developing animals is regulated through the combinatorial interactions of transcription factors with DNA regulatory elements termed cis regulatory modules (CRMs). Although genetic and biochemical experiments allow the identification of transcription factors and CRMs, those experiments are laborious and time-consuming. Philippakis et al. introduce a new approach (termed “CodeFinder”) for quantifying the enrichment for particular combinations of transcription factor binding site motifs within predicted CRMs associated with a given gene set of interest, identified from gene expression data. The authors' analyses allowed them to discover a specific combination of transcription factor binding site motifs that constitute a core cis regulatory code for expression of a particular subset of genes in muscle founder cells, an embryonic cell population in the developing fruit fly (Drosophila melanogaster) mesoderm, and also led them to the discovery and subsequent experimental validation of novel, tissue-specific CRMs. Importantly, the CodeFinder approach is generally applicable, and thus could be used to support, refute, or refine a known or hypothesized cis regulatory code for any biological system or genome of interest.
Affiliation(s)
- Anthony A Philippakis
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts, United States of America
- Harvard University Graduate Biophysics Program, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Brian W Busser
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Stephen S Gisselbrecht
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Fangxue Sherry He
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts, United States of America
| | - Beatriz Estrada
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Alan M Michelson
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- * To whom correspondence should be addressed. E-mail: (AMM), (MLB)
| | - Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Harvard/MIT Division of Health Sciences and Technology (HST), Harvard Medical School, Boston, Massachusetts, United States of America
- Harvard University Graduate Biophysics Program, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- * To whom correspondence should be addressed (AMM, MLB).
|
249
|
Chen Z, Ji M, Hou P, Lu Z. Exo-Dye-based assay for rapid, inexpensive, and sensitive detection of DNA-binding proteins. Biochem Biophys Res Commun 2006; 345:1254-63. [PMID: 16716262 DOI: 10.1016/j.bbrc.2006.05.012] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 04/26/2006] [Accepted: 05/02/2006] [Indexed: 10/24/2022]
Abstract
We report herein a rapid, inexpensive, and sensitive technique for detecting sequence-specific DNA-binding proteins. In this technique, the common exonuclease III (ExoIII) footprinting assay is coupled with simple SYBR Green I staining to monitor the activity of DNA-binding proteins. We named this technique the ExoIII-Dye-based assay. In this assay, a duplex probe is designed to detect a DNA-binding protein. One side of the probe contains one protein-binding site, and the other side carries five protruding bases at the 3' end to protect it from ExoIII digestion. If the target protein is present, it binds to the binding site of the probe and physically hinders ExoIII, protecting the duplex probe from digestion. SYBR Green I then binds to the intact probe, producing high fluorescence intensity. In contrast, in the absence of the target protein, the naked duplex probe is degraded by ExoIII and SYBR Green I is released, producing low fluorescence intensity. In this study, we used the technique to successfully detect the transcription factor NF-kappaB in crude cell extracts. It can also be used to evaluate the binding affinity of NF-kappaB. The technique therefore has wide potential application in research, medical diagnosis, and drug discovery.
Affiliation(s)
- Zaozao Chen
- Chien-Shiung Wu Laboratory, Department of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
|
250
|
|