1
Wu Q, Li Y, Wang Q, Zhao X, Sun D, Liu B. Identification of DNA motif pairs on paired sequences based on composite heterogeneous graph. Front Genet 2024; 15:1424085. [PMID: 38952710 PMCID: PMC11215013 DOI: 10.3389/fgene.2024.1424085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Accepted: 05/22/2024] [Indexed: 07/03/2024]
Abstract
Motivation: The interaction between DNA motifs (DNA motif pairs) influences gene expression through partnership or competition in the process of gene regulation. Potential chromatin interactions between different DNA motifs have been implicated in various diseases. However, current methods for identifying DNA motif pairs rely on the recognition of single DNA motifs or probabilities, which may result in locally optimal solutions and can be sensitive to the choice of initial values. A method for precisely identifying DNA motif pairs is still lacking. Results: Here, we propose a novel computational method for predicting DNA Motif Pairs based on Composite Heterogeneous Graph (MPCHG). This approach leverages a composite heterogeneous graph model to identify DNA motif pairs on paired sequences. Compared with existing methods, MPCHG greatly improves the accuracy of motif prediction. Furthermore, the predicted DNA motifs show higher DNase accessibility than the background sequences. Notably, the two DNA motifs forming a pair exhibit functional consistency. Importantly, the interacting TF pairs obtained from predicted DNA motif pairs were significantly enriched with known interacting TF pairs, suggesting their potential contribution to chromatin interactions. Collectively, we believe that these identified DNA motif pairs hold substantial implications for revealing gene transcriptional regulation under long-range chromatin interactions.
Affiliation(s)
- Qiuqin Wu
  - School of Mathematics, Shandong University, Jinan, China
- Yang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH, United States
- Qi Wang
  - School of Mathematics, Shandong University, Jinan, China
- Xiaoyu Zhao
  - School of Mathematics, Shandong University, Jinan, China
- Duanchen Sun
  - School of Mathematics, Shandong University, Jinan, China
- Bingqiang Liu
  - School of Mathematics, Shandong University, Jinan, China
2
Recio PS, Mitra NJ, Shively CA, Song D, Jaramillo G, Lewis KS, Chen X, Mitra R. Zinc cluster transcription factors frequently activate target genes using a non-canonical half-site binding mode. Nucleic Acids Res 2023; 51:5006-5021. [PMID: 37125648 PMCID: PMC10250231 DOI: 10.1093/nar/gkad320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 04/11/2023] [Accepted: 04/14/2023] [Indexed: 05/02/2023]
Abstract
Gene expression changes are orchestrated by transcription factors (TFs), which bind to DNA to regulate gene expression. It remains surprisingly difficult to predict basic features of the transcriptional process, including in vivo TF occupancy. Existing thermodynamic models of TF function are often not concordant with experimental measurements, suggesting undiscovered biology. Here, we analyzed one of the most well-studied TFs, the yeast zinc cluster Gal4, constructed a Shea-Ackers thermodynamic model to describe its binding, and compared the results of this model to experimentally measured Gal4p binding in vivo. We found that at many promoters, the model predicted no Gal4p binding, yet substantial binding was observed. These outlier promoters lacked canonical binding motifs, and subsequent investigation revealed Gal4p binds unexpectedly to DNA sequences with high densities of its half site (CGG). We confirmed this novel mode of binding through multiple experimental and computational paradigms; we also found most other zinc cluster TFs we tested frequently utilize this binding mode, at 27% of their targets on average. Together, these results demonstrate a novel mode of binding where zinc clusters, the largest class of TFs in yeast, bind DNA sequences with high densities of half sites.
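As an illustration only, the idea of a sequence with a "high density of half sites" can be sketched by counting CGG occurrences on either strand inside a sliding window. The function names, the 50 bp window, and the choice to count the reverse complement (CCG) are assumptions made for this sketch; they are not taken from the paper and do not reproduce its thermodynamic model.

```python
# Minimal sketch (hypothetical helper names): count Gal4 half sites (CGG)
# on either strand and report the densest window. The window size is an
# assumed parameter, not a value from the cited study.

HALF_SITE = "CGG"
HALF_SITE_REVCOMP = "CCG"  # CGG read on the opposite strand


def half_site_positions(seq):
    """Return start indices of CGG or CCG in seq (case-insensitive)."""
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - 2)
        if seq[i:i + 3] in (HALF_SITE, HALF_SITE_REVCOMP)
    ]


def max_half_sites_per_window(seq, window=50):
    """Return the largest number of half sites lying fully inside any
    stretch of `window` consecutive bases."""
    if window < 3:
        raise ValueError("window must be at least 3 bases")
    positions = half_site_positions(seq)
    best = 0
    left = 0
    for right, pos in enumerate(positions):
        # Drop sites from the left until the span from the leftmost kept
        # site to the end of the current site fits in the window.
        while pos + 3 - positions[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best


if __name__ == "__main__":
    example = "CGGAAACCGTTTCGG"
    print(half_site_positions(example))
    print(max_half_sites_per_window(example, window=10))
```

A promoter scan under these assumptions would call `max_half_sites_per_window` on each promoter sequence and rank promoters by the returned count; any threshold for "high density" would have to come from the original study.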
Affiliation(s)
- Pamela S Recio
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Nikhil J Mitra
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Christian A Shively
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- David Song
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Grace Jaramillo
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Kristine Shady Lewis
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Xuhua Chen
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Robi D Mitra
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
  - McDonnell Genome Institute, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
3
Chen H, Yan C, Dhasarathy A, Kladde M, Bai L. Investigating pioneer factor activity and its coordination with chromatin remodelers using integrated synthetic oligo assay. STAR Protoc 2023; 4:102279. [PMID: 37289591 PMCID: PMC10323128 DOI: 10.1016/j.xpro.2023.102279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Revised: 03/24/2023] [Accepted: 04/07/2023] [Indexed: 06/10/2023]
Abstract
Chromatin accessibility is regulated by pioneer factors (PFs) and chromatin remodelers (CRs). Here, we present a protocol, based on integrated synthetic oligonucleotide libraries in yeast, to systematically interrogate the nucleosome-displacing activities of PFs and their coordination with CRs. We describe steps for designing oligonucleotide sequences, constructing yeast libraries, measuring nucleosome configurations, and data analyses. This approach potentially can be adapted for use in higher eukaryotes to investigate the activities of many types of chromatin-associated factors. For complete details on the use and execution of this protocol, please refer to Yan et al. [1] and Chen et al. [2].
Affiliation(s)
- Hengye Chen
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Chao Yan
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
- Archana Dhasarathy
  - Department of Biomedical Sciences, University of North Dakota School of Medicine and Health Sciences, Grand Forks, ND 58201, USA
- Michael Kladde
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL 32610, USA
  - UF Health Cancer Center, University of Florida, Gainesville, FL 32610, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
  - Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA
4
Hung PH, Liao CW, Ko FH, Tsai HK, Leu JY. Differential Hsp90-dependent gene expression is strain-specific and common among yeast strains. iScience 2023; 26:106635. [PMID: 37138775 PMCID: PMC10149407 DOI: 10.1016/j.isci.2023.106635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 02/21/2023] [Accepted: 04/05/2023] [Indexed: 05/05/2023]
Abstract
Enhanced phenotypic diversity increases a population's likelihood of surviving catastrophic conditions. Hsp90, an essential molecular chaperone and a central network hub in eukaryotes, has been observed to suppress or enhance the effects of genetic variation on phenotypic diversity in response to environmental cues. Because many Hsp90-interacting genes are involved in signaling transduction pathways and transcriptional regulation, we tested how common Hsp90-dependent differential gene expression is in natural populations. Many genes exhibited Hsp90-dependent strain-specific differential expression in five diverse yeast strains. We further identified transcription factors (TFs) potentially contributing to variable expression. We found that on Hsp90 inhibition or environmental stress, activities or abundances of Hsp90-dependent TFs varied among strains, resulting in differential strain-specific expression of their target genes, which consequently led to phenotypic diversity. We provide evidence that individual strains can readily display specific Hsp90-dependent gene expression, suggesting that the evolutionary impacts of Hsp90 are widespread in nature.
Affiliation(s)
- Po-Hsiang Hung
  - Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei 115, Taiwan
  - Institute of Molecular Biology, Academia Sinica, Taipei 115, Taiwan
  - Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
- Chia-Wei Liao
  - Institute of Molecular Biology, Academia Sinica, Taipei 115, Taiwan
- Fu-Hsuan Ko
  - Institute of Molecular Biology, Academia Sinica, Taipei 115, Taiwan
- Huai-Kuang Tsai
  - Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei 115, Taiwan
  - Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
  - Corresponding author
- Jun-Yi Leu
  - Genome and Systems Biology Degree Program, National Taiwan University and Academia Sinica, Taipei 115, Taiwan
  - Institute of Molecular Biology, Academia Sinica, Taipei 115, Taiwan
  - Corresponding author
5
Xu H, Li C, Xu C, Zhang J. Chance promoter activities illuminate the origins of eukaryotic intergenic transcriptions. Nat Commun 2023; 14:1826. [PMID: 37005399 PMCID: PMC10067814 DOI: 10.1038/s41467-023-37610-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/04/2022] [Accepted: 03/23/2023] [Indexed: 04/04/2023]
Abstract
It is debated whether the pervasive intergenic transcription from eukaryotic genomes has functional significance or simply reflects the promiscuity of RNA polymerases. We approach this question by comparing chance promoter activities with the expression levels of intergenic regions in the model eukaryote Saccharomyces cerevisiae. We build a library of over 10^5 strains, each carrying a 120-nucleotide, chromosomally integrated, completely random sequence driving the potential transcription of a barcode. Quantifying the RNA concentration of each barcode in two environments reveals that 41-63% of random sequences have significant, albeit usually low, promoter activities. Therefore, even in eukaryotes, where the presence of chromatin is thought to repress transcription, chance transcription is prevalent. We find that only 1-5% of yeast intergenic transcriptions are unattributable to chance promoter activities or neighboring gene expressions, and these transcriptions exhibit higher-than-expected environment-specificity. These findings suggest that only a minute fraction of intergenic transcription is functional in yeast.
Affiliation(s)
- Haiqing Xu
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Department of Biology, Stanford University, Stanford, CA, USA
- Chuan Li
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Microsoft, Redmond, WA, USA
- Chuan Xu
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Bio-X Institutes, Shanghai Jiao Tong University, Shanghai, China
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
6
Kharerin H, Bai L. Predicting nucleosome positioning using statistical equilibrium models in budding yeast. STAR Protoc 2023; 4:101926. [PMID: 36520634 PMCID: PMC10442889 DOI: 10.1016/j.xpro.2022.101926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 10/20/2022] [Accepted: 11/21/2022] [Indexed: 12/15/2022]
Abstract
We present a protocol using thermodynamic models to predict nucleosome positioning with transcription factors (TFs) and chromatin remodelers. We describe step-by-step approaches to annotate genome-wide nucleosome-depleted regions (NDRs), compute nucleosome and TF occupancy, optimize parameters, and evaluate model performance. These models identify nucleosome-displacing TFs in the budding yeast genome and predict the locations and sizes of NDRs solely based on DNA sequence and TF motifs. The protocol can be applied to all organisms with prior knowledge of TF motifs. For complete details on the use and execution of this protocol, please refer to Kharerin and Bai (2021) [1].
Affiliation(s)
- Hungyo Kharerin
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, USA
  - Department of Physics, The Pennsylvania State University, University Park, PA, USA
7
Origin recognition complex harbors an intrinsic nucleosome remodeling activity. Proc Natl Acad Sci U S A 2022; 119:e2211568119. [PMID: 36215487 PMCID: PMC9586268 DOI: 10.1073/pnas.2211568119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Nucleosomes package the entire eukaryotic genome, yet enzymes need access to the DNA for numerous metabolic activities, such as replication and transcription. Eukaryotic origins of replication in Saccharomyces cerevisiae are AT rich and are generally nucleosome free for the binding of ORC (origin recognition complex). However, the nucleosome-free region often undergoes expansion during G1/S phase, presumably to make room for MCM double-hexamer formation that nucleates the 11-subunit helicase, CMG (Cdc45, Mcm2–7, GINS). While nucleosome remodelers could perform this function, in vitro studies indicate that nucleosome remodeling may be intrinsic to the replication machinery. Indeed, we find here that ORC contains an intrinsic nucleosome remodeling activity that is capable of ATP-stimulated removal of H2A-H2B from nucleosomes. Eukaryotic DNA replication is initiated at multiple chromosomal sites known as origins of replication that are specifically recognized by the origin recognition complex (ORC) containing multiple ATPase sites. In budding yeast, ORC binds to specific DNA sequences known as autonomously replicating sequences (ARSs) that are mostly nucleosome depleted. However, nucleosomes may still inhibit the licensing of some origins by occluding ORC binding and subsequent MCM helicase loading. Using purified proteins and single-molecule visualization, we find here that the ORC can eject histones from a nucleosome in an ATP-dependent manner. The ORC selectively evicts H2A-H2B dimers but leaves the (H3-H4)2 tetramer on DNA. It also discriminates canonical H2A from the H2A.Z variant, evicting the former while retaining the latter. Finally, the bromo-adjacent homology (BAH) domain of the Orc1 subunit is essential for ORC-mediated histone eviction. These findings suggest that the ORC is a bona fide nucleosome remodeler that functions to create a local chromatin environment optimal for origin activity.
8
Liska O, Bohár B, Hidas A, Korcsmáros T, Papp B, Fazekas D, Ari E. TFLink: an integrated gateway to access transcription factor-target gene interactions for multiple species. Database (Oxford) 2022; 2022:6702175. [PMID: 36124642 PMCID: PMC9480832 DOI: 10.1093/database/baac083] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 05/30/2022] [Revised: 08/06/2022] [Accepted: 09/06/2022] [Indexed: 12/01/2022]
Abstract
Analysis of transcriptional regulatory interactions and their comparisons across multiple species are crucial for progress in various fields in biology, from functional genomics to the evolution of signal transduction pathways. However, despite the rapidly growing body of data on regulatory interactions in several eukaryotes, no databases exist to provide curated high-quality information on transcription factor-target gene interactions for multiple species. Here, we address this gap by introducing the TFLink gateway, which uniquely provides experimentally explored and highly accurate information on transcription factor-target gene interactions (∼12 million), nucleotide sequences and genomic locations of transcription factor binding sites (∼9 million) for human and six model organisms: mouse, rat, zebrafish, fruit fly, worm and yeast by integrating 10 resources. TFLink provides user-friendly access to data on transcription factor-target gene interactions, interactive network visualizations and transcription factor binding sites, with cross-links to several other databases. Besides containing accurate information on transcription factors, with a clear labelling of the type/volume of the experiments (small-scale or high-throughput), the source database and the original publications, TFLink also provides a wealth of standardized regulatory data available for download in multiple formats. The database offers easy access to high-quality data for wet-lab researchers, supplies data for gene set enrichment analyses and facilitates systems biology and comparative gene regulation studies. Database URL: https://tflink.net/
Affiliation(s)
- Orsolya Liska
  - HCEMM-BRC Metabolic Systems Biology Research Group, Temesvári krt. 62, Szeged 6726, Hungary
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH), Temesvári krt. 62, Szeged 6726, Hungary
  - Department of Genetics, ELTE Eötvös Loránd University, Pázmány P. stny. 1/C, Budapest 1117, Hungary
  - Doctoral School of Biology, University of Szeged, Közép fasor 52, Szeged 6726, Hungary
- Balázs Bohár
  - Department of Genetics, ELTE Eötvös Loránd University, Pázmány P. stny. 1/C, Budapest 1117, Hungary
  - Earlham Institute, Colney Ln, Norwich NR4 7UZ, UK
- András Hidas
  - Department of Genetics, ELTE Eötvös Loránd University, Pázmány P. stny. 1/C, Budapest 1117, Hungary
  - Institute of Aquatic Ecology, Centre for Ecological Research, Eötvös Loránd Research Network (ELKH), Karolina út 29, Budapest 1113, Hungary
- Tamás Korcsmáros
  - Earlham Institute, Colney Ln, Norwich NR4 7UZ, UK
  - Quadram Institute Bioscience, Norwich Research Park, Norwich NR4 7UQ, UK
  - Faculty of Medicine, Imperial College London, South Kensington Campus, London SW7 2AZ, UK
- Balázs Papp
  - HCEMM-BRC Metabolic Systems Biology Research Group, Temesvári krt. 62, Szeged 6726, Hungary
  - Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH), Temesvári krt. 62, Szeged 6726, Hungary
- Dávid Fazekas
  - Department of Genetics, ELTE Eötvös Loránd University, Pázmány P. stny. 1/C, Budapest 1117, Hungary
  - Earlham Institute, Colney Ln, Norwich NR4 7UZ, UK
- Eszter Ari
  - Corresponding author: Tel: +36 1 372 2500 ext: 8691
9
Nucleosome-directed replication origin licensing independent of a consensus DNA sequence. Nat Commun 2022; 13:4947. [PMID: 35999198 PMCID: PMC9399094 DOI: 10.1038/s41467-022-32657-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/12/2022] [Accepted: 08/09/2022] [Indexed: 02/08/2023]
Abstract
The numerous enzymes and cofactors involved in eukaryotic DNA replication are conserved from yeast to human, and the budding yeast Saccharomyces cerevisiae (S.c.) has been a useful model organism for these studies. However, there is a gap in our knowledge of why replication origins in higher eukaryotes do not use a consensus DNA sequence as found in S.c. Using in vitro reconstitution and single-molecule visualization, we show here that S.c. origin recognition complex (ORC) stably binds nucleosomes and that ORC-nucleosome complexes have the intrinsic ability to load the replicative helicase MCM double hexamers onto adjacent nucleosome-free DNA regardless of sequence. Furthermore, we find that Xenopus laevis nucleosomes can substitute for yeast ones in engaging with ORC. Combined with re-analyses of genome-wide ORC binding data, our results lead us to propose that the yeast origin recognition machinery contains the cryptic capacity to bind nucleosomes near a nucleosome-free region and license origins, and that this nucleosome-directed origin licensing paradigm generalizes to all eukaryotes. Most eukaryotes do not use a consensus DNA sequence as binding sites for the origin recognition complex (ORC) to initiate DNA replication, however budding yeast do. Here the authors show S. cerevisiae ORC can bind nucleosomes near nucleosome-free regions and recruit replicative helicases to form a pre-replication complex independent of the DNA sequence.
10
Kang Y, Jung WJ, Brent MR. Predicting which genes will respond to transcription factor perturbations. G3 (Bethesda) 2022; 12:jkac144. [PMID: 35666184 PMCID: PMC9339286 DOI: 10.1093/g3journal/jkac144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2022] [Accepted: 05/25/2022] [Indexed: 11/13/2022]
Abstract
The ability to predict which genes will respond to the perturbation of a transcription factor serves as a benchmark for our systems-level understanding of transcriptional regulatory networks. In previous work, machine learning models have been trained to predict static gene expression levels in a biological sample by using data from the same or similar samples, including data on their transcription factor binding locations, histone marks, or DNA sequence. We report on a different challenge: training machine learning models to predict which genes will respond to the perturbation of a transcription factor without using any data from the perturbed cells. We find that existing transcription factor location data (ChIP-seq) from human cells have very little detectable utility for predicting which genes will respond to perturbation of a transcription factor. Features of genes, including their preperturbation expression level and expression variation, are very useful for predicting responses to perturbation of any transcription factor. This shows that some genes are poised to respond to transcription factor perturbations and others are resistant, shedding light on why it has been so difficult to predict responses from binding locations. Certain histone marks, including H3K4me1 and H3K4me3, have some predictive power when located downstream of the transcription start site. However, the predictive power of histone marks is much less than that of gene expression level and expression variation. Sequence-based or epigenetic properties of genes strongly influence their tendency to respond to direct transcription factor perturbations, partially explaining the oft-noted difficulty of predicting responsiveness from transcription factor binding location data. These molecular features are largely reflected in and summarized by the gene's expression level and expression variation. Code is available at https://github.com/BrentLab/TFPertRespExplainer.
Affiliation(s)
- Yiming Kang
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63108, USA
- Wooseok J Jung
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63108, USA
- Michael R Brent
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63108, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
11
Shih CH, Fay J. Cis-regulatory variants affect gene expression dynamics in yeast. eLife 2021; 10:e68469. [PMID: 34369376 PMCID: PMC8367379 DOI: 10.7554/elife.68469] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/17/2021] [Accepted: 08/06/2021] [Indexed: 12/14/2022]
Abstract
Evolution of cis-regulatory sequences depends on how they affect gene expression and motivates both the identification and prediction of cis-regulatory variants responsible for expression differences within and between species. While much progress has been made in relating cis-regulatory variants to expression levels, the timing of gene activation and repression may also be important to the evolution of cis-regulatory sequences. We investigated allele-specific expression (ASE) dynamics within and between Saccharomyces species during the diauxic shift and found appreciable cis-acting variation in gene expression dynamics. Within-species ASE is associated with intergenic variants, and ASE dynamics are more strongly associated with insertions and deletions than ASE levels. To refine these associations, we used a high-throughput reporter assay to test promoter regions and individual variants. Within the subset of regions that recapitulated endogenous expression, we identified and characterized cis-regulatory variants that affect expression dynamics. Between species, chimeric promoter regions generate novel patterns and indicate constraints on the evolution of gene expression dynamics. We conclude that changes in cis-regulatory sequences can tune gene expression dynamics and that the interplay between expression dynamics and other aspects of expression is relevant to the evolution of cis-regulatory sequences.
Affiliation(s)
- Ching-Hua Shih
  - Department of Biology, University of Rochester, Rochester, United States
- Justin Fay
  - Department of Biology, University of Rochester, Rochester, United States
12
Ma CZ, Brent MR. Inferring TF activities and activity regulators from gene expression data with constraints from TF perturbation data. Bioinformatics 2021; 37:1234-1245. [PMID: 33135076 PMCID: PMC8189679 DOI: 10.1093/bioinformatics/btaa947] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/03/2020] [Revised: 09/26/2020] [Accepted: 10/27/2020] [Indexed: 12/20/2022]
Abstract
Motivation: The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Many methods of inferring TF activity from gene expression data have been described, but due to the lack of appropriate large-scale datasets, systematic and objective validation has not been possible until now. Results: We systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. We find that expression data in which the activities of individual TFs have been perturbed are both necessary and sufficient for obtaining good performance. To a considerable extent, control strengths inferred using expression data from one growth condition carry over to other conditions, so the control strength matrices derived here can be used by others. Finally, we apply these methods to gain insight into the upstream factors that regulate the activities of yeast TFs Gcr2, Gln3, Gcn4 and Msn2. Availability and implementation: Evaluation code and data are available at https://doi.org/10.5281/zenodo.4050573. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cynthia Z Ma
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
- Michael R Brent
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
13
|
Kharerin H, Bai L. Thermodynamic modeling of genome-wide nucleosome depleted regions in yeast. PLoS Comput Biol 2021; 17:e1008560. [PMID: 33428627 PMCID: PMC7822557 DOI: 10.1371/journal.pcbi.1008560] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/16/2020] [Revised: 01/22/2021] [Accepted: 11/25/2020] [Indexed: 01/09/2023] Open
Abstract
Nucleosome positioning in the genome is essential for the regulation of many nuclear processes. We currently have limited capability to predict nucleosome positioning in vivo, especially the locations and sizes of nucleosome depleted regions (NDRs). Here, we present a thermodynamic model that incorporates the intrinsic affinity of histones, competitive binding of sequence-specific factors, and nucleosome remodeling to predict nucleosome positioning in budding yeast. The model shows that the intrinsic affinity of histones, at near-saturating histone concentration, is not sufficient in generating NDRs in the genome. However, the binding of a few factors, especially RSC towards GC-rich and poly(A/T) sequences, allows us to predict ~66% of genome-wide NDRs. The model also shows that nucleosome remodeling activity is required to predict the correct NDR sizes. The validity of the model was further supported by the agreement between the predicted and the measured nucleosome positioning upon factor deletion or on exogenous sequences introduced into yeast. Overall, our model quantitatively evaluated the impact of different genetic components on NDR formation and illustrated the vital roles of sequence-specific factors and nucleosome remodeling in this process.
Nucleosome is the basic unit of chromatin, containing 147 base-pairs of DNA wrapped around a histone core. The positioning of nucleosomes, i.e., which parts of DNA are inside nucleosome and which parts are nucleosome-free, is highly regulated. In particular, regulatory sequences tend to be exposed in nucleosome-depleted regions (NDRs), and such exposure is crucial for a variety of processes including DNA replication, repair, and gene expression. Here, we used a thermodynamics model to predict nucleosome positioning on the yeast genome. The model shows that the intrinsic sequence preference of histones is not sufficient in generating NDRs. In contrast, binding of a few transcription factors, especially RSC, is largely responsible for NDR formation. Nucleosome remodeling activity is also required in the model to recapitulate the NDR sizes. This model contributes to our understanding of the mechanisms that regulate nucleosome positioning. It can also be used to predict nucleosome positioning in mutant yeast or on novel DNA sequences.
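The competition between histones and sequence-specific factors described in this abstract can be illustrated with a textbook single-site Boltzmann calculation. This is a deliberately reduced sketch, not the published model: it ignores the 147 bp nucleosome footprint, overlapping configurations and remodeling, and the function name `state_probabilities`, the energies (in units of kT) and the concentrations are all hypothetical.

```python
import math


def state_probabilities(energies, concentrations):
    """Boltzmann probabilities for mutually exclusive states of one site.

    energies: binding energy of each binder in units of kT
              (more negative means tighter binding).
    concentrations: relative concentration of each binder.
    The unbound ("free") state is the reference with statistical weight 1.
    """
    weights = {
        name: concentrations[name] * math.exp(-energies[name])
        for name in energies
    }
    partition = 1.0 + sum(weights.values())
    probabilities = {name: w / partition for name, w in weights.items()}
    probabilities["free"] = 1.0 / partition
    return probabilities


# Hypothetical site where only the histone octamer can bind.
histone_only = state_probabilities({"nucleosome": -3.0}, {"nucleosome": 1.0})

# Same site, now also a strong motif for a competing factor.
with_factor = state_probabilities(
    {"nucleosome": -3.0, "factor": -5.0},
    {"nucleosome": 1.0, "factor": 1.0},
)
```

With these illustrative numbers, nucleosome occupancy is high when histones bind alone and falls once the tighter-binding factor competes for the same site, which is the qualitative effect the abstract attributes to sequence-specific factors.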
Affiliation(s)
- Hungyo Kharerin
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Lu Bai
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Department of Physics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
|
14
|
Aditham AK, Markin CJ, Mokhtari DA, DelRosso N, Fordyce PM. High-Throughput Affinity Measurements of Transcription Factor and DNA Mutations Reveal Affinity and Specificity Determinants. Cell Syst 2020; 12:112-127.e11. [PMID: 33340452 DOI: 10.1016/j.cels.2020.11.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/24/2020] [Revised: 08/08/2020] [Accepted: 11/24/2020] [Indexed: 01/28/2023]
Abstract
Transcription factors (TFs) bind regulatory DNA to control gene expression, and mutations to either TFs or DNA can alter binding affinities to rewire regulatory networks and drive phenotypic variation. While studies have profiled energetic effects of DNA mutations extensively, we lack similar information for TF variants. Here, we present STAMMP (simultaneous transcription factor affinity measurements via microfluidic protein arrays), a high-throughput microfluidic platform enabling quantitative characterization of hundreds of TF variants simultaneously. Measured affinities for ∼210 mutants of a model yeast TF (Pho4) interacting with 9 oligonucleotides (>1,800 Kds) reveal that many combinations of mutations to poorly conserved TF residues and nucleotides flanking the core binding site alter but preserve physiological binding, providing a mechanism by which combinations of mutations in cis and trans could modulate TF binding to tune occupancies during evolution. Moreover, biochemical double-mutant cycles across the TF-DNA interface reveal molecular mechanisms driving recognition, linking sequence to function. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
Affiliation(s)
- Arjun K Aditham
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA
- Craig J Markin
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Daniel A Mokhtari
  - Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Nicole DelRosso
  - Graduate Program in Biophysics, Stanford University, Stanford, CA 94305, USA
- Polly M Fordyce
  - Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
|
15
|
Renganaath K, Chong R, Day L, Kosuri S, Kruglyak L, Albert FW. Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 2020; 9:e62669. [PMID: 33179598 PMCID: PMC7685706 DOI: 10.7554/elife.62669] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/02/2020] [Accepted: 11/11/2020] [Indexed: 02/06/2023] Open
Abstract
Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Affiliation(s)
- Kaushik Renganaath
  - Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
- Rockie Chong
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
- Laura Day
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
  - Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
- Sriram Kosuri
  - Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
- Leonid Kruglyak
  - Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
  - Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
  - Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
- Frank W Albert
  - Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
|
16
|
The relation between crosstalk and gene regulation form revisited. PLoS Comput Biol 2020; 16:e1007642. [PMID: 32097416 PMCID: PMC7059967 DOI: 10.1371/journal.pcbi.1007642] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/20/2019] [Revised: 03/06/2020] [Accepted: 01/08/2020] [Indexed: 01/11/2023] Open
Abstract
Genes differ in the frequency at which they are expressed and in the form of regulation used to control their activity. In particular, positive or negative regulation can lead to activation of a gene in response to an external signal. Previous works proposed that the form of regulation of a gene correlates with its frequency of usage: positive regulation when the gene is frequently expressed and negative regulation when infrequently expressed. Such network design means that, in the absence of their regulators, the genes are found in their least required activity state, hence regulatory intervention is often necessary. Due to the multitude of genes and regulators, spurious binding and unbinding events, called “crosstalk”, could occur. To determine how the form of regulation affects the global crosstalk in the network, we used a mathematical model that includes multiple regulators and multiple target genes. We found that crosstalk depends non-monotonically on the availability of regulators. Our analysis showed that excess use of regulation entailed by the formerly suggested network design caused high crosstalk levels in a large part of the parameter space. We therefore considered the opposite ‘idle’ design, where the default unregulated state of genes is their frequently required activity state. We found that ‘idle’ design minimized the use of regulation and thus minimized crosstalk. In addition, we estimated global crosstalk of S. cerevisiae using transcription factors binding data. We demonstrated that even partial network data could suffice to estimate its global crosstalk, suggesting its applicability to additional organisms. We found that S. cerevisiae estimated crosstalk is lower than that of a random network, suggesting that natural selection reduces crosstalk. In summary, our study highlights a new type of protein production cost which is typically overlooked: that of regulatory interference caused by the presence of excess regulators in the cell. It demonstrates the importance of whole-network descriptions, which could show effects missed by single-gene models.
Genes differ in the frequency at which they are expressed and in the form of regulation used to control their activity. The basic level of regulation is mediated by different types of DNA-binding proteins, where each type regulates particular gene(s). We distinguish between two basic forms of regulation: positive, if a gene is activated by the binding of its regulatory protein, and negative, if it is active unless bound by its regulatory protein. Due to the multitude of genes and regulators, spurious binding and unbinding events, called “crosstalk”, could occur. How does the form of regulation, positive or negative, affect the extent of regulatory crosstalk? To address this question, we used a mathematical model integrating many genes and many regulators. As intuition suggests, we found that in most of the parameter space, crosstalk increased with the availability of regulators. We propose that crosstalk is usually reduced when networks are designed such that minimal regulation is needed, which we call the ‘idle’ design. In other words: a frequently needed gene will use negative regulation and conversely, a scarcely needed gene will employ positive regulation. In both cases, the requirement for the regulators is minimized. In addition, we demonstrate how crosstalk can be calculated from available datasets and discuss the technical challenges in such calculation, specifically data incompleteness.
|
17
|
Homotypic cooperativity and collective binding are determinants of bHLH specificity and function. Proc Natl Acad Sci U S A 2019; 116:16143-16152. [PMID: 31341088 DOI: 10.1073/pnas.1818015116] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/26/2022] Open
Abstract
Eukaryotic cells express transcription factor (TF) paralogues that bind to nearly identical DNA sequences in vitro but bind at different genomic loci and perform different functions in vivo. Predicting how 2 paralogous TFs bind in vivo using DNA sequence alone is an important open problem. Here, we analyzed 2 yeast bHLH TFs, Cbf1p and Tye7p, which have highly similar binding preferences in vitro, yet bind at almost completely nonoverlapping target loci in vivo. We dissected the determinants of specificity for these 2 proteins by making a number of chimeric TFs in which we swapped different domains of Cbf1p and Tye7p and determined the effects on in vivo binding and cellular function. From these experiments, we learned that the Cbf1p dimer achieves its specificity by binding cooperatively with other Cbf1p dimers bound nearby. In contrast, we found that Tye7p achieves its specificity by binding cooperatively with 3 other DNA-binding proteins, Gcr1p, Gcr2p, and Rap1p. Remarkably, most promoters (63%) that are bound by Tye7p do not contain a consensus Tye7p binding site. Using this information, we were able to build simple models to accurately discriminate bound and unbound genomic loci for both Cbf1p and Tye7p. We then successfully reprogrammed the human bHLH NPAS2 to bind Cbf1p in vivo targets and a Tye7p target intergenic region to be bound by Cbf1p. These results demonstrate that the genome-wide binding targets of paralogous TFs can be discriminated using sequence information, and provide lessons about TF specificity that can be applied across the phylogenetic tree.
|
18
|
Datta V, Hannenhalli S, Siddharthan R. ChIPulate: A comprehensive ChIP-seq simulation pipeline. PLoS Comput Biol 2019; 15:e1006921. [PMID: 30897079 PMCID: PMC6445533 DOI: 10.1371/journal.pcbi.1006921] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/12/2018] [Revised: 04/02/2019] [Accepted: 03/04/2019] [Indexed: 12/17/2022] Open
Abstract
ChIP-seq (Chromatin Immunoprecipitation followed by sequencing) is a high-throughput technique to identify genomic regions that are bound in vivo by a particular protein, e.g., a transcription factor (TF). Biological factors, such as chromatin state, indirect and cooperative binding, as well as experimental factors, such as antibody quality, cross-linking, and PCR biases, are known to affect the outcome of ChIP-seq experiments. However, the relative impact of these factors on inferences made from ChIP-seq data is not entirely clear. Here, via a detailed ChIP-seq simulation pipeline, ChIPulate, we assess the impact of various biological and experimental sources of variation on several outcomes of a ChIP-seq experiment, viz., the recoverability of the TF binding motif, accuracy of TF-DNA binding detection, the sensitivity of inferred TF-DNA binding strength, and number of replicates needed to confidently infer binding strength. We find that the TF motif can be recovered despite poor and non-uniform extraction and PCR amplification efficiencies. The recovery of the motif is, however, affected to a larger extent by the fraction of sites that are either cooperatively or indirectly bound. Importantly, our simulations reveal that the number of ChIP-seq replicates needed to accurately measure in vivo occupancy at high-affinity sites is larger than the recommended community standards. Our results establish statistical limits on the accuracy of inferences of protein-DNA binding from ChIP-seq and suggest that increasing the mean extraction efficiency, rather than amplification efficiency, would better improve sensitivity. The source code and instructions for running ChIPulate can be found at https://github.com/vishakad/chipulate.
DNA-binding proteins perform many key roles in biology, such as transcriptional regulation of gene expression and chromatin modification. ChIP-seq (Chromatin immunoprecipitation followed by high-throughput sequencing) is a widely used experimental technique to identify DNA-binding sites of specific proteins of interest, within cells, genome-wide. DNA fragments from genomic regions that are bound by a protein of interest, often a transcription factor (TF), are selectively extracted using specific antibodies, amplified using PCR, and sequenced. The sequences are mapped to the reference genome. Regions where many sequences map, called “peaks”, are used to infer the location of TF-bound loci (peaks), in vivo occupancy at those loci, and the sequence pattern (motif) to which the TF shows a binding affinity. But measurements of TF occupancy and motif inference are vulnerable to several biological and experimental sources of variation that are poorly understood and difficult to assess directly. Here, we simulate key steps of the ChIP-seq protocol with the aim of estimating the relative effects of various sources of variations on motif inference and binding affinity estimations. Besides providing specific insights and recommendations, we provide a general framework to simulate sequence reads in a ChIP-seq experiment, which should considerably aid in the development of software aimed at analyzing ChIP-seq data.
Affiliation(s)
- Vishaka Datta
  - Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, TIFR, Bengaluru, Karnataka, India
- Sridhar Hannenhalli
  - Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
- Rahul Siddharthan
  - The Institute of Mathematical Sciences/HBNI, Taramani, Chennai, India
|
19
|
Sorrells TR, Johnson AN, Howard CJ, Britton CS, Fowler KR, Feigerle JT, Weil PA, Johnson AD. Intrinsic cooperativity potentiates parallel cis-regulatory evolution. eLife 2018; 7:e37563. [PMID: 30198843 PMCID: PMC6173580 DOI: 10.7554/elife.37563] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/16/2018] [Accepted: 09/09/2018] [Indexed: 12/27/2022] Open
Abstract
Convergent evolutionary events in independent lineages provide an opportunity to understand why evolution favors certain outcomes over others. We studied such a case where a large set of genes (those coding for the ribosomal proteins) gained cis-regulatory sequences for a particular transcription regulator (Mcm1) in independent fungal lineages. We present evidence that these gains occurred because Mcm1 shares a mechanism of transcriptional activation with an ancestral regulator of the ribosomal protein genes, Rap1. Specifically, we show that Mcm1 and Rap1 have the inherent ability to cooperatively activate transcription through contacts with the general transcription factor TFIID. Because the two regulatory proteins share a common interaction partner, the presence of one ancestral cis-regulatory sequence can 'channel' random mutations into functional sites for the second regulator. At a genomic scale, this type of intrinsic cooperativity can account for a pattern of parallel evolution involving the fixation of hundreds of substitutions.
Affiliation(s)
- Trevor R Sorrells
  - Department of Biochemistry and Biophysics, Tetrad Graduate Program, University of California, San Francisco, United States; Department of Microbiology and Immunology, University of California, San Francisco, United States
- Amanda N Johnson
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Conor J Howard
  - Department of Biochemistry and Biophysics, Tetrad Graduate Program, University of California, San Francisco, United States; Department of Microbiology and Immunology, University of California, San Francisco, United States
- Candace S Britton
  - Department of Biochemistry and Biophysics, Tetrad Graduate Program, University of California, San Francisco, United States; Department of Microbiology and Immunology, University of California, San Francisco, United States
- Kyle R Fowler
  - Department of Biochemistry and Biophysics, Tetrad Graduate Program, University of California, San Francisco, United States; Department of Microbiology and Immunology, University of California, San Francisco, United States
- Jordan T Feigerle
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee
- P Anthony Weil
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Alexander D Johnson
  - Department of Biochemistry and Biophysics, Tetrad Graduate Program, University of California, San Francisco, United States; Department of Microbiology and Immunology, University of California, San Francisco, United States
|
20
|
Detection of cooperatively bound transcription factor pairs using ChIP-seq peak intensities and expectation maximization. PLoS One 2018; 13:e0199771. [PMID: 30016330 PMCID: PMC6049898 DOI: 10.1371/journal.pone.0199771] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/17/2018] [Accepted: 06/13/2018] [Indexed: 11/19/2022] Open
Abstract
Transcription factors (TFs) often work cooperatively, where the binding of one TF to DNA enhances the binding affinity of a second TF to a nearby location. Such cooperative binding is important for activating gene expression from promoters and enhancers in both prokaryotic and eukaryotic cells. Existing methods to detect cooperative binding of a TF pair rely on analyzing the sequence that is bound. We propose a method that uses, instead, only ChIP-seq peak intensities and an expectation maximization (CPI-EM) algorithm. We validate our method using ChIP-seq data from cells where one of a pair of TFs under consideration has been genetically knocked out. Our algorithm relies on our observation that cooperative TF-TF binding is correlated with weak binding of one of the TFs, which we demonstrate in a variety of cell types, including E. coli, S. cerevisiae and M. musculus cells. We show that this method performs significantly better than a predictor based only on the ChIP-seq peak distance of the TFs under consideration. This suggests that peak intensities contain information that can help detect the cooperative binding of a TF pair. CPI-EM also outperforms an existing sequence-based algorithm in detecting cooperative binding. The CPI-EM algorithm is available at https://github.com/vishakad/cpi-em.
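To give a feel for how expectation maximization can separate weakly from strongly bound peaks using intensities alone, the following sketch fits a generic two-component Gaussian mixture in pure Python. It is an illustration of the EM idea only and is not the CPI-EM model, whose actual distributions and features are defined in the repository linked in the abstract; the function names, starting values and the log-intensity data are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_components(values, iterations=50, min_var=1e-3):
    """Fit a two-component 1D Gaussian mixture by expectation maximization."""
    means = [min(values), max(values)]  # crude but deterministic start
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability that each value came from each component.
        resp = []
        for x in values:
            joint = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            mass = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / mass
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / mass
            variances[k] = max(min_var, spread)  # floor avoids a collapsed component
            weights[k] = mass / len(values)
    # resp holds the assignments from the last E-step.
    return means, variances, weights, resp


# Hypothetical log peak intensities: five weak peaks and five strong peaks.
log_intensities = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
means, variances, weights, resp = em_two_components(log_intensities)
```

Each row of `resp` gives the posterior probability that a peak belongs to the low-intensity or the high-intensity component; a classifier in this spirit would threshold such posteriors.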
|
21
|
Systematic Study of Nucleosome-Displacing Factors in Budding Yeast. Mol Cell 2018; 71:294-305.e4. [PMID: 30017582 DOI: 10.1016/j.molcel.2018.06.017] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 12/29/2017] [Revised: 05/04/2018] [Accepted: 06/07/2018] [Indexed: 12/11/2022]
Abstract
Nucleosomes present a barrier for the binding of most transcription factors (TFs). However, special TFs known as nucleosome-displacing factors (NDFs) can access embedded sites and cause the depletion of the local nucleosomes as well as repositioning of the neighboring nucleosomes. Here, we developed a novel high-throughput method in yeast to identify NDFs among 104 TFs and systematically characterized the impact of orientation, affinity, location, and copy number of their binding motifs on the nucleosome occupancy. Using this assay, we identified 29 NDF motifs and divided the nuclear TFs into three groups with strong, weak, and no nucleosome-displacing activities. Further studies revealed that tight DNA binding is the key property that underlies NDF activity, and the NDFs may partially rely on the DNA replication to compete with nucleosome. Overall, our study presents a framework to functionally characterize NDFs and elucidate the mechanism of nucleosome invasion.
|
22
|
Freddolino PL, Yang J, Momen-Roknabadi A, Tavazoie S. Stochastic tuning of gene expression enables cellular adaptation in the absence of pre-existing regulatory circuitry. eLife 2018; 7:e31867. [PMID: 29620524 PMCID: PMC5919758 DOI: 10.7554/elife.31867] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 09/09/2017] [Accepted: 04/04/2018] [Indexed: 12/12/2022] Open
Abstract
Cells adapt to familiar changes in their environment by activating predefined regulatory programs that establish adaptive gene expression states. These hard-wired pathways, however, may be inadequate for adaptation to environments never encountered before. Here, we reveal evidence for an alternative mode of gene regulation that enables adaptation to adverse conditions without relying on external sensory information or genetically predetermined cis-regulation. Instead, individual genes achieve optimal expression levels through a stochastic search for improved fitness. By focusing on improving the overall health of the cell, the proposed stochastic tuning mechanism discovers global gene expression states that are fundamentally new and yet optimized for novel environments. We provide experimental evidence for stochastic tuning in the adaptation of Saccharomyces cerevisiae to laboratory-engineered environments that are foreign to its native gene-regulatory network. Stochastic tuning operates locally at individual gene promoters, and its efficacy is modulated by perturbations to chromatin modification machinery.
Affiliation(s)
- Peter L Freddolino
  - Department of Systems Biology, Columbia University, New York City, United States
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York City, United States
- Jamie Yang
  - Department of Systems Biology, Columbia University, New York City, United States
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York City, United States
- Amir Momen-Roknabadi
  - Department of Systems Biology, Columbia University, New York City, United States
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York City, United States
- Saeed Tavazoie
  - Department of Systems Biology, Columbia University, New York City, United States
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York City, United States
|
23
|
Comprehensive, high-resolution binding energy landscapes reveal context dependencies of transcription factor binding. Proc Natl Acad Sci U S A 2018; 115:E3702-E3711. [PMID: 29588420 PMCID: PMC5910820 DOI: 10.1073/pnas.1715888115] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Indexed: 11/18/2022] Open
Abstract
Transcription factors (TFs) are primary regulators of gene expression in cells, where they bind specific genomic target sites to control transcription. Quantitative measurements of TF-DNA binding energies can improve the accuracy of predictions of TF occupancy and downstream gene expression in vivo and shed light on how transcriptional networks are rewired throughout evolution. Here, we present a sequencing-based TF binding assay and analysis pipeline (BET-seq, for Binding Energy Topography by sequencing) capable of providing quantitative estimates of binding energies for more than one million DNA sequences in parallel at high energetic resolution. Using this platform, we measured the binding energies associated with all possible combinations of 10 nucleotides flanking the known consensus DNA target interacting with two model yeast TFs, Pho4 and Cbf1. A large fraction of these flanking mutations change overall binding energies by an amount equal to or greater than consensus site mutations, suggesting that current definitions of TF binding sites may be too restrictive. By systematically comparing estimates of binding energies output by deep neural networks (NNs) and biophysical models trained on these data, we establish that dinucleotide (DN) specificities are sufficient to explain essentially all variance in observed binding behavior, with Cbf1 binding exhibiting significantly more nonadditivity than Pho4. NN-derived binding energies agree with orthogonal biochemical measurements and reveal that dynamically occupied sites in vivo are both energetically and mutationally distant from the highest affinity sites.
|
24
|
Rao S, Chiu TP, Kribelbauer JF, Mann RS, Bussemaker HJ, Rohs R. Systematic prediction of DNA shape changes due to CpG methylation explains epigenetic effects on protein-DNA binding. Epigenetics Chromatin 2018; 11:6. [PMID: 29409522 PMCID: PMC5800008 DOI: 10.1186/s13072-018-0174-4] [Citation(s) in RCA: 51] [Impact Index Per Article: 8.5] [Received: 07/29/2017] [Accepted: 01/15/2018] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: DNA shape analysis has demonstrated the potential to reveal structure-based mechanisms of protein-DNA binding. However, information about the influence of chemical modification of DNA is limited. Cytosine methylation, the most frequent modification, represents the addition of a methyl group at the major groove edge of the cytosine base. In mammalian genomes, cytosine methylation most frequently occurs at CpG dinucleotides. In addition to changing the chemical signature of C/G base pairs, cytosine methylation can affect DNA structure. Since the original discovery of DNA methylation, major efforts have been made to understand its effect from a sequence perspective. Compared to unmethylated DNA, however, little structural information is available for methylated DNA, due to the limited number of experimentally determined structures. To achieve a better mechanistic understanding of the effect of CpG methylation on local DNA structure, we developed a high-throughput method, methyl-DNAshape, for predicting the effect of cytosine methylation on DNA shape. RESULTS: Using our new method, we found that CpG methylation significantly altered local DNA shape. Four DNA shape features (helix twist, minor groove width, propeller twist, and roll) were considered in this analysis. Distinct distributions of effect size were observed for different features. Roll and propeller twist were the DNA shape features most strongly affected by CpG methylation with an effect size depending on the local sequence context. Methylation-induced changes in DNA shape were predictive of the measured rate of cleavage by DNase I and suggest a possible mechanism for some of the methylation sensitivities that were recently observed for human Pbx-Hox complexes. CONCLUSIONS: CpG methylation is an important epigenetic mark in the mammalian genome. Understanding its role in protein-DNA recognition can further our knowledge of gene regulation. Our high-throughput methyl-DNAshape method can be used to predict the effect of cytosine methylation on DNA shape and its subsequent influence on protein-DNA interactions. This approach overcomes the limited availability of experimental DNA structures that contain 5-methylcytosine.
Affiliation(s)
- Satyanarayan Rao
  - Computational Biology and Bioinformatics Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Tsu-Pei Chiu
  - Computational Biology and Bioinformatics Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA
- Judith F Kribelbauer
  - Department of Biological Sciences, Columbia University, New York, NY, 10027, USA; Department of Systems Biology, Columbia University, New York, NY, 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, 10032, USA
- Richard S Mann
  - Department of Systems Biology, Columbia University, New York, NY, 10032, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, 10032, USA; Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, 10027, USA; Department of Neuroscience, Columbia University, New York, NY, 10027, USA
- Harmen J Bussemaker
  - Department of Biological Sciences, Columbia University, New York, NY, 10027, USA; Department of Systems Biology, Columbia University, New York, NY, 10032, USA
- Remo Rohs
  - Computational Biology and Bioinformatics Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA, 90089, USA; Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA; Department of Physics & Astronomy, University of Southern California, Los Angeles, CA, 90089, USA; Department of Computer Science, University of Southern California, Los Angeles, CA, 90089, USA
|
25
|
Jiang M, Gao Z, Wang J, Nurminsky DI. Evidence for a hierarchical transcriptional circuit in Drosophila male germline involving testis-specific TAF and two gene-specific transcription factors, Mod and Acj6. FEBS Lett 2017; 592:46-59. [PMID: 29235675 DOI: 10.1002/1873-3468.12937] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/07/2017] [Revised: 11/30/2017] [Accepted: 12/05/2017] [Indexed: 01/04/2023]
Abstract
To analyze transcription factors involved in gene regulation by testis-specific TAF (tTAF), tTAF-dependent promoters were mapped and analyzed in silico. Core promoters show decreased AT content, paucity of classical promoter motifs, and enrichment with translation control element CAAAATTY. Scanning of putative regulatory regions for known position frequency matrices identified 19 transcription regulators possibly contributing to tTAF-driven gene expression. Decreased male fertility associated with mutation in one of the regulators, Acj6, indicates its involvement in male reproduction. Transcriptome study of testes from male mutants for tTAF, Acj6, and previously characterized tTAF-interacting factor Modulo implies the existence of a regulatory hierarchy of tTAF, Modulo and Acj6, in which Modulo and/or Acj6 regulate one-third of tTAF-dependent genes.
Affiliation(s)
- Mei Jiang
  - Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China; Department of Biochemistry and Molecular Biology, School of Medicine, University of Maryland, Baltimore, MD, USA
- Zhengliang Gao
  - Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China; Advanced Institute of Translational Medicine, Tongji University School of Medicine, Shanghai, China
- Jian Wang
  - Key Laboratory of Aquaculture Resources and Utilization, Ministry of Education, College of Fisheries and Life Science, Shanghai Ocean University, China
- Dmitry I Nurminsky
  - Department of Biochemistry and Molecular Biology, School of Medicine, University of Maryland, Baltimore, MD, USA
|
26
|
Zhou S, Sternglanz R, Neiman AM. Developmentally regulated internal transcription initiation during meiosis in budding yeast. PLoS One 2017; 12:e0188001. [PMID: 29136644 PMCID: PMC5685637 DOI: 10.1371/journal.pone.0188001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 08/16/2017] [Accepted: 10/30/2017] [Indexed: 02/07/2023] Open
Abstract
Sporulation of budding yeast is a developmental process in which cells undergo meiosis to generate stress-resistant progeny. The dynamic nature of the budding yeast meiotic transcriptome has been well established by a number of genome-wide studies. Here we develop an analysis pipeline to systematically identify novel transcription start sites that reside internal to a gene. Application of this pipeline to data from a synchronized meiotic time course reveals over 40 genes that display specific internal initiations in mid-sporulation. Consistent with the time of induction, motif analysis on upstream sequences of these internal transcription start sites reveals a significant enrichment for the binding site of Ndt80, the transcriptional activator of middle sporulation genes. Further examination of one gene, MRK1, demonstrates the Ndt80 binding site is necessary for internal initiation and results in the expression of an N-terminally truncated protein isoform. When the MRK1 paralog RIM11 is downregulated, the MRK1 internal transcript promotes efficient sporulation, indicating functional significance of the internal initiation. Our findings suggest internal transcriptional initiation to be a dynamic, regulated process with potential functional impacts on development.
Affiliation(s)
- Sai Zhou
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY, United States of America
  - Graduate Program in Genetics, Stony Brook University, Stony Brook, NY, United States of America
- Rolf Sternglanz
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY, United States of America
- Aaron M. Neiman
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY, United States of America
|
27
|
Majewska M, Wysokińska H, Kuźma Ł, Szymczyk P. Eukaryotic and prokaryotic promoter databases as valuable tools in exploring the regulation of gene transcription: a comprehensive overview. Gene 2017; 644:38-48. [PMID: 29104165 DOI: 10.1016/j.gene.2017.10.079] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/12/2017] [Revised: 07/26/2017] [Accepted: 10/27/2017] [Indexed: 01/02/2023]
Abstract
The complete exploration of the regulation of gene expression remains one of the top-priority goals for researchers. As the regulation is mainly controlled at the level of transcription by promoters, study on promoters and findings are of great importance. This review summarizes forty selected databases that centralize experimental and theoretical knowledge regarding the organization of promoters, interacting transcription factors (TFs) and microRNAs (miRNAs) in many eukaryotic and prokaryotic species. The presented databases offer researchers valuable support in elucidating the regulation of gene transcription.
Affiliation(s)
- Małgorzata Majewska
  - Department of Biology and Pharmaceutical Botany, Medical University of Lodz, 90-151 Lodz, Poland
- Halina Wysokińska
  - Department of Biology and Pharmaceutical Botany, Medical University of Lodz, 90-151 Lodz, Poland
- Łukasz Kuźma
  - Department of Biology and Pharmaceutical Botany, Medical University of Lodz, 90-151 Lodz, Poland
- Piotr Szymczyk
  - Department of Pharmaceutical Biotechnology, Medical University of Lodz, 90-151 Lodz, Poland
|
28
|
Wong KC. MotifHyades: expectation maximization for de novo DNA motif pair discovery on paired sequences. Bioinformatics 2017. [DOI: 10.1093/bioinformatics/btx381] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022] Open
Affiliation(s)
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
|
29
|
|
30
|
van Dijk D, Sharon E, Lotan-Pompan M, Weinberger A, Segal E, Carey LB. Large-scale mapping of gene regulatory logic reveals context-dependent repression by transcriptional activators. Genome Res 2016; 27:87-94. [PMID: 27965290 PMCID: PMC5204347 DOI: 10.1101/gr.212316.116] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/05/2016] [Accepted: 11/15/2016] [Indexed: 12/31/2022]
Abstract
Transcription factors (TFs) are key mediators that propagate extracellular and intracellular signals through to changes in gene expression profiles. However, the rules by which promoters decode the amount of active TF into target gene expression are not well understood. To determine the mapping between promoter DNA sequence, TF concentration, and gene expression output, we have conducted in budding yeast a large-scale measurement of the activity of thousands of designed promoters at six different levels of TF. We observe that maximum promoter activity is determined by TF concentration and not by the number of binding sites. Surprisingly, the addition of an activator site often reduces expression. A thermodynamic model that incorporates competition between neighboring binding sites for a local pool of TF molecules explains this behavior and accurately predicts both absolute expression and the amount by which addition of a site increases or reduces expression. Taken together, our findings support a model in which neighboring binding sites interact competitively when TF is limiting but otherwise act additively.
Affiliation(s)
- David van Dijk
  - Department of Biological Sciences, Department of Systems Biology, Columbia University, New York, New York 10027, USA; Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
- Eilon Sharon
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
- Maya Lotan-Pompan
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
- Adina Weinberger
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
- Eran Segal
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, 76100 Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, 76100 Rehovot, Israel
- Lucas B Carey
  - Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Spain
|
31
|
Model-based transcriptome engineering promotes a fermentative transcriptional state in yeast. Proc Natl Acad Sci U S A 2016; 113:E7428-E7437. [PMID: 27810962 DOI: 10.1073/pnas.1603577113] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022] Open
Abstract
The ability to rationally manipulate the transcriptional states of cells would be of great use in medicine and bioengineering. We have developed an algorithm, NetSurgeon, which uses genome-wide gene-regulatory networks to identify interventions that force a cell toward a desired expression state. We first validated NetSurgeon extensively on existing datasets. Next, we used NetSurgeon to select transcription factor deletions aimed at improving ethanol production in Saccharomyces cerevisiae cultures that are catabolizing xylose. We reasoned that interventions that move the transcriptional state of cells using xylose toward that of cells producing large amounts of ethanol from glucose might improve xylose fermentation. Some of the interventions selected by NetSurgeon successfully promoted a fermentative transcriptional state in the absence of glucose, resulting in strains with a 2.7-fold increase in xylose import rates, a 4-fold improvement in xylose integration into central carbon metabolism, or a 1.3-fold increase in ethanol production rate. We conclude by presenting an integrated model of transcriptional regulation and metabolic flux that will enable future efforts aimed at improving xylose fermentation to prioritize functional regulators of central carbon metabolism.
|
32
|
Xie B, Horecka J, Chu A, Davis RW, Becker E, Primig M. Ndt80 activates the meiotic ORC1 transcript isoform and SMA2 via a bi-directional middle sporulation element in Saccharomyces cerevisiae. RNA Biol 2016; 13:772-82. [PMID: 27362276 DOI: 10.1080/15476286.2016.1191738] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022] Open
Abstract
The origin of replication complex subunit ORC1 is important for DNA replication. The gene is known to encode a meiotic transcript isoform (mORC1) with an extended 5'-untranslated region (5'-UTR), which was predicted to inhibit protein translation. However, the regulatory mechanism that controls the mORC1 transcript isoform is unknown and no molecular biological evidence for a role of mORC1 in negatively regulating Orc1 protein during gametogenesis is available. By interpreting RNA profiling data obtained with growing and sporulating diploid cells, mitotic haploid cells, and a starving diploid control strain, we determined that mORC1 is a middle meiotic transcript isoform. Regulatory motif predictions and genetic experiments reveal that the activator Ndt80 and its middle sporulation element (MSE) target motif are required for the full induction of mORC1 and the divergently transcribed meiotic SMA2 locus. Furthermore, we find that the MSE-binding negative regulator Sum1 represses both mORC1 and SMA2 during mitotic growth. Finally, we demonstrate that an MSE deletion strain, which cannot induce mORC1, contains abnormally high Orc1 levels during post-meiotic stages of gametogenesis. Our results reveal the regulatory mechanism that controls mORC1, highlighting a novel developmental stage-specific role for the MSE element in bi-directional mORC1/SMA2 gene activation, and correlating mORC1 induction with declining Orc1 protein levels. Because eukaryotic genes frequently encode multiple transcripts possessing 5'-UTRs of variable length, our results are likely relevant for gene expression during development and disease in higher eukaryotes.
Affiliation(s)
- Bingning Xie
  - Inserm U1085 IRSET, Université de Rennes 1, Rennes, France
- Joe Horecka
  - Stanford Genome Technology Center, Palo Alto, CA, USA
- Angela Chu
  - Stanford Genome Technology Center, Palo Alto, CA, USA
- Ronald W Davis
  - Stanford Genome Technology Center, Palo Alto, CA, USA; Departments of Biochemistry and Genetics, Stanford University, Stanford, CA, USA
- Michael Primig
  - Inserm U1085 IRSET, Université de Rennes 1, Rennes, France
|
33
|
Mostovoy Y, Thiemicke A, Hsu TY, Brem RB. The Role of Transcription Factors at Antisense-Expressing Gene Pairs in Yeast. Genome Biol Evol 2016; 8:1748-61. [PMID: 27190003 PMCID: PMC4943177 DOI: 10.1093/gbe/evw104] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022] Open
Abstract
Genes encoded close to one another on the chromosome are often coexpressed, by a mechanism and regulatory logic that remain poorly understood. We surveyed the yeast genome for tandem gene pairs oriented tail-to-head at which expression antisense to the upstream gene was conserved across species. The intergenic region at most such tandem pairs is a bidirectional promoter, shared by the downstream gene mRNA and the upstream antisense transcript. Genomic analyses of these intergenic loci revealed distinctive patterns of transcription factor regulation. Mutation of a given transcription factor verified its role as a regulator in trans of tandem gene pair loci, including the proximally initiating upstream antisense transcript and downstream mRNA and the distally initiating upstream mRNA. To investigate cis-regulatory activity at such a locus, we focused on the stress-induced NAD(P)H dehydratase YKL151C and its downstream neighbor, the metabolic enzyme GPM1. Previous work has implicated the region between these genes in regulation of GPM1 expression; our mutation experiments established its function in rich medium as a repressor in cis of the distally initiating YKL151C sense RNA, and an activator of the proximally initiating YKL151C antisense RNA. Wild-type expression of all three transcripts required the transcription factor Gcr2. Thus, at this locus, the intergenic region serves as a focal point of regulatory input, driving antisense expression and mediating the coordinated regulation of YKL151C and GPM1. Together, our findings implicate transcription factors in the joint control of neighboring genes specialized to opposing conditions and the antisense transcripts expressed between them.
Affiliation(s)
- Yulia Mostovoy
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Present address: Cardiovascular Research Institute, University of California, San Francisco, CA
- Alexander Thiemicke
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Program in Molecular Medicine, Friedrich-Schiller-Universität, Jena, Germany; Present address: Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN
- Tiffany Y Hsu
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Present address: Graduate Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA
- Rachel B Brem
  - Department of Molecular and Cell Biology, University of California, Berkeley, California; Present address: Buck Institute for Research on Aging, Novato, CA
|
34
|
Abstract
Transcriptional control of gene expression requires interactions between the cis-regulatory elements (CREs) controlling gene promoters. We developed a sensitive computational method to identify CRE combinations with conserved spacing that does not require genome alignments. When applied to seven sensu stricto and sensu lato Saccharomyces species, 80% of the predicted interactions displayed some evidence of combinatorial transcriptional behavior in several existing datasets including: (1) chromatin immunoprecipitation data for colocalization of transcription factors, (2) gene expression data for coexpression of predicted regulatory targets, and (3) gene ontology databases for common pathway membership of predicted regulatory targets. We tested several predicted CRE interactions with chromatin immunoprecipitation experiments in a wild-type strain and strains in which a predicted cofactor was deleted. Our experiments confirmed that transcription factor (TF) occupancy at the promoters of the CRE combination target genes depends on the predicted cofactor while occupancy of other promoters is independent of the predicted cofactor. Our method has the additional advantage of identifying regulatory differences between species. By analyzing the S. cerevisiae and S. bayanus genomes, we identified differences in combinatorial cis-regulation between the species and showed that the predicted changes in gene regulation explain several of the species-specific differences seen in gene expression datasets. In some instances, the same CRE combinations appear to regulate genes involved in distinct biological processes in the two different species. The results of this research demonstrate that (1) combinatorial cis-regulation can be inferred by multi-genome analysis and (2) combinatorial cis-regulation can explain differences in gene expression between species.
|
35
|
ChEC-seq kinetics discriminates transcription factor binding sites by DNA sequence and shape in vivo. Nat Commun 2015; 6:8733. [PMID: 26490019 PMCID: PMC4618392 DOI: 10.1038/ncomms9733] [Citation(s) in RCA: 114] [Impact Index Per Article: 12.7] [Received: 05/27/2015] [Accepted: 09/25/2015] [Indexed: 12/31/2022] Open
Abstract
Chromatin endogenous cleavage (ChEC) uses fusion of a protein of interest to micrococcal nuclease (MNase) to target calcium-dependent cleavage to specific genomic loci in vivo. Here we report the combination of ChEC with high-throughput sequencing (ChEC-seq) to map budding yeast transcription factor (TF) binding. Temporal analysis of ChEC-seq data reveals two classes of sites for TFs, one displaying rapid cleavage at sites with robust consensus motifs and the second showing slow cleavage at largely unique sites with low-scoring motifs. Sites with high-scoring motifs also display asymmetric cleavage, indicating that ChEC-seq provides information on the directionality of TF-DNA interactions. Strikingly, similar DNA shape patterns are observed regardless of motif strength, indicating that the kinetics of ChEC-seq discriminates DNA recognition through sequence and/or shape. We propose that time-resolved ChEC-seq detects both high-affinity interactions of TFs with consensus motifs and sites preferentially sampled by TFs during diffusion and sliding. In chromatin endogenous cleavage (ChEC), micrococcal nuclease (MNase) is fused to a protein of interest and its cleavage is thus targeted to specific genomic loci in vivo. Here, the authors show that time-resolved ChEC-seq (high-throughput sequencing after ChEC) can detect DNA shape patterns regardless of motif strength.
|
36
|
Contribution of Sequence Motif, Chromatin State, and DNA Structure Features to Predictive Models of Transcription Factor Binding in Yeast. PLoS Comput Biol 2015; 11:e1004418. [PMID: 26291518 PMCID: PMC4546298 DOI: 10.1371/journal.pcbi.1004418] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 11/19/2014] [Accepted: 06/29/2015] [Indexed: 11/19/2022] Open
Abstract
Transcription factor (TF) binding is determined by the presence of specific sequence motifs (SM) and chromatin accessibility, where the latter is influenced by both chromatin state (CS) and DNA structure (DS) properties. Although SM, CS, and DS have been used to predict TF binding sites, a predictive model that jointly considers CS and DS has not been developed to predict either TF-specific binding or general binding properties of TFs. Using budding yeast as model, we found that machine learning classifiers trained with either CS or DS features alone perform better in predicting TF-specific binding compared to SM-based classifiers. In addition, simultaneously considering CS and DS further improves the accuracy of the TF binding predictions, indicating the highly complementary nature of these two properties. The contributions of SM, CS, and DS features to binding site predictions differ greatly between TFs, allowing TF-specific predictions and potentially reflecting different TF binding mechanisms. In addition, a "TF-agnostic" predictive model based on three DNA “intrinsic properties” (in silico predicted nucleosome occupancy, major groove geometry, and dinucleotide free energy) that can be calculated from genomic sequences alone has performance that rivals the model incorporating experiment-derived data. This intrinsic property model allows prediction of binding regions not only across TFs, but also across DNA-binding domain families with distinct structural folds. Furthermore, these predicted binding regions can help identify TF binding sites that have a significant impact on target gene expression. Because the intrinsic property model allows prediction of binding regions across DNA-binding domain families, it is TF agnostic and likely describes general binding potential of TFs. Thus, our findings suggest that it is feasible to establish a TF agnostic model for identifying functional regulatory regions in potentially any sequenced genome. 
Identification of transcription factor binding sites based on sequence motifs is typically accompanied by a high false positive rate. Increasing evidence suggests that there are many other factors besides DNA sequence that may affect the binding and interaction of TFs with DNA. Through the integration of sequence motif, chromatin state, and DNA structure properties, we show that TF binding can be better predicted. Moreover, considering chromatin state and DNA structure properties simultaneously yields a significant improvement. While the binding of some TFs can be readily predicted using either chromatin state information or DNA structure, other TFs need both. Thus, our findings provide insights on how different histone modifications and DNA structure properties may influence the binding of a particular TF and thus how TFs regulate gene expression. These features are referred to as sequence “intrinsic properties” because they can be predicted from sequences alone. These intrinsic properties can be used to build a TF binding prediction model that has a similar performance to considering all features. Moreover, the intrinsic property model allows TFBS predictions not only across TFs, but also across DNA-binding domain families that are present in most eukaryotes, suggesting that the model likely can be used across species.
|
37
|
Coetzee SG, Coetzee GA, Hazelett DJ. motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinformatics 2015; 31:3847-9. [PMID: 26272984 PMCID: PMC4653394 DOI: 10.1093/bioinformatics/btv470] [Citation(s) in RCA: 135] [Impact Index Per Article: 15.0] [Received: 05/12/2015] [Accepted: 08/06/2015] [Indexed: 11/13/2022] Open
Abstract
Summary: Functional annotation represents a key step toward the understanding and interpretation of germline and somatic variation as revealed by genome-wide association studies (GWAS) and The Cancer Genome Atlas (TCGA), respectively. GWAS have revealed numerous genetic risk variants residing in non-coding DNA associated with complex diseases. For sequences that lie within enhancers or promoters of transcription, it is not straightforward to assess the effects of variants on likely transcription factor binding sites. Consequently, we introduce motifbreakR, which allows the biologist to judge whether the sequence surrounding a polymorphism or mutation is a good match, and how much information is gained or lost in one allele of the polymorphism or mutation relative to the other. MotifbreakR is flexible, giving a choice of algorithms for interrogation of genomes with motifs from many public sources that users can choose from. MotifbreakR can predict effects for novel or previously described variants in public databases, making it suitable for tasks beyond the scope of its original design. Lastly, it can be used to interrogate any genome curated within Bioconductor. Availability and implementation: https://github.com/Simon-Coetzee/MotifBreakR, www.bioconductor.org. Contact: dennis.hazelett@cshs.org
Affiliation(s)
- Simon G Coetzee
  - Bioinformatics and Computational Biology Research Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Gerhard A Coetzee
  - Department of Urology and Preventive Medicine, USC Norris Comprehensive Cancer Center, Los Angeles, CA, USA
- Dennis J Hazelett
  - Bioinformatics and Computational Biology Research Center, Cedars-Sinai Medical Center, Los Angeles, CA, USA
|
38
|
Global alterations of the transcriptional landscape during yeast growth and development in the absence of Ume6-dependent chromatin modification. Mol Genet Genomics 2015; 290:2031-46. [PMID: 25957495 DOI: 10.1007/s00438-015-1051-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 02/24/2015] [Accepted: 04/17/2015] [Indexed: 10/23/2022]
Abstract
Chromatin modification enzymes are important regulators of gene expression and some are evolutionarily conserved from yeast to human. Saccharomyces cerevisiae is a major model organism for genome-wide studies that aim at the identification of target genes under the control of conserved epigenetic regulators. Ume6 interacts with the upstream repressor site 1 (URS1) and represses transcription by recruiting both the conserved histone deacetylase Rpd3 (through the co-repressor Sin3) and the chromatin-remodeling factor Isw2. Cells lacking Ume6 are defective in growth, stress response, and meiotic development. RNA profiling studies and in vivo protein-DNA binding assays identified mRNAs or transcript isoforms that are directly repressed by Ume6 in mitosis. However, a comprehensive understanding of the transcriptional alterations, which underlie the complex ume6Δ mutant phenotype during fermentation, respiration, or sporulation, is lacking. We report the protein-coding transcriptome of a diploid MAT a/α wild-type and ume6/ume6 mutant strains cultured in rich media with glucose or acetate as a carbon source, or sporulation-inducing medium. We distinguished direct from indirect effects on mRNA levels by combining GeneChip data with URS1 motif predictions and published high-throughput in vivo Ume6-DNA binding data. To gain insight into the molecular interactions between successive waves of Ume6-dependent meiotic genes, we integrated expression data with information on protein networks. Our work identifies novel Ume6 repressed genes during growth and development and reveals a strong effect of the carbon source on the derepression pattern of transcripts in growing and developmentally arrested ume6/ume6 mutant cells. Since yeast is a useful model organism for chromatin-mediated effects on gene expression, our results provide a rich source for further genetic and molecular biological work on the regulation of cell growth and cell differentiation in eukaryotes.
|
39
|
Liu Y, Stuparevic I, Xie B, Becker E, Law MJ, Primig M. The conserved histone deacetylase Rpd3 and the DNA binding regulator Ume6 repress BOI1's meiotic transcript isoform during vegetative growth in Saccharomyces cerevisiae. Mol Microbiol 2015; 96:861-74. [DOI: 10.1111/mmi.12976] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Accepted: 02/17/2015] [Indexed: 12/26/2022]
Affiliation(s)
- Yuchen Liu
- Inserm U1085 IRSET; Inserm; 35042 Rennes France
- Emmanuelle Becker
- Inserm U1085 IRSET; Inserm; 35042 Rennes France
- Departement des sciences de la vie et de l'environnement; Université de Rennes 1; 35042 Rennes France
- Michael J. Law
- School of Osteopathic Medicine; Rowan University; Stratford NJ 08084 USA
|
40
|
Maier EJ, Haynes BC, Gish SR, Wang ZA, Skowyra ML, Marulli AL, Doering TL, Brent MR. Model-driven mapping of transcriptional networks reveals the circuitry and dynamics of virulence regulation. Genome Res 2015; 25:690-700. [PMID: 25644834 PMCID: PMC4417117 DOI: 10.1101/gr.184101.114] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2014] [Accepted: 01/15/2015] [Indexed: 01/09/2023]
Abstract
Key steps in understanding a biological process include identifying genes that are involved and determining how they are regulated. We developed a novel method for identifying transcription factors (TFs) involved in a specific process and used it to map regulation of the key virulence factor of a deadly fungus—its capsule. The map, built from expression profiles of 41 TF mutants, includes 20 TFs not previously known to regulate virulence attributes. It also reveals a hierarchy comprising executive, midlevel, and “foreman” TFs. When grouped by temporal expression pattern, these TFs explain much of the transcriptional dynamics of capsule induction. Phenotypic analysis of TF deletion mutants revealed complex relationships among virulence factors and virulence in mice. These resources and analyses provide the first integrated, systems-level view of capsule regulation and biosynthesis. Our methods dramatically improve the efficiency with which transcriptional networks can be analyzed, making genomic approaches accessible to laboratories focused on specific physiological processes.
Affiliation(s)
- Ezekiel J Maier
- Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, Missouri 63108, USA; Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, USA
- Brian C Haynes
- Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, Missouri 63108, USA; Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, USA
- Stacey R Gish
- Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, Missouri 63110, USA
- Zhuo A Wang
- Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, Missouri 63110, USA
- Michael L Skowyra
- Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, Missouri 63110, USA
- Alyssa L Marulli
- Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, Missouri 63110, USA
- Tamara L Doering
- Department of Molecular Microbiology, Washington University in St. Louis School of Medicine, St. Louis, Missouri 63110, USA
- Michael R Brent
- Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, Missouri 63108, USA; Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, USA; Department of Genetics, Washington University in St. Louis School of Medicine, St. Louis, Missouri 63110, USA
|
41
|
Lardenois A, Stuparevic I, Liu Y, Law MJ, Becker E, Smagulova F, Waern K, Guilleux MH, Horecka J, Chu A, Kervarrec C, Strich R, Snyder M, Davis RW, Steinmetz LM, Primig M. The conserved histone deacetylase Rpd3 and its DNA binding subunit Ume6 control dynamic transcript architecture during mitotic growth and meiotic development. Nucleic Acids Res 2014; 43:115-28. [PMID: 25477386 PMCID: PMC4288150 DOI: 10.1093/nar/gku1185] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023] Open
Abstract
It was recently reported that the sizes of many mRNAs change when budding yeast cells exit mitosis and enter the meiotic differentiation pathway. These differences were attributed to length variations of their untranslated regions. The function of UTRs in protein translation is well established. However, the mechanism controlling the expression of distinct transcript isoforms during mitotic growth and meiotic development is unknown. In this study, we order developmentally regulated transcript isoforms according to their expression at specific stages during meiosis and gametogenesis, as compared to vegetative growth and starvation. We employ regulatory motif prediction, in vivo protein-DNA binding assays, genetic analyses and monitoring of epigenetic amino acid modification patterns to identify a novel role for Rpd3 and Ume6, two components of a histone deacetylase complex already known to repress early meiosis-specific genes in dividing cells, in mitotic repression of meiosis-specific transcript isoforms. Our findings classify developmental stage-specific early, middle and late meiotic transcript isoforms, and they point to a novel HDAC-dependent control mechanism for flexible transcript architecture during cell growth and differentiation. Since Rpd3 is highly conserved and ubiquitously expressed in many tissues, our results are likely relevant for development and disease in higher eukaryotes.
Affiliation(s)
- Igor Stuparevic
- Inserm U1085-Irset, Université de Rennes 1, Rennes, F-35042, France
- Yuchen Liu
- Inserm U1085-Irset, Université de Rennes 1, Rennes, F-35042, France
- Michael J Law
- School of Osteopathic Medicine, Rowan University, Stratford, NJ 08084, USA
- Fatima Smagulova
- Inserm U1085-Irset, Université de Rennes 1, Rennes, F-35042, France
- Karl Waern
- Department of Genetics, Stanford University, Stanford, CA 94395, USA
- Joe Horecka
- Stanford Genome Technology Center, Palo Alto, CA 94304, USA
- Angela Chu
- Stanford Genome Technology Center, Palo Alto, CA 94304, USA
- Randy Strich
- School of Osteopathic Medicine, Rowan University, Stratford, NJ 08084, USA
- Mike Snyder
- Department of Genetics, Stanford University, Stanford, CA 94395, USA
- Ronald W Davis
- Stanford Genome Technology Center, Palo Alto, CA 94304, USA; Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
- Lars M Steinmetz
- European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Michael Primig
- Inserm U1085-Irset, Université de Rennes 1, Rennes, F-35042, France
|
42
|
Gopinath RK, You ST, Chien KY, Swamy KBS, Yu JS, Schuyler SC, Leu JY. The Hsp90-dependent proteome is conserved and enriched for hub proteins with high levels of protein-protein connectivity. Genome Biol Evol 2014; 6:2851-65. [PMID: 25316598 PMCID: PMC4224352 DOI: 10.1093/gbe/evu226] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Hsp90 is one of the most abundant and conserved proteins in the cell. Reduced levels or activity of Hsp90 causes defects in many cellular processes and also reveals genetic and nongenetic variation within a population. Despite information about Hsp90 protein–protein interactions, a global view of the Hsp90-regulated proteome in yeast is unavailable. To investigate the degree of dependency of individual yeast proteins on Hsp90, we used the “stable isotope labeling by amino acids in cell culture” method coupled with mass spectrometry to quantify around 4,000 proteins in low-Hsp90 cells. We observed that 904 proteins changed in their abundance by more than 1.5-fold. When compared with the transcriptome of the same population of cells, two-thirds of the misregulated proteins were observed to be affected posttranscriptionally, of which the majority were downregulated. Further analyses indicated that the downregulated proteins are highly conserved and assume central roles in cellular networks with a high number of protein interacting partners, suggesting that Hsp90 buffers genetic and nongenetic variation through regulating protein network hubs. The downregulated proteins were enriched for essential proteins previously not known to be Hsp90-dependent. Finally, we observed that downregulation of transcription factors and mating pathway components by attenuating Hsp90 function led to decreased target gene expression and pheromone response, respectively, providing a direct link between observed proteome regulation and cellular phenotypes.
Affiliation(s)
- Rajaneesh Karimpurath Gopinath
- Molecular and Cell Biology, Taiwan International Graduate Program, Graduate Institute of Life Sciences, National Defense Medical Center and Academia Sinica Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
- Shu-Ting You
- Molecular and Cell Biology, Taiwan International Graduate Program, Graduate Institute of Life Sciences, National Defense Medical Center and Academia Sinica Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
- Kun-Yi Chien
- Molecular Medicine Research Center, Department of Biochemistry and Molecular Biology, College of Medicine, Chang Gung University, Tao-Yuan, Taiwan
- Jau-Song Yu
- Department of Cell and Molecular Biology, College of Medicine, Chang Gung University, Tao-Yuan, Taiwan
- Scott C Schuyler
- Department of Biomedical Sciences, College of Medicine, Chang Gung University, Tao-Yuan, Taiwan
- Jun-Yi Leu
- Institute of Molecular Biology, Academia Sinica, Taipei, Taiwan
|
43
|
Mayhew D, Mitra RD. Transcription factor regulation and chromosome dynamics during pseudohyphal growth. Mol Biol Cell 2014; 25:2669-76. [PMID: 25009286 PMCID: PMC4148256 DOI: 10.1091/mbc.e14-04-0871] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023] Open
Abstract
A multiplexed analysis of the transcriptional regulation of yeast pseudohyphal growth recorded the binding of 28 different transcription factors with barcoded transposons. A core set of target genes is identified, and a process of DNA looping at the FLO11 locus that provides transcriptional memory for expression of the gene is described. Pseudohyphal growth is a developmental pathway seen in some strains of yeast in which cells form multicellular filaments in response to environmental stresses. We used multiplexed transposon “Calling Cards” to record the genome-wide binding patterns of 28 transcription factors (TFs) in nitrogen-starved yeast. We identified TF targets relevant for pseudohyphal growth, producing a detailed map of its regulatory network. Using tools from graph theory, we identified 14 TFs that lie at the center of this network, including Flo8, Mss11, and Mfg1, which bind as a complex. Surprisingly, the DNA-binding preferences for these key TFs were unknown. Using Calling Card data, we predicted the in vivo DNA-binding motif for the Flo8-Mss11-Mfg1 complex and validated it using a reporter assay. We found that this complex binds several important targets, including FLO11, at both their promoter and termination sequences. We demonstrated that this binding pattern is the result of DNA looping, which regulates the transcription of these targets and is stabilized by an interaction with the nuclear pore complex. This looping provides yeast cells with a transcriptional memory, enabling them more rapidly to execute the filamentous growth program when nitrogen starved if they had been previously exposed to this condition.
Affiliation(s)
- David Mayhew
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108
- Robi D Mitra
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108
|
44
|
Liu G, Marras A, Nielsen J. The future of genome-scale modeling of yeast through integration of a transcriptional regulatory network. QUANTITATIVE BIOLOGY 2014. [DOI: 10.1007/s40484-014-0027-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
|
45
|
Abstract
The term “transcriptional network” refers to the mechanism(s) that underlies coordinated expression of genes, typically involving transcription factors (TFs) binding to the promoters of multiple genes, and individual genes controlled by multiple TFs. A multitude of studies in the last two decades have aimed to map and characterize transcriptional networks in the yeast Saccharomyces cerevisiae. We review the methodologies and accomplishments of these studies, as well as challenges we now face. For most yeast TFs, data have been collected on their sequence preferences, in vivo promoter occupancy, and gene expression profiles in deletion mutants. These systematic studies have led to the identification of new regulators of numerous cellular functions and shed light on the overall organization of yeast gene regulation. However, many yeast TFs appear to be inactive under standard laboratory growth conditions, and many of the available data were collected using techniques that have since been improved. Perhaps as a consequence, comprehensive and accurate mapping among TF sequence preferences, promoter binding, and gene expression remains an open challenge. We propose that the time is ripe for renewed systematic efforts toward a complete mapping of yeast transcriptional regulatory mechanisms.
|
46
|
Jia C, Carson MB, Wang Y, Lin Y, Lu H. A new exhaustive method and strategy for finding motifs in ChIP-enriched regions. PLoS One 2014; 9:e86044. [PMID: 24475069 PMCID: PMC3901781 DOI: 10.1371/journal.pone.0086044] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2013] [Accepted: 12/04/2013] [Indexed: 12/22/2022] Open
Abstract
ChIP-seq, which combines chromatin immunoprecipitation (ChIP) with next-generation parallel sequencing, allows for the genome-wide identification of protein-DNA interactions. This technology poses new challenges for the development of novel motif-finding algorithms and methods for determining exact protein-DNA binding sites from ChIP-enriched sequencing data. State-of-the-art heuristic, exhaustive search algorithms are largely limited to identifying short (l, d) motifs (l ≤ 10, d ≤ 2) contained in ChIP-enriched regions. In this work we have developed a more powerful exhaustive method (FMotif) for finding long (l, d) motifs in DNA sequences. In conjunction with our method, we have adopted a simple ChIP-enriched sampling strategy for finding these motifs in large-scale ChIP-enriched regions. Empirical studies on synthetic samples and applications using several ChIP data sets, including 16 TF (transcription factor) ChIP-seq data sets and five TF ChIP-exo data sets, have demonstrated that our proposed method is capable of finding these motifs with high efficiency and accuracy. The source code for FMotif is available at http://211.71.76.45/FMotif/.
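To make the (l, d) motif problem mentioned in this abstract concrete, here is a minimal brute-force sketch in Python. It is not FMotif. It assumes the common planted-motif definition, in which a candidate string of length l counts as an (l, d) motif if every input sequence contains at least one substring within Hamming distance d of it. The function names (`hamming`, `occurs_within`, `find_ld_motifs`) and the toy sequences are hypothetical and chosen only for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate: str, sequence: str, d: int) -> bool:
    """True if some window of the sequence is within d mismatches of the candidate."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def find_ld_motifs(sequences: list[str], l: int, d: int) -> list[str]:
    """Enumerate all 4**l DNA strings and keep those found (up to d mismatches)
    in every sequence. Assumed definition of an (l, d) motif; illustrative only."""
    motifs = []
    for letters in product("ACGT", repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in sequences):
            motifs.append(candidate)
    return motifs


if __name__ == "__main__":
    # Hypothetical toy input: each sequence carries a one-mismatch variant of ACGT.
    toy = ["TTACGTTT", "GGAGGTGG", "CCACCTCC"]
    print(find_ld_motifs(toy, l=4, d=1))
```

Because the candidate space grows as 4^l, this naive enumeration is only practical for small l, which is one way to see why exhaustive approaches tend to be restricted to short motifs.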
Affiliation(s)
- Caiyan Jia
- School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis, Beijing Jiaotong University, Beijing, China
- Department of Bioengineering/Bioinformatics, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Matthew B. Carson
- Center for Healthcare Studies, Institute for Public Health and Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Yang Wang
- School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis, Beijing Jiaotong University, Beijing, China
- Youfang Lin
- School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis, Beijing Jiaotong University, Beijing, China
- Hui Lu
- Department of Bioengineering/Bioinformatics, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Shanghai Institute of Medical Genetics, Shanghai Children’s Hospital, Shanghai JiaoTong University, Shanghai, China
|
47
|
Genetics of single-cell protein abundance variation in large yeast populations. Nature 2014; 506:494-7. [PMID: 24402228 PMCID: PMC4285441 DOI: 10.1038/nature12904] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2013] [Accepted: 11/19/2013] [Indexed: 12/26/2022]
Abstract
Variation among individuals arises in part from differences in DNA sequences, but the genetic basis for variation in most traits, including common diseases, remains only partly understood. Many DNA variants influence phenotypes by altering the expression level of one or multiple genes. The effects of such variants can be detected as expression quantitative trait loci (eQTL) 1. Traditional eQTL mapping requires large-scale genotype and gene expression data for each individual in the study sample, which limits sample sizes to hundreds of individuals in both humans and model organisms and reduces statistical power 2–6. Consequently, many eQTL are likely missed, especially those with smaller effects 7. Further, most studies use mRNA rather than protein abundance as the measure of gene expression. Studies that have used mass-spectrometry proteomics 8–13 reported surprising differences between eQTL and protein QTL (pQTL) for the same genes 9,10, but these studies have been even more limited in scope. Here, we introduce a powerful method for identifying genetic loci that influence protein expression in the yeast Saccharomyces cerevisiae. We measure single-cell protein abundance through the use of green-fluorescent-protein tags in very large populations of genetically variable cells, and use pooled sequencing to compare allele frequencies across the genome in thousands of individuals with high vs. low protein abundance. We applied this method to 160 genes and detected many more loci per gene than previous studies. We also observed closer correspondence between loci that influence protein abundance and loci that influence mRNA abundance of a given gene. Most loci cluster at hotspot locations that influence multiple proteins—in some cases, more than half of those examined. The variants that underlie these hotspots have profound effects on the gene regulatory network and provide insights into genetic variation in cell physiology between yeast strains.
|
48
|
Kasinathan S, Orsi GA, Zentner GE, Ahmad K, Henikoff S. High-resolution mapping of transcription factor binding sites on native chromatin. Nat Methods 2013; 11:203-9. [PMID: 24336359 PMCID: PMC3929178 DOI: 10.1038/nmeth.2766] [Citation(s) in RCA: 134] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/07/2013] [Accepted: 10/28/2013] [Indexed: 12/20/2022]
Abstract
Sequence-specific DNA-binding proteins including transcription factors (TFs) are key determinants of gene regulation and chromatin architecture. Formaldehyde cross-linking and sonication followed by Chromatin ImmunoPrecipitation (X-ChIP) is widely used for profiling of TF binding, but is limited by low resolution and poor specificity and sensitivity. We present a simple protocol that starts with micrococcal nuclease-digested uncross-linked chromatin and is followed by affinity purification of TFs and paired-end sequencing. The resulting ORGANIC (Occupied Regions of Genomes from Affinity-purified Naturally Isolated Chromatin) profiles of Saccharomyces cerevisiae Abf1 and Reb1 provide highly accurate base-pair resolution maps that are not biased toward accessible chromatin, and do not require input normalization. We also demonstrate the high specificity of our method when applied to larger genomes by profiling Drosophila melanogaster GAGA Factor and Pipsqueak. Our results suggest that ORGANIC profiling is a widely applicable high-resolution method for sensitive and specific profiling of direct protein-DNA interactions.
Affiliation(s)
- Sivakanthan Kasinathan
- [1] Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA. [2] Medical Scientist Training Program, University of Washington School of Medicine, Seattle, Washington, USA. [3] Molecular & Cellular Biology Graduate Program, University of Washington, Seattle, Washington, USA
- Guillermo A Orsi
- [1] Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, USA. [2] Centre National de la Recherche Scientifique UMR 218 and Institut Curie, Centre de Recherche, Paris, France
- Gabriel E Zentner
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Kami Ahmad
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, USA
- Steven Henikoff
- [1] Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA. [2] Howard Hughes Medical Institute, Seattle, Washington, USA
|
49
|
Zeigler RD, Cohen BA. Discrimination between thermodynamic models of cis-regulation using transcription factor occupancy data. Nucleic Acids Res 2013; 42:2224-34. [PMID: 24288374 PMCID: PMC3936720 DOI: 10.1093/nar/gkt1230] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Many studies have identified binding preferences for transcription factors (TFs), but few have yielded predictive models of how combinations of transcription factor binding sites generate specific levels of gene expression. Synthetic promoters have emerged as powerful tools for generating quantitative data to parameterize models of combinatorial cis-regulation. We sought to improve the accuracy of such models by quantifying the occupancy of TFs on synthetic promoters in vivo and incorporating these data into statistical thermodynamic models of cis-regulation. Using chromatin immunoprecipitation-seq, we measured the occupancy of Gcn4 and Cbf1 in synthetic promoter libraries composed of binding sites for Gcn4, Cbf1, Met31/Met32 and Nrg1. We measured the occupancy of these two TFs and the expression levels of all promoters in two growth conditions. Models parameterized using only expression data predicted expression but failed to identify several interactions between TFs. In contrast, models parameterized with occupancy and expression data predicted expression data, and also revealed Gcn4 self-cooperativity and a negative interaction between Gcn4 and Nrg1. Occupancy data also allowed us to distinguish between competing regulatory mechanisms for the factor Gcn4. Our framework for combining occupancy and expression data produces predictive models that better reflect the mechanisms underlying combinatorial cis-regulation of gene expression.
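As an illustration of what a statistical thermodynamic model of cis-regulation computes, the sketch below enumerates the bound/unbound states of a small promoter and derives per-site occupancy from Boltzmann-style weights. It is not the authors' model. It assumes a generic formulation in which each site i contributes a weight q_i (association constant times TF concentration) when bound, and each interacting pair of bound sites multiplies the state weight by a cooperativity factor omega. The function name `site_occupancy` and all numeric values are hypothetical.

```python
from itertools import product


def site_occupancy(weights, coop=None):
    """Per-site occupancy probabilities for a promoter with independent or
    pairwise-interacting binding sites.

    weights: list of q_i = K_i * [TF_i] for each site (assumed parameterization).
    coop:    optional dict {(i, j): omega}; omega > 1 models cooperativity,
             omega < 1 models a negative interaction.
    """
    coop = coop or {}
    n = len(weights)
    partition = 0.0          # sum of weights over all 2**n states
    bound_weight = [0.0] * n  # weight of states in which site i is bound

    for state in product((0, 1), repeat=n):
        w = 1.0
        for i, bound in enumerate(state):
            if bound:
                w *= weights[i]
        for (i, j), omega in coop.items():
            if state[i] and state[j]:
                w *= omega
        partition += w
        for i, bound in enumerate(state):
            if bound:
                bound_weight[i] += w

    return [bw / partition for bw in bound_weight]


if __name__ == "__main__":
    # Two hypothetical sites, first without and then with a cooperative interaction.
    print(site_occupancy([1.0, 3.0]))
    print(site_occupancy([1.0, 1.0], coop={(0, 1): 2.0}))
```

In a framework of this kind, measured occupancy constrains the q_i and omega terms directly, which is the sort of information that expression data alone may not pin down.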
Affiliation(s)
- Robert D Zeigler
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, MO 63108, USA
|
50
|
Costanzo MC, Engel SR, Wong ED, Lloyd P, Karra K, Chan ET, Weng S, Paskov KM, Roe GR, Binkley G, Hitz BC, Cherry JM. Saccharomyces genome database provides new regulation data. Nucleic Acids Res 2013; 42:D717-25. [PMID: 24265222 PMCID: PMC3965049 DOI: 10.1093/nar/gkt1158] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the community resource for genomic, gene and protein information about the budding yeast Saccharomyces cerevisiae, containing a variety of functional information about each yeast gene and gene product. We have recently added regulatory information to SGD and present it on a new tabbed section of the Locus Summary entitled 'Regulation'. We are compiling transcriptional regulator-target gene relationships, which are curated from the literature at SGD or imported, with permission, from the YEASTRACT database. For nearly every S. cerevisiae gene, the Regulation page displays a table of annotations showing the regulators of that gene, and a graphical visualization of its regulatory network. For genes whose products act as transcription factors, the Regulation page also shows a table of their target genes, accompanied by a Gene Ontology enrichment analysis of the biological processes in which those genes participate. We additionally synthesize information from the literature for each transcription factor in a free-text Regulation Summary, and provide other information relevant to its regulatory function, such as DNA binding site motifs and protein domains. All of the regulation data are available for querying, analysis and download via YeastMine, the InterMine-based data warehouse system in use at SGD.
Affiliation(s)
- Maria C Costanzo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|