1
Geisberg JV, Moqtaderi Z, Struhl K. Chromatin regulates alternative polyadenylation via the RNA polymerase II elongation rate. Proc Natl Acad Sci U S A 2024; 121:e2405827121. [PMID: 38748572 PMCID: PMC11127049 DOI: 10.1073/pnas.2405827121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 04/15/2024] [Indexed: 05/22/2024] Open
Abstract
The RNA polymerase II (Pol II) elongation rate influences poly(A) site selection, with slow and fast Pol II derivatives causing upstream and downstream shifts, respectively, in poly(A) site utilization. In yeast, depletion of either of the histone chaperones FACT or Spt6 causes an upstream shift of poly(A) site use that strongly resembles the poly(A) profiles of slow Pol II mutant strains. Like slow Pol II mutant strains, FACT- and Spt6-depleted cells exhibit Pol II processivity defects, indicating that both Spt6 and FACT stimulate the Pol II elongation rate. Poly(A) profiles of some genes show atypical downstream shifts; this subset of genes overlaps well for FACT- or Spt6-depleted strains but is different from the atypical genes in Pol II speed mutant strains. In contrast, depletion of histone H3 or H4 causes a downstream shift of poly(A) sites for most genes, indicating that nucleosomes inhibit the Pol II elongation rate in vivo. Thus, chromatin-based control of the Pol II elongation rate is a potential mechanism, distinct from direct effects on the cleavage/polyadenylation machinery, to regulate alternative polyadenylation in response to genetic or environmental changes.
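The upstream and downstream shifts described in this abstract can be made concrete with a small summary statistic. The sketch below is only an illustration, not the authors' analysis: it assumes a poly(A) profile is a mapping from site position (nucleotides downstream of a reference point such as the stop codon) to read count, and it summarizes a shift as the change in read-weighted mean position. The function names and the toy counts are hypothetical.

```python
def weighted_mean_position(profile):
    """Read-weighted mean poly(A) position for one gene.

    `profile` maps site position (nt) to read count (assumed input layout).
    """
    total = sum(profile.values())
    if total == 0:
        raise ValueError("profile has no reads")
    return sum(pos * n for pos, n in profile.items()) / total


def polya_shift(reference, perturbed):
    """Negative values mean an upstream shift, positive values a downstream shift."""
    return weighted_mean_position(perturbed) - weighted_mean_position(reference)


# Toy counts (hypothetical): the perturbed strain favors the promoter-proximal site.
wild_type = {50: 10, 100: 30, 150: 60}
depleted = {50: 60, 100: 30, 150: 10}
shift = polya_shift(wild_type, depleted)
print(shift)
```

A single number per gene loses the shape of the profile, so a summary like this is best read alongside the full site distribution.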
Affiliation(s)
- Joseph V. Geisberg
  - Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115
- Zarmik Moqtaderi
  - Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115
- Kevin Struhl
  - Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115
2
Struhl K. How is polyadenylation restricted to 3'-untranslated regions? Yeast 2024; 41:186-191. [PMID: 38041485 PMCID: PMC11001523 DOI: 10.1002/yea.3915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 10/30/2023] [Accepted: 11/21/2023] [Indexed: 12/03/2023] Open
Abstract
Polyadenylation occurs at numerous sites within 3'-untranslated regions (3'-UTRs) but rarely within coding regions. How does Pol II travel through long coding regions without generating poly(A) sites, yet permit promiscuous polyadenylation once it reaches the 3'-UTR? The cleavage/polyadenylation (CpA) machinery preferentially associates with 3'-UTRs, but it is unknown how its recruitment is restricted to 3'-UTRs during Pol II elongation. Unlike coding regions, 3'-UTRs have long AT-rich stretches of DNA that may be important for restricting polyadenylation to 3'-UTRs. Recognition of the 3'-UTR could occur at the DNA (AT-rich), RNA (AU-rich), or RNA:DNA hybrid (rU:dA- and/or rA:dT-rich) level. Based on the nucleic acid critical for 3'-UTR recognition, there are three classes of models, not mutually exclusive, for how the CpA machinery is selectively recruited to 3'-UTRs, thereby restricting where polyadenylation occurs: (1) RNA-based models suggest that the CpA complex directly (or indirectly through one or more intermediary proteins) binds long AU-rich stretches that are exposed after Pol II passes through these regions. (2) DNA-based models suggest that the AT-rich sequence affects nucleosome depletion or the elongating Pol II machinery, resulting in dissociation of some elongation factors and subsequent recruitment of the CpA machinery. (3) RNA:DNA hybrid models suggest that preferential destabilization of the Pol II elongation complex at rU:dA- and/or rA:dT-rich duplexes bridging the nucleotide addition and RNA exit sites permits preferential association of the CpA machinery with 3'-UTRs. Experiments to provide evidence for one or more of these models are suggested.
Affiliation(s)
- Kevin Struhl
  - Dept. Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115
3
Zhu Y, Vvedenskaya IO, Sze SH, Nickels BE, Kaplan CD. Quantitative analysis of transcription start site selection reveals control by DNA sequence, RNA polymerase II activity and NTP levels. Nat Struct Mol Biol 2024; 31:190-202. [PMID: 38177677 PMCID: PMC10928753 DOI: 10.1038/s41594-023-01171-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 11/03/2023] [Indexed: 01/06/2024]
Abstract
Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.
Affiliation(s)
- Yunye Zhu
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Irina O Vvedenskaya
  - Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Sing-Hoi Sze
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, USA
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
- Bryce E Nickels
  - Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
4
Back G, Walther D. Predictions of DNA mechanical properties at a genomic scale reveal potentially new functional roles of DNA flexibility. NAR Genom Bioinform 2023; 5:lqad097. [PMID: 37954573 PMCID: PMC10632188 DOI: 10.1093/nargab/lqad097] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/15/2023] [Revised: 09/28/2023] [Accepted: 10/25/2023] [Indexed: 11/14/2023] Open
Abstract
Mechanical properties of DNA have been implied to influence many of its biological functions. Recently, a new high-throughput method, called loop-seq, which allows measuring the intrinsic bendability of DNA fragments, has been developed. Using loop-seq data, we created a deep learning model to explore the biological significance of local DNA flexibility in a range of different species from different kingdoms. Consistently, we observed a characteristic and largely dinucleotide-composition-driven change of local flexibility near transcription start sites. In the presence of a TATA-box, a pronounced peak of high flexibility can be observed. Furthermore, depending on the transcription factor investigated, flanking-sequence-dependent DNA flexibility was identified as a potential factor influencing DNA binding. Compared to randomized genomic sequences, depending on species and taxa, actual genomic sequences were observed both with increased and lowered flexibility. Furthermore, in Arabidopsis thaliana, mutation rates, both de novo and fixed, were found to be associated with relatively rigid sequence regions. Our study presents a range of significant correlations between characteristic DNA mechanical properties and genomic features, the significance of which with regard to detailed molecular relevance awaits further theoretical and experimental exploration.
Affiliation(s)
- Georg Back
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
- Dirk Walther
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
5
Yadav M, Zuiddam M, Schiessel H. The role of transcript regions and amino acid choice in nucleosome positioning. NAR Genom Bioinform 2023; 5:lqad080. [PMID: 37705829 PMCID: PMC10495542 DOI: 10.1093/nargab/lqad080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/19/2023] [Accepted: 08/30/2023] [Indexed: 09/15/2023] Open
Abstract
Eukaryotic DNA is organized and compacted in a string of nucleosomes, DNA-wrapped protein cylinders. The positions of nucleosomes along DNA are not random but show well-known base pair sequence preferences that result from the sequence-dependent elastic and geometric properties of the DNA double helix. Here, we focus on DNA around transcription start sites, which are known to typically attract nucleosomes in multicellular life forms through their high GC content. We aim to understand how these GC signals, as observed in genome-wide averages, are produced and encoded through different genomic regions (mainly 5' UTRs, coding exons, and introns). Our study uses a bioinformatics approach to decompose the genome-wide GC signal into between-region and within-region signals. We find large differences in GC signal contributions between vertebrates and plants and, remarkably, even between closely related species. Introns contribute most to the GC signal in vertebrates, while in plants the exons dominate. Further, we find signal strengths stronger on DNA than on mRNA, suggesting a biological function of GC signals along the DNA itself, as is the case for nucleosome positioning. Finally, we make the surprising discovery that both the choice of synonymous codons and amino acids contribute to the nucleosome positioning signal.
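A genome-wide average GC signal of the kind mentioned in this abstract can be pictured as a per-position GC fraction over many sequences aligned at the transcription start site. The following is a minimal sketch under that assumption; it does not reproduce the authors' between-region and within-region decomposition, and the function name and input layout (equal-length, pre-aligned strings) are hypothetical.

```python
def positional_gc(sequences):
    """Fraction of sequences carrying G or C at each aligned position.

    Assumes all sequences have the same length and are already aligned
    (for example, on the transcription start site).
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    profile = []
    for i in range(length):
        gc_count = sum(1 for s in sequences if s[i].upper() in ("G", "C"))
        profile.append(gc_count / len(sequences))
    return profile


# Toy alignment (hypothetical): position 0 is always G/C, position 1 never is.
print(positional_gc(["GA", "CA", "GT", "CT"]))
```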
Affiliation(s)
- Manish Yadav
  - Cluster of Excellence Physics of Life, TU Dresden, 01062 Dresden, Germany
- Martijn Zuiddam
  - Institute Lorentz for Theoretical Physics, Leiden University, Leiden, the Netherlands
- Helmut Schiessel
  - Cluster of Excellence Physics of Life, TU Dresden, 01062 Dresden, Germany
  - Institut für Theoretische Physik, Technische Universität Dresden, 01062 Dresden, Germany
6
Han GS, Li Q, Li Y. Nucleosome positioning based on DNA sequence embedding and deep learning. BMC Genomics 2022; 23:301. [PMID: 35418074 PMCID: PMC9006412 DOI: 10.1186/s12864-022-08508-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Accepted: 03/28/2022] [Indexed: 11/25/2022] Open
Abstract
Background: Nucleosome positioning is the precise determination of the location of nucleosomes on a DNA sequence. With the continuous advancement of biotechnology and computer technology, biological data are growing explosively, so developing an efficient nucleosome positioning algorithm is of practical significance. Convolutional neural networks (CNNs) can capture local features in DNA sequences but ignore the order of bases, whereas a bidirectional recurrent neural network can make up for this shortcoming and extract long-range dependency features of the DNA sequence. Results: In this work, we use word vectors to represent DNA sequences and propose three new deep learning models for nucleosome positioning; the integrative model NP_CBiR achieves better prediction performance. The overall accuracies of NP_CBiR on H. sapiens, C. elegans, and D. melanogaster datasets are 86.18%, 89.39%, and 85.55%, respectively. Conclusions: Benefiting from its different network structures, NP_CBiR can effectively extract local features and base-order features of DNA sequences and can thus be considered a complementary tool for nucleosome positioning.
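Representing DNA with word vectors first requires splitting a sequence into "words". The abstract does not say how this is done; a common choice, assumed here, is to use overlapping k-mers and map each distinct k-mer to an integer index that an embedding layer could later consume. The function names and the choice of k = 3 are illustrative only.

```python
def kmer_tokens(seq, k=3):
    """Split a DNA sequence into overlapping k-mers (stride 1)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


tokens = kmer_tokens("ACGTACG", k=3)
vocab = build_vocab(tokens)
token_ids = [vocab[t] for t in tokens]
print(tokens)
print(token_ids)
```

The integer ids are only a lookup key; the actual word vectors would be learned or pretrained separately.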
Affiliation(s)
- Guo-Sheng Han
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, Hunan, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
- Qi Li
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, Hunan, China
  - Xiangtan Medicine Health Vocational College, Xiangtan, 411102, Hunan, China
- Ying Li
  - Department of Mathematics and Computational Science, Xiangtan University, Xiangtan, 411105, Hunan, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
7
Gnan S, Matelot M, Weiman M, Arnaiz O, Guérin F, Sperling L, Bétermier M, Thermes C, Chen CL, Duharcourt S. GC content, but not nucleosome positioning, directly contributes to intron splicing efficiency in Paramecium. Genome Res 2022; 32:699-709. [PMID: 35264448 PMCID: PMC8997360 DOI: 10.1101/gr.276125.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/20/2021] [Accepted: 02/14/2022] [Indexed: 11/24/2022]
Abstract
Eukaryotic genes are interrupted by introns that must be accurately spliced from mRNA precursors. With an average length of 25 nt, the more than 90,000 introns of Paramecium tetraurelia stand among the shortest introns reported in eukaryotes. The mechanisms specifying the correct recognition of these tiny introns remain poorly understood. Splicing can occur cotranscriptionally, and it has been proposed that chromatin structure might influence splice site recognition. To investigate the roles of nucleosome positioning in intron recognition, we determined the nucleosome occupancy along the P. tetraurelia genome. We show that P. tetraurelia displays a regular nucleosome array with a nucleosome repeat length of ∼151 bp, among the smallest periodicities reported. Our analysis has revealed that introns are frequently associated with inter-nucleosomal DNA, pointing to an evolutionary constraint favoring introns at the AT-rich nucleosome edge sequences. Using accurate splicing efficiency data from cells depleted for nonsense-mediated decay effectors, we show that introns located at the edge of nucleosomes display higher splicing efficiency than those at the center. However, multiple regression analysis indicates that the low GC content of introns, rather than nucleosome positioning, is associated with high splicing efficiency. Our data reveal a complex link between GC content, nucleosome positioning, and intron evolution in Paramecium.
Affiliation(s)
- Stefano Gnan
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, Paris, 75005 France
- Mélody Matelot
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
- Marion Weiman
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Arnaiz
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Frédéric Guérin
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
- Linda Sperling
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Mireille Bétermier
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Claude Thermes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Chun-Long Chen
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, Paris, 75005 France
- Sandra Duharcourt
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
8
Chaudhary S, Jabre I, Syed NH. Epigenetic differences in an identical genetic background modulate alternative splicing in A. thaliana. Genomics 2021; 113:3476-3486. [PMID: 34391867 DOI: 10.1016/j.ygeno.2021.08.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/21/2021] [Revised: 08/02/2021] [Accepted: 08/10/2021] [Indexed: 11/19/2022]
Abstract
How stable and temperature-dependent variations in DNA methylation and nucleosome occupancy influence alternative splicing (AS) remains poorly understood in plants. To answer this, we generated transcriptome, whole-genome bisulfite, and MNase sequencing data for an epigenetic Recombinant Inbred Line (epiRIL) of A. thaliana at normal and cold temperature. For comparative analysis, the same data sets for the parental ecotype Columbia (Col-0) were also generated, whereas for DNA methylation, previously published high confidence methylation profiles of Col-0 were used. Significant epigenetic differences in an identical genetic background were observed between Col-0 and epiRIL lines under normal and cold temperatures. Our transcriptome data revealed that differential DNA methylation and nucleosome occupancy modulate expression levels of many genes and AS in response to cold. Collectively, DNA methylation and nucleosome levels exhibit characteristic patterns around intron-exon boundaries at normal and cold conditions, and any perturbation in them, in an identical genetic background is sufficient to modulate AS in Arabidopsis.
Affiliation(s)
- Saurabh Chaudhary
  - School of Psychology and Life Sciences, Canterbury Christ Church University, Canterbury CT1 1QU, UK
  - Cardiff School of Biosciences, Cardiff University, Cardiff CF10 3AX, UK
- Ibtissam Jabre
  - School of Psychology and Life Sciences, Canterbury Christ Church University, Canterbury CT1 1QU, UK
  - Department of Microbial Sciences, School of Biosciences and Medicine, Faculty of Health and Medical Sciences, University of Surrey, Guildford GU2 7XH, UK
- Naeem H Syed
  - School of Psychology and Life Sciences, Canterbury Christ Church University, Canterbury CT1 1QU, UK
9
Fan K, Moore JE, Zhang XO, Weng Z. Genetic and epigenetic features of promoters with ubiquitous chromatin accessibility support ubiquitous transcription of cell-essential genes. Nucleic Acids Res 2021; 49:5705-5725. [PMID: 33978759 PMCID: PMC8191798 DOI: 10.1093/nar/gkab345] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/02/2021] [Revised: 03/19/2021] [Accepted: 05/01/2021] [Indexed: 12/04/2022] Open
Abstract
Gene expression is controlled by regulatory elements within accessible chromatin. Although most regulatory elements are cell type-specific, a subset is accessible in nearly all the 517 human and 94 mouse cell and tissue types assayed by the ENCODE consortium. We systematically analyzed 9000 human and 8000 mouse ubiquitously-accessible candidate cis-regulatory elements (cCREs) with promoter-like signatures (PLSs) from ENCODE, which we denote ubi-PLSs. These are more CpG-rich than non-ubi-PLSs and correspond to genes with ubiquitously high transcription, including a majority of cell-essential genes. ubi-PLSs are enriched with motifs of ubiquitously-expressed transcription factors and preferentially bound by transcriptional cofactors regulating ubiquitously-expressed genes. They are highly conserved between human and mouse at the synteny level but exhibit frequent turnover of motif sites; accordingly, ubi-PLSs show increased variation at their centers compared with flanking regions among the ∼186 thousand human genomes sequenced by the TOPMed project. Finally, ubi-PLSs are enriched in genes implicated in Mendelian diseases, especially diseases broadly impacting most cell types, such as deficiencies in mitochondrial functions. Thus, a set of roughly 9000 mammalian promoters are actively maintained in an accessible state across cell types by a distinct set of transcription factors and cofactors to ensure the transcriptional programs of cell-essential genes.
Affiliation(s)
- Kaili Fan
  - Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
- Jill E Moore
  - Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
- Xiao-ou Zhang
  - Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
10
Liu G, Zhao H, Meng H, Xing Y, Cai L. A deformation energy model reveals sequence-dependent property of nucleosome positioning. Chromosoma 2021; 130:27-40. [PMID: 33452566 PMCID: PMC7889546 DOI: 10.1007/s00412-020-00750-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/04/2020] [Revised: 12/24/2020] [Accepted: 12/29/2020] [Indexed: 11/18/2022]
Abstract
We present a deformation energy model for predicting nucleosome positioning, in which a position-dependent structural parameter set derived from crystal structures of nucleosomes was used to calculate the DNA deformation energy. The model is successful in predicting nucleosome occupancy genome-wide in budding yeast, nucleosome free energy, and rotational positioning of nucleosomes. Our model also indicates that the genomic regions underlying the MNase-sensitive nucleosomes in budding yeast have high deformation energy and, consequently, low nucleosome-forming ability, while the MNase-sensitive non-histone particles are characterized by much lower DNA deformation energy and high nucleosome preference. In addition, we also revealed that remodelers, SNF2 and RSC8, are likely to act in chromatin remodeling by binding to broad nucleosome-depleted regions that are intrinsically favorable for nucleosome positioning. Our data support the important role of position-dependent physical properties of DNA in nucleosome positioning.
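To give a feel for what a DNA deformation energy is, the sketch below uses the generic harmonic (quadratic) form often used for elastic models: each structural parameter contributes half its stiffness times the squared deviation from an equilibrium value. This is an assumption made for illustration; the abstract does not give the functional form or parameter set the authors used, and all names and numbers here are hypothetical.

```python
def deformation_energy(observed, equilibrium, stiffness):
    """Harmonic deformation energy: 0.5 * sum(k_i * (x_i - x0_i)**2).

    `observed`    : structural parameter values imposed on the DNA (e.g. by the nucleosome)
    `equilibrium` : relaxed values of the same parameters for this sequence
    `stiffness`   : force constants, one per parameter
    All three are equal-length sequences of floats (assumed layout).
    """
    if not (len(observed) == len(equilibrium) == len(stiffness)):
        raise ValueError("inputs must have the same length")
    return 0.5 * sum(
        k * (x - x0) ** 2 for x, x0, k in zip(observed, equilibrium, stiffness)
    )


# Toy values (hypothetical): two parameters, deviations of 1.0 and 2.0.
energy = deformation_energy([1.0, 2.0], [0.0, 0.0], [2.0, 1.0])
print(energy)
```

Under this form, a sequence whose relaxed geometry is far from the nucleosome-imposed geometry has a high energy and, in the abstract's terms, low nucleosome-forming ability.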
Affiliation(s)
- Guoqing Liu
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
  - Inner Mongolia Key Lab of Functional Genome Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Hongyu Zhao
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
  - Inner Mongolia Key Lab of Functional Genome Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Hu Meng
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
  - Inner Mongolia Key Lab of Functional Genome Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Yongqiang Xing
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
  - Inner Mongolia Key Lab of Functional Genome Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Lu Cai
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
  - Inner Mongolia Key Lab of Functional Genome Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, 014010, China
11
Markus BM, Waldman BS, Lorenzi HA, Lourido S. High-Resolution Mapping of Transcription Initiation in the Asexual Stages of Toxoplasma gondii. Front Cell Infect Microbiol 2021; 10:617998. [PMID: 33553008 PMCID: PMC7854901 DOI: 10.3389/fcimb.2020.617998] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/15/2020] [Accepted: 12/03/2020] [Indexed: 12/13/2022] Open
Abstract
Toxoplasma gondii is a common parasite of humans and animals, causing life-threatening disease in the immunocompromised, fetal abnormalities when contracted during gestation, and recurrent ocular lesions in some patients. Central to the prevalence and pathogenicity of this protozoan is its ability to adapt to a broad range of environments, and to differentiate between acute and chronic stages. These processes are underpinned by a major rewiring of gene expression, yet the mechanisms that regulate transcription in this parasite are only partially characterized. Deciphering these mechanisms requires a precise and comprehensive map of transcription start sites (TSSs); however, Toxoplasma TSSs have remained incompletely defined. To address this challenge, we used 5'-end RNA sequencing to genomically assess transcription initiation in both acute and chronic stages of Toxoplasma. Here, we report an in-depth analysis of transcription initiation at promoters, and provide empirically-defined TSSs for 7603 (91%) protein-coding genes, of which only 1840 concur with existing gene models. Comparing data from acute and chronic stages, we identified instances of stage-specific alternative TSSs that putatively generate mRNA isoforms with distinct 5' termini. Analysis of the nucleotide content and nucleosome occupancy around TSSs allowed us to examine the determinants of TSS choice, and outline features of Toxoplasma promoter architecture. We also found pervasive divergent transcription at Toxoplasma promoters, clustered within the nucleosomes of highly-symmetrical phased arrays, underscoring chromatin contributions to transcription initiation. Corroborating previous observations, we asserted that Toxoplasma 5' leaders are among the longest of any eukaryote studied thus far, displaying a median length of approximately 800 nucleotides.
Further highlighting the utility of a precise TSS map, we pinpointed motifs associated with transcription initiation, including the binding sites of the master regulator of chronic-stage differentiation, BFD1, and a novel motif with a similar positional arrangement present at 44% of Toxoplasma promoters. This work provides a critical resource for functional genomics in Toxoplasma, and lays down a foundation to study the interactions between genomic sequences and the regulatory factors that control transcription in this parasite.
Affiliation(s)
- Benedikt M. Markus
  - Whitehead Institute for Biomedical Research, Cambridge, MA, United States
  - Faculty of Biology, University of Freiburg, Freiburg, Germany
- Benjamin S. Waldman
  - Whitehead Institute for Biomedical Research, Cambridge, MA, United States
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States
- Sebastian Lourido
  - Whitehead Institute for Biomedical Research, Cambridge, MA, United States
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States
12
Ash1 and Tup1 dependent repression of the Saccharomyces cerevisiae HO promoter requires activator-dependent nucleosome eviction. PLoS Genet 2020; 16:e1009133. [PMID: 33382702 PMCID: PMC7806131 DOI: 10.1371/journal.pgen.1009133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/23/2020] [Revised: 01/13/2021] [Accepted: 11/25/2020] [Indexed: 11/30/2022] Open
Abstract
Transcriptional regulation of the Saccharomyces cerevisiae HO gene is highly complex, requiring a balance of multiple activating and repressing factors to ensure that only a few transcripts are produced in mother cells within a narrow window of the cell cycle. Here, we show that the Ash1 repressor associates with two DNA sequences that are usually concealed within nucleosomes in the HO promoter and recruits the Tup1 corepressor and the Rpd3 histone deacetylase, both of which are required for full repression in daughters. Genome-wide ChIP identified greater than 200 additional sites of co-localization of these factors, primarily within large, intergenic regions from which they could regulate adjacent genes. Most Ash1 binding sites are in nucleosome depleted regions (NDRs), while a small number overlap nucleosomes, similar to HO. We demonstrate that Ash1 binding to the HO promoter does not occur in the absence of the Swi5 transcription factor, which recruits coactivators that evict nucleosomes, including the nucleosomes obscuring the Ash1 binding sites. In the absence of Swi5, artificial nucleosome depletion allowed Ash1 to bind, demonstrating that nucleosomes are inhibitory to Ash1 binding. The location of binding sites within nucleosomes may therefore be a mechanism for limiting repressive activity to periods of nucleosome eviction that are otherwise associated with activation of the promoter. Our results illustrate that activation and repression can be intricately connected, and events set in motion by an activator may also ensure the appropriate level of repression and reset the promoter for the next activation cycle. Nucleosomes inhibit both gene expression and DNA-binding by regulatory factors. Here we examine the role of nucleosomes in regulating the binding of repressive transcription factors to the complex promoter for the yeast HO gene. 
Ash1 is a sequence-specific DNA-binding protein, and we show that it recruits the Tup1 global repressive factor to the HO promoter. Using a method to determine where Ash1 and Tup1 are bound to DNA throughout the genome, we discovered that Tup1 is also present at most places where Ash1 binds. The majority of these sites are in “Nucleosome Depleted Regions,” or NDRs, where the absence of chromatin makes factor binding easier. We discovered that the HO promoter is an exception, in that the two places where Ash1 binds overlap nucleosomes. Activation of the HO promoter is a complex, multi-step process, and we demonstrated that chromatin factors transiently evict these nucleosomes from the HO promoter during the cell cycle, allowing Ash1 to bind and recruit Tup1. Thus, activators must evict nucleosomes from the promoter to allow the repressive machinery to bind.
|
13
|
Serizay J, Dong Y, Jänes J, Chesney M, Cerrato C, Ahringer J. Distinctive regulatory architectures of germline-active and somatic genes in C. elegans. Genome Res 2020; 30:1752-1765. [PMID: 33093068 PMCID: PMC7706728 DOI: 10.1101/gr.265934.120] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 05/12/2020] [Accepted: 10/08/2020] [Indexed: 01/08/2023]
Abstract
RNA profiling has provided increasingly detailed knowledge of gene expression patterns, yet the different regulatory architectures that drive them are not well understood. To address this, we profiled and compared transcriptional and regulatory element activities across five tissues of Caenorhabditis elegans, covering ∼90% of cells. We find that the majority of promoters and enhancers have tissue-specific accessibility, and we discover regulatory grammars associated with ubiquitous, germline, and somatic tissue–specific gene expression patterns. In addition, we find that germline-active and soma-specific promoters have distinct features. Germline-active promoters have well-positioned +1 and −1 nucleosomes associated with a periodic 10-bp WW signal (W = A/T). Somatic tissue–specific promoters lack positioned nucleosomes and this signal, have wide nucleosome-depleted regions, and are more enriched for core promoter elements, which largely differ between tissues. We observe the 10-bp periodic WW signal at ubiquitous promoters in other animals, suggesting it is an ancient conserved signal. Our results show fundamental differences in regulatory architectures of germline and somatic tissue–specific genes, uncover regulatory rules for generating diverse gene expression patterns, and provide a tissue-specific resource for future studies.
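The 10-bp periodic WW signal described above can be illustrated with a small distance-count sketch. This is only an illustrative approach, not the authors' analysis pipeline; the function names and the toy sequence are assumptions made for the example.

```python
from collections import Counter

WEAK = set("AT")  # W = A or T, as defined in the abstract


def ww_positions(seq):
    """Return start indices of every WW dinucleotide (AA, AT, TA, TT)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1)
            if seq[i] in WEAK and seq[i + 1] in WEAK]


def ww_distance_counts(seq, max_dist=30):
    """Count pairs of WW dinucleotides separated by each distance up to max_dist."""
    pos = ww_positions(seq)
    counts = Counter()
    for idx, a in enumerate(pos):
        for b in pos[idx + 1:]:
            d = b - a
            if d > max_dist:
                break  # positions are sorted, so later ones are even farther away
            counts[d] += 1
    return counts


# Hypothetical toy sequence: one AA step every 10 bp in a G-only background.
toy = ("AA" + "G" * 8) * 5
counts = ww_distance_counts(toy)
print(counts[10], counts[5])
```

On real promoter sequences, an enrichment of counts at distances near 10, 20, and 30 bp relative to neighboring distances would be the kind of periodicity the abstract refers to.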
Affiliation(s)
- Jacques Serizay
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge CB2 1QN, United Kingdom
| | - Yan Dong
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge CB2 1QN, United Kingdom
| | - Jürgen Jänes
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge CB2 1QN, United Kingdom
| | - Michael Chesney
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge CB2 1QN, United Kingdom
| | - Chiara Cerrato
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge CB2 1QN, United Kingdom
| | - Julie Ahringer
- The Gurdon Institute and Department of Genetics, University of Cambridge, Cambridge CB2 1QN, United Kingdom
|
14
|
Zhu Y, Ong CS, Huttley GA. Machine Learning Techniques for Classifying the Mutagenic Origins of Point Mutations. Genetics 2020; 215:25-40. [PMID: 32193188 PMCID: PMC7198283 DOI: 10.1534/genetics.120.303093] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/31/2019] [Accepted: 03/05/2020] [Indexed: 11/18/2022] Open
Abstract
There is increasing interest in developing diagnostics that discriminate individual mutagenic mechanisms in a range of applications that include identifying population-specific mutagenesis and resolving distinct mutation signatures in cancer samples. Analyses for these applications assume that mutagenic mechanisms have a distinct relationship with neighboring bases that allows them to be distinguished. Direct support for this assumption is limited to a small number of simple cases, e.g., CpG hypermutability. We have evaluated whether the mechanistic origin of a point mutation can be resolved using only sequence context for a more complicated case. We contrasted single nucleotide variants originating from the multitude of mutagenic processes that normally operate in the mouse germline with those induced by the potent mutagen N-ethyl-N-nitrosourea (ENU). The considerable overlap in the mutation spectra of these two samples makes this a challenging problem. Employing a new, robust log-linear modeling method, we demonstrate that neighboring bases contain information regarding point mutation direction that differs between the ENU-induced and spontaneous mutation variant classes. A logistic regression classifier exhibited strong performance at discriminating between the different mutation classes. Concordance between the feature set of the best classifier and information content analyses suggests our results can be generalized to other mutation classification problems. We conclude that machine learning can be used to build a practical classification tool to identify the mutation mechanism for individual genetic variants. Software implementing our approach is freely available under an open-source license.
Affiliation(s)
- Yicheng Zhu
- Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
| | - Cheng Soon Ong
- Data61, CSIRO, Black Mountain Campus, Canberra, Australian Capital Territory 2601, Australia
- Research School of Computer Science, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
| | - Gavin A Huttley
- Research School of Biology, The Australian National University, Canberra, Australian Capital Territory 2601, Australia
|
15
|
Hocher A, Rojec M, Swadling JB, Esin A, Warnecke T. The DNA-binding protein HTa from Thermoplasma acidophilum is an archaeal histone analog. eLife 2019; 8:52542. [PMID: 31710291 PMCID: PMC6877293 DOI: 10.7554/elife.52542] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/07/2019] [Accepted: 11/10/2019] [Indexed: 02/06/2023] Open
Abstract
Histones are a principal constituent of chromatin in eukaryotes and fundamental to our understanding of eukaryotic gene regulation. In archaea, histones are widespread but not universal: several lineages have lost histone genes. What prompted or facilitated these losses and how archaea without histones organize their chromatin remains largely unknown. Here, we elucidate primary chromatin architecture in an archaeon without histones, Thermoplasma acidophilum, which harbors a HU family protein (HTa) that protects part of the genome from micrococcal nuclease digestion. Charting HTa-based chromatin architecture in vitro, in vivo and in an HTa-expressing E. coli strain, we present evidence that HTa is an archaeal histone analog. HTa preferentially binds to GC-rich sequences, exhibits invariant positioning throughout the growth cycle, and shows archaeal histone-like oligomerization behavior. Our results suggest that HTa, a DNA-binding protein of bacterial origin, has converged onto an architectural role filled by histones in other archaea.
Affiliation(s)
- Antoine Hocher
- MRC London Institute of Medical Sciences (LMS), London, United Kingdom.,Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
| | - Maria Rojec
- MRC London Institute of Medical Sciences (LMS), London, United Kingdom.,Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
| | - Jacob B Swadling
- MRC London Institute of Medical Sciences (LMS), London, United Kingdom.,Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
| | - Alexander Esin
- MRC London Institute of Medical Sciences (LMS), London, United Kingdom.,Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
| | - Tobias Warnecke
- MRC London Institute of Medical Sciences (LMS), London, United Kingdom.,Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
|
16
|
Jabbari K, Chakraborty M, Wiehe T. DNA sequence-dependent chromatin architecture and nuclear hubs formation. Sci Rep 2019; 9:14646. [PMID: 31601866 PMCID: PMC6787200 DOI: 10.1038/s41598-019-51036-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 06/20/2019] [Accepted: 09/18/2019] [Indexed: 02/08/2023] Open
Abstract
In this study, by exploring chromatin conformation capture data, we show that DNA sequence composition contributes to the nuclear segregation of Topologically Associated Domains (TADs). GC-peaks and valleys of TADs strongly influence interchromosomal interactions and chromatin 3D structure. To gain insight into the compositional and functional constraints associated with chromatin interactions and TAD formation, we analysed intra-TAD and intra-loop GC variations. This led to the identification of clear GC-gradients along which the density of genes, super-enhancers, transcriptional activity, and CTCF binding-site occupancy co-vary non-randomly. Further, the analysis of DNA base composition of nucleolar aggregates and nuclear speckles showed strong sequence-dependent effects. We conjecture that dynamic DNA binding affinity and flexibility underlie the emergence of chromatin condensates, whose growth is likely promoted in mechanically soft regions (GC-rich) of the lowest chromatin and nucleosome densities. As a practical perspective, the strong linear association between sequence composition and interchromosomal contacts can help define consensus chromatin interactions, which in turn may be used to study alternative states of chromatin architecture.
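As a rough illustration of what a GC profile and GC-gradient along a domain look like computationally, the sketch below computes GC fractions in non-overlapping windows and the differences between consecutive windows. The window size, function names, and toy sequence are assumptions for the example and are not taken from the paper.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_profile(seq, window):
    """GC fraction in consecutive non-overlapping windows of the given size."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


def gc_gradient(profile):
    """Change in GC fraction between each pair of neighboring windows."""
    return [b - a for a, b in zip(profile, profile[1:])]


# Hypothetical toy "domain" whose GC content falls from one end to the other.
toy = "GGGGGGCCAAGCATAT"
profile = gc_profile(toy, 4)
print(profile)
print(gc_gradient(profile))
```

A consistently negative (or positive) run of differences across a domain would correspond to a monotonic GC-gradient of the kind described above.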
Affiliation(s)
- Kamel Jabbari
- Institute for Genetics, Biocenter Cologne, University of Cologne, Zülpicher Straße 47a, 50674, Köln, Germany.
| | - Maharshi Chakraborty
- Institute for Genetics, Biocenter Cologne, University of Cologne, Zülpicher Straße 47a, 50674, Köln, Germany
| | - Thomas Wiehe
- Institute for Genetics, Biocenter Cologne, University of Cologne, Zülpicher Straße 47a, 50674, Köln, Germany
|
17
|
He Y, Lu J, Ye Z, Hao S, Wang L, Kohli M, Tindall DJ, Li B, Zhu R, Wang L, Huang H. Androgen receptor splice variants bind to constitutively open chromatin and promote abiraterone-resistant growth of prostate cancer. Nucleic Acids Res 2019; 46:1895-1911. [PMID: 29309643 PMCID: PMC5829742 DOI: 10.1093/nar/gkx1306] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Received: 10/09/2017] [Accepted: 12/20/2017] [Indexed: 11/13/2022] Open
Abstract
Androgen receptor (AR) splice variants (ARVs) are implicated in the development of castration-resistant prostate cancer (CRPC). Upregulation of ARVs often correlates with persistent AR activity after androgen deprivation therapy (ADT). However, the genomic and epigenomic characteristics of the ARV-dependent cistrome and the disease relevance of the ARV-mediated transcriptome remain elusive. Through integrated chromatin immunoprecipitation coupled sequencing (ChIP-seq) and RNA sequencing (RNA-seq) analysis, we identified ARV-preferential-binding sites (ARV-PBS) and a set of genes preferentially transactivated by ARVs in CRPC cells. ARVs preferentially bind to enhancers located in nucleosome-depleted regions harboring the full AR-response element (AREfull), while full-length AR (ARFL)-PBS are enhancers residing in closed chromatin regions containing the composite FOXA1-nnnn-AREhalf motif. ARV-PBS exclusively overlapped with AR binding sites in castration-resistant (CR) tumors in patients and ARV-preferentially activated genes were up-regulated in abiraterone-resistant patient specimens. Expression of ARV-PBS target genes, such as the oncogene RAP2A and the cell cycle gene E2F7, was significantly associated with castration resistance, poor survival and tumor progression. We uncover distinct genomic and epigenomic features of ARV-PBS, highlighting that ARVs are useful tools to depict AR-regulated oncogenic genome and epigenome landscapes in prostate cancer. Our data also suggest that the ARV-preferentially activated transcriptional program could be targeted for effective treatment of CRPC.
Affiliation(s)
- Yundong He
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Ji Lu
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Zhenqing Ye
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Siyuan Hao
- Department of Urology, University of Kansas Medical Center, Kansas City, KS 66160, USA
| | - Liewei Wang
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Manish Kohli
- Department of Oncology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Donald J Tindall
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.,Department of Urology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Benyi Li
- Department of Urology, University of Kansas Medical Center, Kansas City, KS 66160, USA
| | - Runzhi Zhu
- Department of Urology, University of Kansas Medical Center, Kansas City, KS 66160, USA.,Center for Cell Therapy, The Affiliated Hospital of Jiangsu University, Zhenjiang, Jiangsu 212001, China
| | - Liguo Wang
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
| | - Haojie Huang
- Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.,Department of Urology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.,Mayo Clinic Cancer Center, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
|
18
|
Datta S, Patel M, Patel D, Singh U. Distinct DNA Sequence Preference for Histone Occupancy in Primary and Transformed Cells. Cancer Inform 2019; 18:1176935119843835. [PMID: 31037026 PMCID: PMC6475841 DOI: 10.1177/1176935119843835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2019] [Accepted: 03/24/2019] [Indexed: 11/15/2022] Open
Abstract
Genome-wide occupancy of several histone modifications in various cell types has been studied using chromatin immunoprecipitation (ChIP) sequencing. Histone occupancy depends on DNA sequence features like inter-strand symmetry of base composition and periodic occurrence of TT/AT. However, whether DNA sequence motifs act as an additional effector of histone occupancy is not known. We have analyzed the presence of DNA sequence motifs in publicly available ChIP-sequence datasets for different histone modifications. Our results show that DNA sequence motifs are associated with histone occupancy, some of which are different between primary and transformed cells. The motifs for primary and transformed cells showed different levels of GC-richness and proximity to transcription start sites (TSSs). The TSSs associated with transformed or primary cell-specific motifs showed different levels of TSS flank transcription in primary and transformed cells. Interestingly, TSSs with a motif-linked occupancy of H2AFZ, a component of positioned nucleosomes, showed a distinct pattern of RNA Polymerase II (POLR2A) occupancy and TSS flank transcription in primary and transformed cells. These results indicate that DNA sequence features dictate differential histone occupancy in primary and transformed cells, and the DNA sequence motifs affect transcription through regulation of histone occupancy.
Affiliation(s)
- Divyesh Patel
- HoMeCell Lab, Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar, India
| | - Umashankar Singh
- HoMeCell Lab, Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Gandhinagar, India
|
19
|
The Role of Nucleosomes in Epigenetic Gene Regulation. Clin Epigenetics 2019. [DOI: 10.1007/978-981-13-8958-0_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open
|
20
|
Malkowska M, Zubek J, Plewczynski D, Wyrwicz LS. ShapeGTB: the role of local DNA shape in prioritization of functional variants in human promoters with machine learning. PeerJ 2018; 6:e5742. [PMID: 30519505 PMCID: PMC6275119 DOI: 10.7717/peerj.5742] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/10/2017] [Accepted: 09/13/2018] [Indexed: 02/01/2023] Open
Abstract
Motivation: The identification of functional sequence variations in regulatory DNA regions is one of the major challenges of modern genetics. Here, we report results of a combined multifactor analysis of properties characterizing functional sequence variants located in promoter regions of genes. Results: We demonstrate that GC-content of the local sequence fragments and local DNA shape features play a significant role in prioritization of functional variants and outscore features related to histone modifications, transcription factor binding sites, or evolutionary conservation descriptors. Those observations allowed us to build a specialized machine learning classifier, ShapeGTB, that identifies functional single nucleotide polymorphisms within promoter regions. We compared our method with more general tools predicting pathogenicity of all non-coding variants. ShapeGTB outperformed them by a wide margin (average precision 0.93 vs. 0.47–0.55). On an external validation set based on the ClinVar database it displayed worse performance but was still competitive with other methods (average precision 0.47 vs. 0.23–0.42). Such results suggest unique characteristics of mutations located within promoter regions and are a promising signal for the development of more accurate variant prioritization tools in the future.
Affiliation(s)
- Maja Malkowska
- Laboratory of Bioinformatics and Biostatistics, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Warsaw, Poland
| | - Julian Zubek
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland
| | - Dariusz Plewczynski
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland.,Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| | - Lucjan S Wyrwicz
- Laboratory of Bioinformatics and Biostatistics, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Warsaw, Poland
|
21
|
|
22
|
Li W, Thanos D, Provata A. Quantifying local randomness in human DNA and RNA sequences using Erdös motifs. J Theor Biol 2018; 461:41-50. [PMID: 30336158 DOI: 10.1016/j.jtbi.2018.09.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2018] [Revised: 08/14/2018] [Accepted: 09/25/2018] [Indexed: 10/28/2022]
Abstract
In 1932, Paul Erdös asked whether a random walk constructed from a binary sequence can achieve the lowest possible deviation (lowest discrepancy), for the sequence itself and for all its subsequences formed by homogeneous arithmetic progressions. Although achieving bounded discrepancy is impossible for infinite sequences, as recently proven by Terence Tao, attempts were made to construct such sequences with finite lengths. We recognize that such constructed sequences (we call these "Erdös sequences") exhibit certain hallmarks of randomness at the local level: they show roughly equal frequencies of short subsequences, and at the same time exclude trivial periodic patterns. For the human DNA we examine the frequency of a set of Erdös motifs of length-10 using three nucleotide-to-binary mappings. The particular length-10 Erdös sequence is derived from the length-11 Mathias sequence and is identical with the first 10 digits of the Thue-Morse sequence, underscoring the fact that both are deficient in periodicities. Our calculations indicate that: (1) the purine(A and G)/pyrimidine(C and T) based Erdös motifs are greatly underrepresented in the human genome, (2) the strong(G and C)/weak(A and T) based Erdös motifs are slightly overrepresented, (3) the densities of the two are negatively correlated, (4) the Erdös motifs based on all three mappings being combined are slightly underrepresented, and (5) the strong/weak based Erdös motifs are greatly overrepresented in the human messenger RNA sequences.
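The motif-counting idea can be sketched in a few lines: generate the first 10 digits of the Thue-Morse sequence, translate DNA to binary under a chosen mapping, and count overlapping occurrences of the motif. The assignment of 1 and 0 within each mapping, the helper names, and the toy sequence are assumptions made for illustration; the paper's exact motif set and third mapping are not reproduced here.

```python
def thue_morse(n):
    """First n digits of the Thue-Morse sequence (parity of the 1-bits of each index)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


# Two of the mappings named in the abstract; which class gets 1 is an arbitrary choice here.
PURINE_PYRIMIDINE = {"A": "1", "G": "1", "C": "0", "T": "0"}
STRONG_WEAK = {"G": "1", "C": "1", "A": "0", "T": "0"}


def to_binary(seq, mapping):
    """Translate DNA to a binary string; unknown bases become 'N' so they break motifs."""
    return "".join(mapping.get(base, "N") for base in seq.upper())


def count_motif(binary, motif):
    """Count overlapping occurrences of motif in the binary string."""
    return sum(binary.startswith(motif, i)
               for i in range(len(binary) - len(motif) + 1))


motif = thue_morse(10)
toy = "CAGTGCTAAC"  # hypothetical sequence chosen to match the motif under the R/Y mapping
print(motif)
print(count_motif(to_binary(toy, PURINE_PYRIMIDINE), motif))
print(count_motif(to_binary(toy, STRONG_WEAK), motif))
```

Comparing such counts against the expectation for a shuffled or random sequence of the same composition would give the over- or under-representation the abstract reports.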
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, USA.
| | - Dimitrios Thanos
- Department of Mathematics, National and Kapodistrian University of Athens, Athens GR-15784, Greece; Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", Athens GR-15341, Greece
| | - Astero Provata
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", Athens GR-15341, Greece
|
23
|
Tahir M, Hayat M, Khan SA. iNuc-ext-PseTNC: an efficient ensemble model for identification of nucleosome positioning by extending the concept of Chou's PseAAC to pseudo-tri-nucleotide composition. Mol Genet Genomics 2018; 294:199-210. [PMID: 30291426 DOI: 10.1007/s00438-018-1498-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 04/30/2018] [Accepted: 09/28/2018] [Indexed: 10/28/2022]
Abstract
The nucleosome is a central element of eukaryotic chromatin and is composed of histone proteins and DNA. It plays vital roles in many eukaryotic intra-nuclear processes, for instance, chromatin structure formation and transcriptional regulation. Identifying nucleosome positions in the wet lab is difficult, so attention has shifted towards accurate automated prediction. In this regard, a novel intelligent automated model, "iNuc-ext-PseTNC", is developed to identify nucleosome positioning in genomes accurately. In this predictor, DNA sequences are mathematically represented by two different discrete feature extraction techniques, namely pseudo-tri-nucleotide composition (PseTNC) and pseudo-di-nucleotide composition. Several contemporary machine learning algorithms were examined. Further, the predictions of individual classifiers were integrated through an evolutionary genetic algorithm. The success rates of the ensemble model are higher than those of the individual classifiers. Analysis of the prediction results shows that the iNuc-ext-PseTNC model achieved its best performance with the PseTNC feature space, with accuracies of 94.3%, 93.14%, and 88.60% using a six-fold cross-validation test on the three benchmark datasets S1, S2, and S3, respectively. These results show that the iNuc-ext-PseTNC model compares favorably with existing methods reported in the literature. The proposed model may be a useful, practical tool for basic academic research.
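To make the feature representation concrete, the sketch below computes the plain trinucleotide composition of a DNA sequence as a 64-dimensional frequency vector. This covers only the composition part; the additional "pseudo" components of PseTNC, which encode physicochemical correlations along the sequence, are deliberately omitted, and the function and variable names are assumptions for the example.

```python
from itertools import product

# All 64 trinucleotides in a fixed (lexicographic) order, so every vector lines up.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def trinucleotide_composition(seq):
    """Return the 64 trinucleotide frequencies of seq (overlapping windows).

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    if total == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[t] / total for t in TRINUCLEOTIDES]


features = trinucleotide_composition("ACGTACGT")
print(len(features), features[TRINUCLEOTIDES.index("ACG")])
```

A vector like this, computed per candidate sequence, is the kind of fixed-length input that standard classifiers can consume.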
Affiliation(s)
- Muhammad Tahir
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP, Pakistan
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP, Pakistan.
| | - Sher Afzal Khan
- Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KP, Pakistan
|
24
|
Zhao H, Zhang F, Guo M, Xing Y, Liu G, Zhao X, Cai L. The affinity of DNA sequences containing R5Y5 motif and TA repeats with 10.5-bp periodicity to histone octamer in vitro. J Biomol Struct Dyn 2018; 37:1935-1943. [PMID: 30044196 DOI: 10.1080/07391102.2018.1477621] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
Abstract
Nucleosome positioning along the genome is partially determined by the intrinsic DNA sequence preferences of histones. The RRRRRYYYYY (R5Y5, R = purine and Y = pyrimidine) motif in nucleosome DNA, proposed by Trifonov et al. on the basis of several theoretical models, might be a sequence pattern that facilitates nucleosome assembly. However, strong experimental evidence that the R5Y5 motif is a key determinant of nucleosome positioning is lacking. In this work, the ability of canonical, H2A.Z- and H3.3-containing octamers to assemble nucleosomes on DNA templates containing the R5Y5 motif and TA repeats with 10.5-bp periodicity was investigated in vitro using the salt-dialysis method. The results showed that 10.5-bp periodic distributions of both the R5Y5 motif and TA repeats along DNA templates can significantly promote canonical nucleosome assembly and may be key sequence factors for canonical nucleosome assembly. Compared with TA repeats with 10.5-bp periodicity, the R5Y5 motif in DNA templates did not elevate the formation efficiency of H2A.Z- and H3.3-containing nucleosomes in vitro. This result indicates that the R5Y5 motif is probably not a pivotal factor in regulating nucleosome assembly with histone variants. It is speculated that the regulatory mechanism of nucleosome assembly differs between canonical and variant histones. These conclusions can provide deeper insight into the mechanism of nucleosome positioning. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Hongyu Zhao
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China.,b Inner Mongolia Key Laboratory of Functional Genome Bioinformatics , Inner Mongolia University of Science and Technology , Baotou , China
| | - Fenghui Zhang
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China
| | - Mingxin Guo
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China
| | - Yongqiang Xing
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China.,b Inner Mongolia Key Laboratory of Functional Genome Bioinformatics , Inner Mongolia University of Science and Technology , Baotou , China
| | - Guoqing Liu
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China.,b Inner Mongolia Key Laboratory of Functional Genome Bioinformatics , Inner Mongolia University of Science and Technology , Baotou , China
| | - Xiujuan Zhao
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China.,b Inner Mongolia Key Laboratory of Functional Genome Bioinformatics , Inner Mongolia University of Science and Technology , Baotou , China
| | - Lu Cai
- a School of Life Science and Technology , Inner Mongolia University of Science and Technology , Baotou , China.,b Inner Mongolia Key Laboratory of Functional Genome Bioinformatics , Inner Mongolia University of Science and Technology , Baotou , China
|
25
|
Liu G, Liu GJ, Tan JX, Lin H. DNA physical properties outperform sequence compositional information in classifying nucleosome-enriched and -depleted regions. Genomics 2018; 111:1167-1175. [PMID: 30055231 DOI: 10.1016/j.ygeno.2018.07.013] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/05/2018] [Revised: 07/07/2018] [Accepted: 07/15/2018] [Indexed: 12/15/2022]
Abstract
The nucleosome is the fundamental structural unit of eukaryotic chromatin and plays an essential role in the epigenetic regulation of cellular processes, such as DNA replication, recombination, and transcription. Hence, it is important to identify nucleosome positions in the genome. Our previous model based on DNA deformation energy, in which a set of DNA physical descriptors was used, performed well in predicting nucleosome dyad positions and occupancy. In this study, we established a machine-learning model for predicting nucleosome occupancy in order to further verify the physical descriptors. Results showed that (1) our model outperformed several other sequence compositional information-based models, indicating a stronger dependence of nucleosome positioning on DNA physical properties; (2) nucleosome-enriched and -depleted regions have distinct features in terms of DNA physical descriptors like sequence-dependent flexibility and equilibrium structure parameters; (3) gene transcription start sites and termination sites can be well characterized with the distribution patterns of the physical descriptors, indicating the regulatory role of DNA physical properties in gene transcription. In addition, we developed a web server for the model, which is freely accessible at http://lin-group.cn/server/iNuc-force/.
Affiliation(s)
- Guoqing Liu
- The School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou 014010, China.
| | - Guo-Jun Liu
- School of Natural Sciences and Mathematics, Ural Federal University, Ekaterinburg 620000, Russia
| | - Jiu-Xin Tan
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Hao Lin
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.
|
26
|
Ma W, Yang L, Rohs R, Noble WS. DNA sequence+shape kernel enables alignment-free modeling of transcription factor binding. Bioinformatics 2018; 33:3003-3010. [PMID: 28541376 PMCID: PMC5870879 DOI: 10.1093/bioinformatics/btx336] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 11/25/2016] [Accepted: 05/23/2017] [Indexed: 01/07/2023] Open
Abstract
Motivation: Transcription factors (TFs) bind to specific DNA sequence motifs. Several lines of evidence suggest that TF-DNA binding is mediated in part by properties of the local DNA shape: the width of the minor groove, the relative orientations of adjacent base pairs, etc. Several methods have been developed to jointly account for DNA sequence and shape properties in predicting TF binding affinity. However, a limitation of these methods is that they typically require a training set of aligned TF binding sites. Results: We describe a sequence + shape kernel that leverages DNA sequence and shape information to better understand protein-DNA binding preference and affinity. This kernel extends an existing class of k-mer based sequence kernels, based on the recently described di-mismatch kernel. Using three in vitro benchmark datasets, derived from universal protein binding microarrays (uPBMs), genomic context PBMs (gcPBMs) and SELEX-seq data, we demonstrate that incorporating DNA shape information improves our ability to predict protein-DNA binding affinity. In particular, we observe that (i) the k-spectrum + shape model performs better than the classical k-spectrum kernel, particularly for small k values; (ii) the di-mismatch kernel performs better than the k-mer kernel, for larger k; and (iii) the di-mismatch + shape kernel performs better than the di-mismatch kernel for intermediate k values. Availability and implementation: The software is available at https://bitbucket.org/wenxiu/sequence-shape.git. Supplementary information: Supplementary data are available at Bioinformatics online.
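For readers unfamiliar with the baseline mentioned above, the classical k-spectrum kernel between two sequences is the inner product of their k-mer count vectors, which needs no alignment. The sketch below implements only that baseline; the di-mismatch and shape extensions described in the abstract are not implemented, and the function names are assumptions for the example.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def k_spectrum_kernel(x, y, k):
    """Classical k-spectrum kernel: dot product of the two k-mer count vectors."""
    counts_x, counts_y = kmer_counts(x, k), kmer_counts(y, k)
    # A Counter returns 0 for k-mers it has not seen, so unshared k-mers contribute nothing.
    return sum(n * counts_y[kmer] for kmer, n in counts_x.items())


print(k_spectrum_kernel("ACGT", "ACGT", 2))
print(k_spectrum_kernel("AAAA", "AAA", 2))
print(k_spectrum_kernel("ACGT", "TTTT", 2))
```

Because the kernel depends only on k-mer counts, two binding sites can be compared without first aligning them, which is the property the abstract highlights.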
Affiliation(s)
- Wenxiu Ma
- Department of Statistics, University of California Riverside, Riverside, CA 92521, USA
- Lin Yang
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- Remo Rohs
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
- William Stafford Noble
- Department of Genome Sciences, Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
|
27
|
The implication of DNA bending energy for nucleosome positioning and sliding. Sci Rep 2018; 8:8853. [PMID: 29891930 PMCID: PMC5995830 DOI: 10.1038/s41598-018-27247-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/13/2017] [Accepted: 05/24/2018] [Indexed: 11/24/2022] Open
Abstract
Nucleosomes not only directly affect cellular processes, such as DNA replication, recombination, and transcription, but also serve as a fundamentally important target of epigenetic modifications. Our previous study indicated that the bending property of DNA is important in nucleosome formation, particularly in predicting the dyad positions of nucleosomes on a DNA segment. Here, we investigated the role of bending energy in nucleosome positioning and sliding in depth to decipher the sequence-directed mechanism. The results show that bending energy is a good physical index to predict the free energy in the process of nucleosome reconstitution in vitro. Our data also imply that at least 20% of the nucleosomes in budding yeast do not adopt canonical positioning, in which the underlying sequences wrapped around histones are structurally symmetric. We also revealed distinct patterns of bending energy profile for distinctly organized chromatin structures, such as well-positioned nucleosomes, fuzzy nucleosomes, and linker regions, and discussed nucleosome sliding in terms of bending energy. We proposed that the stability of a nucleosome is positively correlated with the strength of the bending anisotropy of the DNA segment, and that both the accessibility and directionality of nucleosome sliding are likely to be modulated by diverse patterns of the DNA bending energy profile.
|
28
|
Brunet FG, Audit B, Drillon G, Argoul F, Volff JN, Arneodo A. Evidence for DNA Sequence Encoding of an Accessible Nucleosomal Array across Vertebrates. Biophys J 2018; 114:2308-2316. [PMID: 29580552 PMCID: PMC6028776 DOI: 10.1016/j.bpj.2018.02.025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/18/2017] [Revised: 02/07/2018] [Accepted: 02/20/2018] [Indexed: 12/15/2022] Open
Abstract
Nucleosome-depleted regions around which nucleosomes order following the "statistical" positioning scenario were recently shown to be encoded in the DNA sequence in human. This intrinsic nucleosomal ordering strongly correlates with oscillations in the local GC content as well as with the interspecies and intraspecies mutation profiles, revealing the existence of both positive and negative selection. In this letter, we show that these predicted nucleosome inhibitory energy barriers (NIEBs) with compacted neighboring nucleosomes are indeed ubiquitous to all vertebrates tested. These 1 kb-sized chromatin patterns are widely distributed along vertebrate chromosomes, overall covering more than a third of the genome. We have previously observed in human deviations from neutral evolution at these genome-wide distributed regions, which we interpreted as a possible indication of the selection of an open, accessible, and dynamic nucleosomal array to constitutively facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner. As a first, very appealing observation supporting this hypothesis, we report evidence of a strong association between NIEB borders and the poly(A) tails of Alu sequences in human. These results suggest that NIEBs provide adequate chromatin patterns favorable to the integration of Alu retrotransposons and, more generally to various transposable elements in the genomes of primates and other vertebrates.
Affiliation(s)
- Frédéric G Brunet
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
- Benjamin Audit
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
- Guénola Drillon
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
- Françoise Argoul
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
- Jean-Nicolas Volff
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
- Alain Arneodo
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
|
29
|
Jia C, Yang Q, Zou Q. NucPosPred: Predicting species-specific genomic nucleosome positioning via four different modes of general PseKNC. J Theor Biol 2018; 450:15-21. [PMID: 29678692 DOI: 10.1016/j.jtbi.2018.04.025] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 01/26/2018] [Revised: 04/13/2018] [Accepted: 04/16/2018] [Indexed: 11/20/2022]
Abstract
The nucleosome is the basic structure of chromatin in eukaryotic cells, with essential roles in the regulation of many biological processes, such as DNA transcription, replication and repair, and RNA splicing. Because of the importance of nucleosomes, the factors that determine their positioning within genomes should be investigated. High-resolution nucleosome-positioning maps are now available for organisms including Saccharomyces cerevisiae, Drosophila melanogaster and Caenorhabditis elegans, enabling the identification of nucleosome positioning by application of computational tools. Here, we describe a novel predictor called NucPosPred, which was specifically designed for large-scale identification of nucleosome positioning in C. elegans and D. melanogaster genomes. NucPosPred was separately optimized for each species for four types of DNA sequence feature extraction, with consideration of two classification algorithms (gradient-boosting decision tree and support vector machine). The overall accuracy obtained with NucPosPred was 92.29% for C. elegans and 88.26% for D. melanogaster, outperforming previous methods and demonstrating the potential for species-specific prediction of nucleosome positioning. For the convenience of most experimental scientists, a web-server for the predictor NucPosPred is available at http://121.42.167.206/NucPosPred/index.jsp.
Affiliation(s)
- Cangzhi Jia
- Science of College, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
- Qing Yang
- Science of College, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
- Quan Zou
- School of Computer Science and Technology, Tianjin University, Tianjin, China
|
30
|
Modes of Interaction of KMT2 Histone H3 Lysine 4 Methyltransferase/COMPASS Complexes with Chromatin. Cells 2018; 7:cells7030017. [PMID: 29498679 PMCID: PMC5870349 DOI: 10.3390/cells7030017] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 01/18/2018] [Revised: 02/22/2018] [Accepted: 02/27/2018] [Indexed: 02/07/2023] Open
Abstract
Regulation of gene expression is achieved by sequence-specific transcriptional regulators, which convey the information that is contained in the sequence of DNA into RNA polymerase activity. This is achieved by the recruitment of transcriptional co-factors. One of the consequences of co-factor recruitment is the control of specific properties of nucleosomes, the basic units of chromatin, and their protein components, the core histones. The main principles are to regulate the position and the characteristics of nucleosomes. The latter includes modulating the composition of core histones and their variants that are integrated into nucleosomes, and the post-translational modification of these histones referred to as histone marks. One of these marks is the methylation of lysine 4 of the core histone H3 (H3K4). While mono-methylation of H3K4 (H3K4me1) is located preferentially at active enhancers, tri-methylation (H3K4me3) is a mark found at open and potentially active promoters. Thus, H3K4 methylation is typically associated with gene transcription. The class 2 lysine methyltransferases (KMTs) are the main enzymes that methylate H3K4. KMT2 enzymes function in complexes that contain a necessary core complex composed of WDR5, RBBP5, ASH2L, and DPY30, the so-called WRAD complex. Here we discuss recent findings that try to elucidate the important question of how KMT2 complexes are recruited to specific sites on chromatin. This is embedded into short overviews of the biological functions of KMT2 complexes and the consequences of H3K4 methylation.
|
31
|
High-Resolution Genome-Wide Mapping of Nucleosome Positioning and Occupancy Level Using Paired-End Sequencing Technology. Methods Mol Biol 2018; 1528:229-243. [PMID: 27854025 DOI: 10.1007/978-1-4939-6630-1_14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/23/2023]
Abstract
Because of its profound influence on DNA accessibility for protein binding and thus on the regulation of diverse biological processes, nucleosome positioning has been studied for many years. In the past decade, high-throughput sequencing technologies have opened new perspectives in this research field by allowing the study of nucleosome positioning and occupancy on a genome-wide scale, therefore providing understanding on important aspects of chromatin packaging, as well as on various chromatin-template processes like transcription. In this chapter, we provide the protocol of MNase sequencing for the genome-wide mapping of nucleosomes using MNase to generate mononucleosomal DNA fragments and next-generation sequencing technology to identify their individual location.
|
32
|
Nanan KK, Ocheltree C, Sturgill D, Mandler MD, Prigge M, Varma G, Oberdoerffer S. Independence between pre-mRNA splicing and DNA methylation in an isogenic minigene resource. Nucleic Acids Res 2017; 45:12780-12797. [PMID: 29244186 PMCID: PMC5727405 DOI: 10.1093/nar/gkx900] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/31/2017] [Revised: 09/13/2017] [Accepted: 09/25/2017] [Indexed: 12/27/2022] Open
Abstract
Actively transcribed genes adopt a unique chromatin environment with characteristic patterns of enrichment. Within gene bodies, H3K36me3 and cytosine DNA methylation are elevated at exons of spliced genes and have been implicated in the regulation of pre-mRNA splicing. H3K36me3 is further responsive to splicing, wherein splicing inhibition led to a redistribution and general reduction over gene bodies. In contrast, little is known of the mechanisms supporting elevated DNA methylation at actively spliced genic locations. Recent evidence associating the de novo DNA methyltransferase Dnmt3b with H3K36me3-rich chromatin raises the possibility that genic DNA methylation is influenced by splicing-associated H3K36me3. Here, we report the generation of an isogenic resource to test the direct impact of splicing on chromatin. A panel of minigenes of varying splicing potential were integrated into a single FRT site for inducible expression. Profiling of H3K36me3 confirmed the established relationship to splicing, wherein levels were directly correlated with splicing efficiency. In contrast, DNA methylation was equivalently detected across the minigene panel, irrespective of splicing and H3K36me3 status. In addition to revealing a degree of independence between genic H3K36me3 and DNA methylation, these findings highlight the generated minigene panel as a flexible platform for the query of splicing-dependent chromatin modifications.
Affiliation(s)
- Kyster K. Nanan
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Cody Ocheltree
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- David Sturgill
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Mariana D. Mandler
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Maria Prigge
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Garima Varma
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Shalini Oberdoerffer
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
33
|
Angarica VE, Del Sol A. Bioinformatics Tools for Genome-Wide Epigenetic Research. Adv Exp Med Biol 2017; 978:489-512. [PMID: 28523562 DOI: 10.1007/978-3-319-53889-1_25] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Indexed: 02/07/2023]
Abstract
Epigenetics play a central role in the regulation of many important cellular processes, and dysregulations at the epigenetic level could be the source of serious pathologies, such as neurological disorders affecting brain development, neurodegeneration, and intellectual disability. Despite significant technological advances for epigenetic profiling, there is still a need for a systematic understanding of how epigenetics shapes cellular circuitry, and disease pathogenesis. The development of accurate computational approaches for analyzing complex epigenetic profiles is essential for disentangling the mechanisms underlying cellular development, and the intricate interaction networks determining and sensing chromatin modifications and DNA methylation to control gene expression. In this chapter, we review the recent advances in the field of "computational epigenetics," including computational methods for processing different types of epigenetic data, prediction of chromatin states, and study of protein dynamics. We also discuss how "computational epigenetics" has complemented the fast growth in the generation of epigenetic data for uncovering the main differences and similarities at the epigenetic level between individuals and the mechanisms underlying disease onset and progression.
Affiliation(s)
- Vladimir Espinosa Angarica
- Computational Biology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4366 Belvaux, Luxembourg
- Antonio Del Sol
- Computational Biology Group, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4366 Belvaux, Luxembourg
|
34
|
Meng H, Li H, Zheng Y, Yang Z, Jia Y, Bo S. Evolutionary analysis of nucleosome positioning sequences based on New Symmetric Relative Entropy. Genomics 2017; 110:154-161. [PMID: 28917635 DOI: 10.1016/j.ygeno.2017.09.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/04/2017] [Revised: 09/06/2017] [Accepted: 09/12/2017] [Indexed: 10/18/2022]
Abstract
New Symmetric Relative Entropy (NSRE) was applied innovatively to analyze the nucleosome sequences in S. cerevisiae, S. pombe and Drosophila. NSRE distributions could well reflect the characteristic differences of nucleosome sequences among three organisms, and the differences indicate a concerted evolution in the sequence usage of nucleosome. Further analysis about the nucleosomes around TSS shows that the constitutive property of +1/-1 nucleosomes in S. cerevisiae is different from that in S. pombe and Drosophila, which indicates that S. cerevisiae has a different transcription regulation mechanism based on nucleosome. However, in either case, the nucleosome dyad region is conserved and always has a higher NSRE. Base composition analysis shows that this conservative property in nucleosome dyad region is mainly determined by base A and T, and the dependence degrees on base A and T are consistent in three organisms.
Affiliation(s)
- Hu Meng
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Hong Li
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yan Zheng
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Zhenhua Yang
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yun Jia
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Suling Bo
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
|
35
|
Application of Synthetic Tumor-Specific Promoters Responsive to the Tumor Microenvironment. Methods Mol Biol 2017. [PMID: 28801910 DOI: 10.1007/978-1-4939-7223-4_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Activity of endogenous promoters can be altered by including additional responsive elements (REs). These elements can be responsive to features of the tumor environment or alternatively to signaling pathways specifically activated in cancer cells. These REs incorporated into tumor-specific promoters can improve cancer targeting, the replicative capacity, and lytic activity of conditionally replicative adenovirus. Here we outline an approach to incorporate hypoxia and inflammation REs into a specific fragment of the SPARC promoter and the steps to clone a nucleosome positioning sequence (NPS) identified in the osteocalcin promoter that contains a Wnt RE upstream of a heterologous synthetic promoter.
|
36
|
How does chromatin package DNA within nucleus and regulate gene expression? Int J Biol Macromol 2017; 101:862-881. [PMID: 28366861 DOI: 10.1016/j.ijbiomac.2017.03.165] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/06/2017] [Revised: 03/28/2017] [Accepted: 03/28/2017] [Indexed: 01/26/2023]
Abstract
The human body is made up of 60 trillion cells, each cell containing 2 millions of genomic DNA in its nucleus. How is this genomic deoxyribonucleic acid (DNA) organised into nuclei? Around 1880, W. Flemming discovered a nuclear substance that was clearly visible on staining under primitive light microscopes and named it 'chromatin'; this is now thought to be the basic unit of genomic DNA organization. Since long before DNA was known to carry genetic information, chromatin has fascinated biologists. DNA has a negatively charged phosphate backbone that produces electrostatic repulsion between adjacent DNA regions, making it difficult for DNA to fold upon itself. In this article, we try to shed light on how chromatin packages DNA within the nucleus and regulates gene expression.
|
37
|
Sexton BS, Druliner BR, Vera DL, Avey D, Zhu F, Dennis JH. Hierarchical regulation of the genome: global changes in nucleosome organization potentiate genome response. Oncotarget 2016; 7:6460-75. [PMID: 26771136 PMCID: PMC4872727 DOI: 10.18632/oncotarget.6841] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/22/2015] [Accepted: 12/28/2015] [Indexed: 11/25/2022] Open
Abstract
Nucleosome occupancy is critically important in regulating access to the eukaryotic genome. Few studies in human cells have measured genome-wide nucleosome distributions at high temporal resolution during a response to a common stimulus. We measured nucleosome distributions at high temporal resolution following Kaposi's-sarcoma-associated herpesvirus (KSHV) reactivation using our newly developed mTSS-seq technology, which maps nucleosome distribution at the transcription start sites (TSS) of all human genes. Nucleosomes underwent widespread changes in organization 24 hours after KSHV reactivation and returned to their basal nucleosomal architecture 48 hours after KSHV reactivation. The widespread changes consisted of an indiscriminate remodeling event resulting in the loss of nucleosome rotational phasing signals. Additionally, one in six TSSs in the human genome possessed nucleosomes that are translationally remodeled. 72% of the loci with translationally remodeled nucleosomes have nucleosomes that moved to positions encoded by the underlying DNA sequence. Finally we demonstrated that these widespread alterations in nucleosomal architecture potentiated regulatory factor binding. These descriptions of nucleosomal architecture changes provide a new framework for understanding the role of chromatin in the genomic response, and have allowed us to propose a hierarchical model for chromatin-based regulation of genome response.
Affiliation(s)
- Brittany S Sexton
- Department of Biological Science, The Florida State University, Tallahassee, Florida, USA
- Brooke R Druliner
- Department of Biological Science, The Florida State University, Tallahassee, Florida, USA; Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota, USA
- Daniel L Vera
- Department of Biological Science, The Florida State University, Tallahassee, Florida, USA; The Center for Genomics and Personalized Medicine, The Florida State University, Tallahassee, Florida, USA
- Denis Avey
- Department of Biological Science, The Florida State University, Tallahassee, Florida, USA
- Fanxiu Zhu
- Department of Biological Science, The Florida State University, Tallahassee, Florida, USA
- Jonathan H Dennis
- Department of Biological Science, The Florida State University, Tallahassee, Florida, USA
|
38
|
Yang D, Ioshikhes I. Drosophila H2A and H2A.Z Nucleosome Sequences Reveal Different Nucleosome Positioning Sequence Patterns. J Comput Biol 2016; 24:289-298. [PMID: 27992255 DOI: 10.1089/cmb.2016.0173] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022] Open
Abstract
Nucleosomes are implicated in transcriptional regulation as well as in packing and stabilizing the DNA. Nucleosome positions affect the transcription by impeding or facilitating the binding of transcription factors. The DNA sequence, especially the periodic occurrences of dinucleotides, is a major factor that affects the nucleosome positioning. We analyzed the Drosophila DNA sequences bound by H2A and H2A.Z nucleosomes. Periodic patterns of dinucleotides (weak-weak/strong-strong or purine-purine/pyrimidine-pyrimidine) were identified as WW/SS and RR/YY nucleosome positioning sequence (NPS) patterns. The WW/SS NPS pattern of the H2A nucleosome has a 10-bp period of weak-weak/strong-strong (W = A or T; S = G or C) dinucleotides. The 10-bp periodicity, however, is disrupted in the middle of the sequence. At the dyad, the SS dinucleotide is preferred. On the other hand, the RR/YY NPS pattern has an 18-bp periodicity of purine-purine/pyrimidine-pyrimidine (R = A or G; Y = T or C) dinucleotides. The NPS patterns from H2A.Z nucleosomes differ from the NPS patterns from H2A nucleosomes. The RR/YY pattern of H2A.Z nucleosomes has major peaks shifted by 10 bp deviated from the H2A nucleosome pattern. The H2A and H2A.Z nucleosomes have different sequence preferences. The shifted peaks coincide with DNA regions interacting with the histone loops.
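To make the idea of a periodic WW dinucleotide pattern concrete, here is a minimal Python sketch that locates weak-weak dinucleotides (W = A or T, as defined in the abstract) and tallies the distances between their occurrences. A peak at a given distance would suggest periodicity at that spacing. This is an illustrative approach with hypothetical function names, not the analysis pipeline used by the authors.

```python
from collections import Counter

WEAK = set("AT")  # W = A or T, following the abstract's definition


def ww_positions(seq):
    """Return start positions of every weak-weak (WW) dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1)
            if seq[i] in WEAK and seq[i + 1] in WEAK]


def distance_histogram(positions, max_lag):
    """Count pairs of occurrences separated by each distance up to max_lag.

    positions must be sorted in ascending order, which ww_positions ensures.
    """
    hist = Counter()
    for idx, p in enumerate(positions):
        for q in positions[idx + 1:]:
            d = q - p
            if d > max_lag:
                break  # later positions are even farther away
            hist[d] += 1
    return hist
```

For a toy sequence such as `"AAGCGCGCGC" * 5`, the only WW dinucleotides start at positions 0, 10, 20, 30 and 40, so the histogram has entries only at multiples of 10.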
Affiliation(s)
- Doo Yang
- 1 Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada; 2 Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
- Ilya Ioshikhes
- 1 Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada; 2 Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, Ottawa, Ontario, Canada
|
39
|
Lu Y, Gan Y, Guan J, Zhou S. An integrative analysis of nucleosome occupancy and positioning using diverse sequence dependent properties. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.11.107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
40
|
Awazu A. Prediction of nucleosome positioning by the incorporation of frequencies and distributions of three different nucleotide segment lengths into a general pseudo k-tuple nucleotide composition. Bioinformatics 2016; 33:42-48. [PMID: 27563027 PMCID: PMC5860184 DOI: 10.1093/bioinformatics/btw562] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 06/16/2016] [Revised: 08/02/2016] [Accepted: 08/19/2016] [Indexed: 11/13/2022] Open
Abstract
Motivation: Nucleosome positioning plays important roles in many eukaryotic intranuclear processes, such as transcriptional regulation and chromatin structure formation. The investigations of nucleosome positioning rules provide a deeper understanding of these intracellular processes. Results: Nucleosome positioning prediction was performed using a model consisting of three types of variables characterizing a DNA sequence: the number of five-nucleotide sequences, the number of three-nucleotide combinations in one period of a helix, and mono- and di-nucleotide distributions in DNA fragments. Using recently proposed stringent benchmark datasets with low biases for Saccharomyces cerevisiae, Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster, the present model was shown to have a better prediction performance than the recently proposed predictors. This model was able to display the common and organism-dependent factors that affect nucleosome forming and inhibiting sequences as well. Therefore, the predictors developed here can accurately predict nucleosome positioning and help determine the key factors influencing this process. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Akinori Awazu
- Department of Mathematical and Life Sciences; Research Center for Mathematics on Chromatin Live Dynamics, Hiroshima University, Kagami-yama 1-3-1, Higashi-Hiroshima, 739-8526, Japan
|
41
|
Drillon G, Audit B, Argoul F, Arneodo A. Evidence of selection for an accessible nucleosomal array in human. BMC Genomics 2016; 17:526. [PMID: 27472913 PMCID: PMC4966569 DOI: 10.1186/s12864-016-2880-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/14/2015] [Accepted: 07/04/2016] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Recently, a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix has been used to reveal some enrichment of nucleosome-inhibiting energy barriers (NIEBs) near ubiquitous human "master" replication origins. Here we use this model to predict the existence of about 1.6 million NIEBs over the 22 human autosomes. RESULTS: We show that these high energy barriers of mean size 153 bp correspond to nucleosome-depleted regions (NDRs) in vitro, as expected, but also in vivo. On either side of these NIEBs, we observe, in vivo and in vitro, a similar compacted nucleosome ordering, suggesting an absence of chromatin remodeling. This nucleosomal ordering strongly correlates with oscillations of the GC content as well as with the interspecies and intraspecies mutation profiles along these regions. Comparison of these divergence rates reveals the existence of both positive and negative selections linked to nucleosome positioning around these intrinsic NDRs. Overall, these NIEBs and neighboring nucleosomes cover 37.5% of the human genome where nucleosome occupancy is stably encoded in the DNA sequence. These 1 kb-sized regions of intrinsic nucleosome positioning are equally found in GC-rich and GC-poor isochores, in early and late replicating regions, in intergenic and genic regions but not at gene promoters. CONCLUSION: The source of selection pressure on the NIEBs has yet to be resolved in future work. One possible scenario is that these widely distributed chromatin patterns have been selected in human to impair the condensation of the nucleosomal array into the 30 nm chromatin fiber, so as to facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner.
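The oscillations of GC content mentioned in this abstract can be inspected with a simple sliding-window profile. The Python sketch below is only an illustration of that measurement; the window and step sizes are free parameters chosen by the reader, the function name is hypothetical, and nothing here reproduces the physical nucleosome-formation model used in the paper.

```python
def gc_profile(seq, window, step=1):
    """Fraction of G or C bases in each window of the given length.

    Windows start at 0, step, 2*step, ... and only full-length windows
    are reported.
    """
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, step)
    ]
```

Plotting such a profile around a region of interest is one way to see whether GC content rises and falls with a regular spacing.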
Affiliation(s)
- Guénola Drillon
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- Benjamin Audit
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- Françoise Argoul
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de le Libération, Talence, F-33405 France
- Alain Arneodo
- Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, Lyon, F-69342 France
- LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de le Libération, Talence, F-33405 France
|
42
|
Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples. Sci Rep 2016; 6:26101. [PMID: 27189082 PMCID: PMC4870644 DOI: 10.1038/srep26101] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/27/2015] [Accepted: 04/27/2016] [Indexed: 11/08/2022] Open
Abstract
In mass disasters, missing-person cases and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these “nucleosome protected STRs” (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types.
|
43
|
MNase titration reveals differences between nucleosome occupancy and chromatin accessibility. Nat Commun 2016; 7:11485. [PMID: 27151365 PMCID: PMC4859066 DOI: 10.1038/ncomms11485] [Citation(s) in RCA: 145] [Impact Index Per Article: 18.1] [Received: 10/14/2015] [Accepted: 03/31/2016] [Indexed: 01/01/2023]
Abstract
Chromatin accessibility plays a fundamental role in gene regulation. Nucleosome placement, usually measured by quantifying protection of DNA from enzymatic digestion, can regulate accessibility. We introduce a metric that uses micrococcal nuclease (MNase) digestion in a novel manner to measure chromatin accessibility by combining information from several digests of increasing depths. This metric, MACC (MNase accessibility), quantifies the inherent heterogeneity of nucleosome accessibility in which some nucleosomes are seen preferentially at high MNase and some at low MNase. MACC interrogates each genomic locus, measuring both nucleosome location and accessibility in the same assay. MACC can be performed either with or without a histone immunoprecipitation step, and thereby compares histone and non-histone protection. We find that changes in accessibility at enhancers, promoters and other regulatory regions do not correlate with changes in nucleosome occupancy. Moreover, high nucleosome occupancy does not necessarily preclude high accessibility, which reveals novel principles of chromatin regulation.
|
44
|
Wight A, Yang D, Ioshikhes I, Makrigiannis AP. Nucleosome Presence at AML-1 Binding Sites Inversely Correlates with Ly49 Expression: Revelations from an Informatics Analysis of Nucleosomes and Immune Cell Transcription Factors. PLoS Comput Biol 2016; 12:e1004894. [PMID: 27124577 PMCID: PMC4849748 DOI: 10.1371/journal.pcbi.1004894] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/23/2015] [Accepted: 03/31/2016] [Indexed: 12/28/2022]
Abstract
Beyond its role in genomic organization and compaction, the nucleosome is believed to participate in the regulation of gene transcription. Here, we report a computational method to evaluate the nucleosome sensitivity for a transcription factor over a given stretch of the genome. Sensitive factors are predicted to be those with binding sites preferentially contained within nucleosome boundaries and lacking 10 bp periodicity. Based on these criteria, the Acute Myeloid Leukemia-1a (AML-1a) transcription factor, a regulator of immune gene expression, was identified as potentially sensitive to nucleosomal regulation within the mouse Ly49 gene family. This result was confirmed in RMA, a cell line with natural expression of Ly49, using MNase-Seq to generate a nucleosome map of chromosome 6, where the Ly49 gene family is located. Analysis of this map revealed a specific depletion of nucleosomes at AML-1a binding sites in the expressed Ly49A when compared to the other, silent Ly49 genes. Our data suggest that nucleosome-based regulation contributes to the expression of Ly49 genes, and we propose that this method of predicting nucleosome sensitivity could aid in dissecting the regulatory role of nucleosomes in general. The nucleosome—a large protein complex with DNA wound around it—is the fundamental unit of genomic organization in the eukaryotic cell. More than just a DNA organizer, however, nucleosomes may control gene expression by interfering with the cell’s ability to access the wound-up DNA, as shown by recent research. In this report, we demonstrate a computational method for predicting which elements of the genome are sensitive to regulation by nucleosomes. As a proof-of-concept, we identify AML-1a binding sites—important sequences in DNA regulation—as being specifically nucleosome sensitive. We then show that AML-1a sites are specifically depleted of nucleosomes when a gene is expressed, indicating the ability for nucleosomes to suppress the expression of that gene. 
This finding confirms that nucleosomes are likely involved in genome regulation, and provides a method for predicting which areas of the genome are probably affected most by nucleosomes. This paper also highlights the usefulness of the Ly49 gene family in testing computer-derived genomic predictions, and is of interest to anyone studying how gene expression is regulated from cell to cell.
Affiliation(s)
- Andrew Wight
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
| | - Doo Yang
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
| | - Ilya Ioshikhes
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
- * E-mail: (II); (APM)
| | - Andrew P. Makrigiannis
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
- * E-mail: (II); (APM)
| |
|
45
|
Liu G, Xing Y, Zhao H, Wang J, Shang Y, Cai L. A deformation energy-based model for predicting nucleosome dyads and occupancy. Sci Rep 2016; 6:24133. [PMID: 27053067 PMCID: PMC4823781 DOI: 10.1038/srep24133] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 01/14/2016] [Accepted: 03/21/2016] [Indexed: 12/14/2022]
Abstract
The nucleosome plays an essential role in various cellular processes, such as DNA replication, recombination, and transcription. Hence, it is important to decode the mechanism of nucleosome positioning and identify nucleosome positions in the genome. In this paper, we present a model for predicting nucleosome positioning based on DNA deformation, in which both bending and shearing of the nucleosomal DNA are considered. The model successfully predicted the dyad positions of nucleosomes assembled in vitro and the in vitro map of nucleosomes in Saccharomyces cerevisiae. Applying the model to Caenorhabditis elegans and Drosophila melanogaster, we achieved satisfactory results. Our data also show that the shearing energy of nucleosomal DNA outperforms bending energy in nucleosome occupancy prediction, and that the ability to predict nucleosome dyad positions is attributable to bending energy, which is associated with the rotational positioning of nucleosomes.
Affiliation(s)
- Guoqing Liu
- The Institute of Bioengineering and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China.,Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| | - Yongqiang Xing
- The Institute of Bioengineering and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
| | - Hongyu Zhao
- The Institute of Bioengineering and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
| | - Jianying Wang
- The Institute of Bioengineering and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China.,State Key Laboratory for Utilization of Bayan Obo Multi-Metallic Resources, Inner Mongolia University of Science and Technology, Baotou, 014010, China
| | - Yu Shang
- Computational Systems Biology Lab, Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA.,College of Computer Science and Technology, Jilin University, Changchun, Jilin 130021, China
| | - Lu Cai
- The Institute of Bioengineering and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
| |
|
46
|
Rube HT, Lee W, Hejna M, Chen H, Yasui DH, Hess JF, LaSalle JM, Song JS, Gong Q. Sequence features accurately predict genome-wide MeCP2 binding in vivo. Nat Commun 2016; 7:11025. [PMID: 27008915 PMCID: PMC4820824 DOI: 10.1038/ncomms11025] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 11/17/2015] [Accepted: 02/15/2016] [Indexed: 01/24/2023]
Abstract
Methyl-CpG binding protein 2 (MeCP2) is critical for proper brain development and expressed at near-histone levels in neurons, but the mechanism of its genomic localization remains poorly understood. Using high-resolution MeCP2-binding data, we show that DNA sequence features alone can predict binding with 88% accuracy. Integrating MeCP2 binding and DNA methylation in a probabilistic graphical model, we demonstrate that previously reported genome-wide association with methylation is in part due to MeCP2's affinity to GC-rich chromatin, a result replicated using published data. Furthermore, MeCP2 co-localizes with nucleosomes. Finally, MeCP2 binding downstream of promoters correlates with increased expression in Mecp2-deficient neurons.
MeCP2 is critical for proper brain development, and mutations in the gene encoding MeCP2 are responsible for several neurological disorders. Here, the authors show that the previously reported genome-wide preference of MeCP2 to methylated CpGs is in part due to MeCP2's affinity to GC-rich chromatin.
Affiliation(s)
- H Tomas Rube
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA.,Department of Physics, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA.,Department of Biological Sciences, Columbia University, New York, New York 10027, USA
| | - Wooje Lee
- Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California 95616, USA.,Division of Life Sciences, Korea University, Seoul 136-713, Korea
| | - Miroslav Hejna
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA.,Department of Physics, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA
| | - Huaiyang Chen
- Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California 95616, USA
| | - Dag H Yasui
- Department of Medical Microbiology and Immunology, Genome Center, MIND Institute, University of California School of Medicine, Davis, California 95616, USA
| | - John F Hess
- Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California 95616, USA
| | - Janine M LaSalle
- Department of Medical Microbiology and Immunology, Genome Center, MIND Institute, University of California School of Medicine, Davis, California 95616, USA
| | - Jun S Song
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA.,Department of Physics, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA.,Department of Bioengineering, University of Illinois, Urbana-Champaign, Champaign, Illinois 61801, USA.,Department of Biostatistics and Epidemiology, University of California, San Francisco, California 94158, USA
| | - Qizhi Gong
- Department of Cell Biology and Human Anatomy, University of California School of Medicine, Davis, California 95616, USA
| |
|
47
|
Chen W, Feng P, Ding H, Lin H, Chou KC. Using deformation energy to analyze nucleosome positioning in genomes. Genomics 2016; 107:69-75. [DOI: 10.1016/j.ygeno.2015.12.005] [Citation(s) in RCA: 87] [Impact Index Per Article: 10.9] [Received: 10/12/2015] [Revised: 12/06/2015] [Accepted: 12/22/2015] [Indexed: 12/28/2022]
|
48
|
Utro F, Di Benedetto V, Corona DF, Giancarlo R. The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of eukaryotic genomes. Bioinformatics 2015; 32:835-42. [DOI: 10.1093/bioinformatics/btv679] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/01/2015] [Accepted: 11/09/2015] [Indexed: 11/14/2022]
Abstract
Motivation: Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only.
Results: We contribute to close this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter ‘encoding’.
Supplementary information: Supplementary data are available at Bioinformatics online.
Contact: futro@us.ibm.com.
Affiliation(s)
- Filippo Utro
- Computational Genomics Group, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA,
| | | | - Davide F.V. Corona
- Dipartimento STEBICEF, Dulbecco Telethon Institute c/o Università di Palermo, Palermo, Italy
| | | |
|
49
|
Flickinger R. AT-rich repetitive DNA sequences, transcription frequency and germ layer determination. Mech Dev 2015; 138 Pt 3:227-32. [PMID: 26506258 DOI: 10.1016/j.mod.2015.10.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/27/2015] [Revised: 10/19/2015] [Accepted: 10/21/2015] [Indexed: 01/30/2023]
Abstract
Non-coding sequences of frog embryo endoderm poly (A+) nuclear RNA are AU-enriched, as compared to those of ectoderm and mesoderm. Endoderm blastomeres contain much less H1 histone than is present in ectoderm and mesoderm. H1 histone preferentially binds AT-rich DNA sequences to repress their transcription. The AT-enrichment of non-coding DNA sequences transcribed into poly (A+) nuclear RNA, as well as the low amount of H1 histone, may contribute to the higher transcription frequency of mRNA of endoderm, as compared to that of ectoderm and mesoderm. A greater accumulation of H1 histone in presumptive mesoderm and ectoderm may prevent transcription of endoderm specifying genes in mesoderm and ectoderm. Experimental upregulation of various transcription factors (TFs) can redirect germ layer fate. Most of these TFs bind AT-rich consensus sequences in DNA, suggesting that H1 histone and TFs active during germ layer determination are binding similar sequences.
Affiliation(s)
- Reed Flickinger
- Emeritus Department, Biological Sciences State University of New York at Buffalo, Buffalo, N.Y. 14260, USA.
| |
|
50
|
Yazdi PG, Pedersen BA, Taylor JF, Khattab OS, Chen YH, Chen Y, Jacobsen SE, Wang PH. Nucleosome Organization in Human Embryonic Stem Cells. PLoS One 2015; 10:e0136314. [PMID: 26305225 PMCID: PMC4549264 DOI: 10.1371/journal.pone.0136314] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 03/13/2015] [Accepted: 08/02/2015] [Indexed: 12/18/2022]
Abstract
The fundamental repeating unit of eukaryotic chromatin is the nucleosome. Besides being involved in packaging DNA, nucleosome organization plays an important role in transcriptional regulation and cellular identity. Currently, there is much debate about the major determinants of the nucleosome architecture of a genome and its significance, with little being known about its role in stem cells. To address these questions, we performed ultra-deep sequencing of nucleosomal DNA in two human embryonic stem cell lines and integrated our data with numerous epigenomic maps. Our analyses have revealed that the genome is a determinant of nucleosome organization with transcriptionally inactive regions characterized by a “ground state” of nucleosome profiles driven by underlying DNA sequences. DNA sequence preferences are associated with heterogeneous chromatin organization around transcription start sites. Transcription, histone modifications, and DNA methylation alter this “ground state” by having distinct effects on both nucleosome positioning and occupancy. As the transcriptional rate increases, nucleosomes become better positioned. Exons transcribed and included in the final spliced mRNA have distinct nucleosome profiles in comparison to exons not included at exon-exon junctions. Genes marked by the active modification H3K4me3 are characterized by lower nucleosome occupancy before the transcription start site compared to genes marked by the inactive modification H3K27me3, while bivalent domains, genes associated with both marks, lie exactly in the middle. Combinatorial patterns of epigenetic marks (chromatin states) are associated with unique nucleosome profiles. Nucleosome organization varies around transcription factor binding in enhancers versus promoters. DNA methylation is associated with increasing nucleosome occupancy and different types of methylations have distinct location preferences within the nucleosome core particle. 
Finally, computational analysis of nucleosome organization alone is sufficient to elucidate much of the circuitry of pluripotency. Our results suggest that nucleosome organization is associated with numerous genomic and epigenomic processes and can be used to elucidate cellular identity.
Affiliation(s)
- Puya G. Yazdi
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
- Department of Medicine, University of California Irvine, Irvine, California, United States of America
| | - Brian A. Pedersen
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
- Department of Medicine, University of California Irvine, Irvine, California, United States of America
| | - Jared F. Taylor
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
- Department of Medicine, University of California Irvine, Irvine, California, United States of America
| | - Omar S. Khattab
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
| | - Yu-Han Chen
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
- Department of Medicine, University of California Irvine, Irvine, California, United States of America
| | - Yumay Chen
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
- Department of Medicine, University of California Irvine, Irvine, California, United States of America
| | - Steven E. Jacobsen
- Department of Molecular, Cell and Developmental Biology, University of California Los Angeles, Los Angeles, California, United States of America
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California Los Angeles, Los Angeles, California, United States of America
- Howard Hughes Medical Institute, University of California Los Angeles, Los Angeles, California, United States of America
| | - Ping H. Wang
- UC Irvine Diabetes Center, University of California Irvine, Irvine, California, United States of America
- Sue and Bill Gross Stem Cell Research Center, University of California Irvine, Irvine, California, United States of America
- Department of Medicine, University of California Irvine, Irvine, California, United States of America
- Department of Biological Chemistry, University of California Irvine, Irvine, California, United States of America
- Department of Physiology & Biophysics, University of California Irvine, Irvine, California, United States of America
| |
|