1
Zhu Y, Vvedenskaya IO, Sze SH, Nickels BE, Kaplan CD. Quantitative analysis of transcription start site selection reveals control by DNA sequence, RNA polymerase II activity and NTP levels. Nat Struct Mol Biol 2024; 31:190-202. [PMID: 38177677 PMCID: PMC10928753 DOI: 10.1038/s41594-023-01171-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 11/03/2023] [Indexed: 01/06/2024]
Abstract
Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.
Affiliation(s)
- Yunye Zhu
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Irina O Vvedenskaya
  - Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Sing-Hoi Sze
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, USA
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
- Bryce E Nickels
  - Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
2
Stepankiw N, Yang AWH, Hughes TR. The human genome contains over a million autonomous exons. Genome Res 2023; 33:1865-1878. [PMID: 37945377 PMCID: PMC10760453 DOI: 10.1101/gr.277792.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Accepted: 10/27/2023] [Indexed: 11/12/2023]
Abstract
Mammalian mRNA and lncRNA exons are often small compared to introns. The exon definition model predicts that exons splice autonomously, dependent on proximal exon sequence features, explaining their delineation within large introns. This model has not been examined on a genome-wide scale, however, leaving open the question of how often mRNA and lncRNA exons are autonomous. It is also unknown how frequently such exons can arise by chance. Here, we directly assayed large fragments (500-1000 bp) of the human genome by exon trapping, which detects exons spliced into a heterologous transgene, here designed with a large intron context. We define the trapped exons as "autonomous." We obtained ∼1.25 million trapped exons, including most known mRNA and well-annotated lncRNA internal exons, demonstrating that human exons are predominantly autonomous. mRNA exons are trapped with the highest efficiency. Nearly a million of the trapped exons are unannotated, most located in intergenic regions and antisense to mRNA, with depletion from the forward strand of introns. These exons are not conserved, suggesting they are nonfunctional and arose from random mutations. They are nonetheless highly enriched with known splicing promoting sequence features that delineate known exons. Novel autonomous exons are more numerous than annotated lncRNA exons, and computational models also indicate they will occur with similar frequency in any randomly generated sequence. These results show that most human coding exons splice autonomously, and provide an explanation for the existence of many unconserved lncRNAs, as well as a new annotation and inclusion levels of spliceable loci in the human genome.
Affiliation(s)
- Nicholas Stepankiw
  - Donnelly Centre, University of Toronto, Toronto, Ontario, Canada M5S 3E1
- Ally W H Yang
  - Donnelly Centre, University of Toronto, Toronto, Ontario, Canada M5S 3E1
- Timothy R Hughes
  - Donnelly Centre, University of Toronto, Toronto, Ontario, Canada M5S 3E1
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
3
Briffa A, Hollwey E, Shahzad Z, Moore JD, Lyons DB, Howard M, Zilberman D. Millennia-long epigenetic fluctuations generate intragenic DNA methylation variance in Arabidopsis populations. Cell Syst 2023; 14:953-967.e17. [PMID: 37944515 DOI: 10.1016/j.cels.2023.10.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Revised: 07/18/2023] [Accepted: 10/13/2023] [Indexed: 11/12/2023]
Abstract
Methylation of CG dinucleotides (mCGs), which regulates eukaryotic genome functions, is epigenetically propagated by Dnmt1/MET1 methyltransferases. How mCG is established and transmitted across generations despite imperfect enzyme fidelity is unclear. Whether mCG variation in natural populations is governed by genetic or epigenetic inheritance also remains mysterious. Here, we show that MET1 de novo activity, which is enhanced by existing proximate methylation, seeds and stabilizes mCG in Arabidopsis thaliana genes. MET1 activity is restricted by active demethylation and suppressed by histone variant H2A.Z, producing localized mCG patterns. Based on these observations, we develop a stochastic mathematical model that precisely recapitulates mCG inheritance dynamics and predicts intragenic mCG patterns and their population-scale variation given only CG site spacing. Our results demonstrate that intragenic mCG establishment, inheritance, and variance constitute a unified epigenetic process, revealing that intragenic mCG undergoes large, millennia-long epigenetic fluctuations and can therefore mediate evolution on this timescale.
Affiliation(s)
- Amy Briffa
  - Department of Computational and Systems Biology, John Innes Centre, Norwich NR4 7UH, UK
- Elizabeth Hollwey
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
  - Institute of Science and Technology, 3400 Klosterneuburg, Austria
- Zaigham Shahzad
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
  - Department of Life Sciences, Syed Babar Ali School of Science and Engineering, Lahore University of Management Sciences, Lahore, Pakistan
- Jonathan D Moore
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
- David B Lyons
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
- Martin Howard
  - Department of Computational and Systems Biology, John Innes Centre, Norwich NR4 7UH, UK
- Daniel Zilberman
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
  - Institute of Science and Technology, 3400 Klosterneuburg, Austria
4
Yadav M, Zuiddam M, Schiessel H. The role of transcript regions and amino acid choice in nucleosome positioning. NAR Genom Bioinform 2023; 5:lqad080. [PMID: 37705829 PMCID: PMC10495542 DOI: 10.1093/nargab/lqad080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/19/2023] [Accepted: 08/30/2023] [Indexed: 09/15/2023] Open
Abstract
Eukaryotic DNA is organized and compacted in a string of nucleosomes, DNA-wrapped protein cylinders. The positions of nucleosomes along DNA are not random but show well-known base pair sequence preferences that result from the sequence-dependent elastic and geometric properties of the DNA double helix. Here, we focus on DNA around transcription start sites, which are known to typically attract nucleosomes in multicellular life forms through their high GC content. We aim to understand how these GC signals, as observed in genome-wide averages, are produced and encoded through different genomic regions (mainly 5' UTRs, coding exons, and introns). Our study uses a bioinformatics approach to decompose the genome-wide GC signal into between-region and within-region signals. We find large differences in GC signal contributions between vertebrates and plants and, remarkably, even between closely related species. Introns contribute most to the GC signal in vertebrates, while in plants the exons dominate. Further, we find signal strengths stronger on DNA than on mRNA, suggesting a biological function of GC signals along the DNA itself, as is the case for nucleosome positioning. Finally, we make the surprising discovery that both the choice of synonymous codons and amino acids contribute to the nucleosome positioning signal.
Affiliation(s)
- Manish Yadav
  - Cluster of Excellence Physics of Life, TU Dresden, 01062 Dresden, Germany
- Martijn Zuiddam
  - Institute Lorentz for Theoretical Physics, Leiden University, Leiden, the Netherlands
- Helmut Schiessel
  - Cluster of Excellence Physics of Life, TU Dresden, 01062 Dresden, Germany
  - Institut für Theoretische Physik, Technische Universität Dresden, 01062 Dresden, Germany
5
Xu H, Li C, Xu C, Zhang J. Chance promoter activities illuminate the origins of eukaryotic intergenic transcriptions. Nat Commun 2023; 14:1826. [PMID: 37005399 PMCID: PMC10067814 DOI: 10.1038/s41467-023-37610-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/04/2022] [Accepted: 03/23/2023] [Indexed: 04/04/2023] Open
Abstract
It is debated whether the pervasive intergenic transcription from eukaryotic genomes has functional significance or simply reflects the promiscuity of RNA polymerases. We approach this question by comparing chance promoter activities with the expression levels of intergenic regions in the model eukaryote Saccharomyces cerevisiae. We build a library of over 10^5 strains, each carrying a 120-nucleotide, chromosomally integrated, completely random sequence driving the potential transcription of a barcode. Quantifying the RNA concentration of each barcode in two environments reveals that 41-63% of random sequences have significant, albeit usually low, promoter activities. Therefore, even in eukaryotes, where the presence of chromatin is thought to repress transcription, chance transcription is prevalent. We find that only 1-5% of yeast intergenic transcriptions are unattributable to chance promoter activities or neighboring gene expressions, and these transcriptions exhibit higher-than-expected environment-specificity. These findings suggest that only a minute fraction of intergenic transcription is functional in yeast.
Affiliation(s)
- Haiqing Xu
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Department of Biology, Stanford University, Stanford, CA, USA
- Chuan Li
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Microsoft, Redmond, WA, USA
- Chuan Xu
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
  - Bio-X Institutes, Shanghai Jiao Tong University, Shanghai, China
- Jianzhi Zhang
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, USA
6
Rasband SA, Bolton PE, Fang Q, Johnson PLF, Braun MJ. Evolution of the Growth Hormone Gene Duplication in Passerine Birds. Genome Biol Evol 2023; 15:evad033. [PMID: 36848146 PMCID: PMC10016047 DOI: 10.1093/gbe/evad033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2022] [Revised: 12/11/2022] [Accepted: 01/09/2023] [Indexed: 03/01/2023] Open
Abstract
Birds of the order Passeriformes represent the most speciose order of land vertebrates. Despite strong scientific interest in this super-radiation, genetic traits unique to passerines are not well characterized. A duplicate copy of growth hormone (GH) is the only gene known to be present in all major lineages of passerines, but not in other birds. GH genes plausibly influence extreme life history traits that passerines exhibit, including the shortest embryo-to-fledging developmental period of any avian order. To unravel the implications of this GH duplication, we investigated the molecular evolution of the ancestral avian GH gene (GH or GH1) and the novel passerine GH paralog (GH2), using 497 gene sequences extracted from 342 genomes. Passerine GH1 and GH2 are reciprocally monophyletic, consistent with a single duplication event from a microchromosome onto a macrochromosome in a common ancestor of extant passerines. Additional chromosomal rearrangements have changed the syntenic and potential regulatory context of these genes. Both passerine GH1 and GH2 display substantially higher rates of nonsynonymous codon change than non-passerine avian GH, suggesting positive selection following duplication. A site involved in signal peptide cleavage is under selection in both paralogs. Other sites under positive selection differ between the two paralogs, but many are clustered in one region of a 3D model of the protein. Both paralogs retain key functional features and are actively but differentially expressed in two major passerine suborders. These phenomena suggest that GH genes may be evolving novel adaptive roles in passerine birds.
Affiliation(s)
- Shauna A Rasband
  - Behavior, Ecology, Evolution and Systematics Graduate Program, University of Maryland, College Park, Maryland
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC
- Peri E Bolton
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC
  - Department of Biology, East Carolina University, Greenville, North Carolina
- Qi Fang
  - BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, China
- Michael J Braun
  - Behavior, Ecology, Evolution and Systematics Graduate Program, University of Maryland, College Park, Maryland
  - Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC
7
Redd PS, Diaz S, Weidner D, Benjamin J, Hancock CN. Mobility of mPing and its associated elements is regulated by both internal and terminal sequences. Mob DNA 2023; 14:1. [PMID: 36774502 PMCID: PMC9921582 DOI: 10.1186/s13100-023-00289-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 02/04/2023] [Indexed: 02/13/2023] Open
Abstract
BACKGROUND: DNA transposable elements are mobilized by a "cut and paste" mechanism catalyzed by the binding of one or more transposase proteins to terminal inverted repeats (TIRs) to form a transpositional complex. Study of the rice genome indicates that the mPing element has experienced a recent burst in transposition compared to the closely related Ping and Pong elements. A previously developed yeast transposition assay allowed us to probe the role of both internal and terminal sequences in the mobilization of these elements.
RESULTS: We observed that mPing and a synthetic mPong element have significantly higher transposition efficiency than the related autonomous Ping and Pong elements. Systematic mutation of the internal sequences of both mPing and mPong identified multiple regions that promote or inhibit transposition. Simultaneous alteration of single bases on both mPing TIRs resulted in a significant reduction in transposition frequency, indicating that each base plays a role in efficient transposase binding. Testing chimeric mPing and mPong elements verified the important role of both the TIRs and internal regulatory regions. Previous experiments showed that the G at position 16, adjacent to the 5' TIR, allows mPing to have higher mobility. Alteration of the 16th and 17th base from mPing's 3' end or replacement of the 3' end with Pong 3' sequences significantly increased transposition frequency.
CONCLUSIONS: As the transposase proteins were consistent throughout this study, we conclude that the observed transposition differences are due to the element sequences. The presence of sub-optimal internal regions and TIR bases supports a model in which transposable elements self-limit their activity to prevent host damage and detection by host regulatory mechanisms. Knowing the role of the TIRs, adjacent sub-TIRs, and internal regulatory sequences allows for the creation of hyperactive elements.
Affiliation(s)
- Priscilla S. Redd
  - Department of Biology and Geology, University of South Carolina Aiken, Aiken, SC 29801, USA
- Stephanie Diaz
  - Department of Biology and Geology, University of South Carolina Aiken, Aiken, SC 29801, USA
  - Present address: Bayer Pharmaceuticals, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- David Weidner
  - Department of Biology and Geology, University of South Carolina Aiken, Aiken, SC 29801, USA
- Jazmine Benjamin
  - Department of Biology and Geology, University of South Carolina Aiken, Aiken, SC 29801, USA
  - Present address: Division of Nephrology, Department of Medicine, University of Alabama at Birmingham, Birmingham, AL 35233, USA
- C. Nathan Hancock
  - Department of Biology and Geology, University of South Carolina Aiken, Aiken, SC 29801, USA
8
The Role of PARP1 and PAR in ATP-Independent Nucleosome Reorganisation during the DNA Damage Response. Genes (Basel) 2022; 14:genes14010112. [PMID: 36672853 PMCID: PMC9859207 DOI: 10.3390/genes14010112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2022] [Revised: 12/22/2022] [Accepted: 12/23/2022] [Indexed: 12/31/2022] Open
Abstract
The functioning of the eukaryotic cell genome is mediated by sophisticated protein-nucleic-acid complexes, whose minimal structural unit is the nucleosome. After damage to genomic DNA, repair proteins need to gain access directly to the lesion; therefore, the initiation of the DNA damage response inevitably leads to local chromatin reorganisation. This review focuses on the possible involvement of PARP1, as well as proteins acting in nucleosome compaction, linker histone H1 and non-histone chromatin protein HMGB1. The polymer of ADP-ribose is considered the main regulator during the development of the DNA damage response and in the course of assembly of the correct repair complex.
9
Zeng Y, Fair BJ, Zeng H, Krishnamohan A, Hou Y, Hall JM, Ruthenburg AJ, Li YI, Staley JP. Profiling lariat intermediates reveals genetic determinants of early and late co-transcriptional splicing. Mol Cell 2022; 82:4681-4699.e8. [PMID: 36435176 PMCID: PMC10448999 DOI: 10.1016/j.molcel.2022.11.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/09/2021] [Revised: 09/10/2022] [Accepted: 11/02/2022] [Indexed: 11/27/2022]
Abstract
Long introns with short exons in vertebrate genes are thought to require spliceosome assembly across exons (exon definition), rather than introns, thereby requiring transcription of an exon to splice an upstream intron. Here, we developed CoLa-seq (co-transcriptional lariat sequencing) to investigate the timing and determinants of co-transcriptional splicing genome wide. Unexpectedly, 90% of all introns, including long introns, can splice before transcription of a downstream exon, indicating that exon definition is not obligatory for most human introns. Still, splicing timing varies dramatically across introns, and various genetic elements determine this variation. Strong U2AF2 binding to the polypyrimidine tract predicts early splicing, explaining exon definition-independent splicing. Together, our findings question the essentiality of exon definition and reveal features beyond intron and exon length that are determinative for splicing timing.
Affiliation(s)
- Yi Zeng
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA
- Benjamin J Fair
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Huilin Zeng
  - 855 Jefferson Ave. Redwood City, CA 94063, USA
- Aiswarya Krishnamohan
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA
- Yichen Hou
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA
- Johnathon M Hall
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA
- Alexander J Ruthenburg
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA
  - Department of Biochemistry & Molecular Biology, University of Chicago, Chicago, IL 60637, USA
- Yang I Li
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Jonathan P Staley
  - Department of Molecular Genetics and Cell Biology, University of Chicago, Chicago, IL 60637, USA
10
Comprehensive computational analysis of epigenetic descriptors affecting CRISPR-Cas9 off-target activity. BMC Genomics 2022; 23:805. [PMID: 36474180 PMCID: PMC9724382 DOI: 10.1186/s12864-022-09012-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/20/2022] [Accepted: 10/17/2022] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: A common issue in CRISPR-Cas9 genome editing is off-target activity, which prevents the widespread use of CRISPR-Cas9 in medical applications. Among other factors, primary chromatin structure and epigenetics may influence off-target activity.
METHODS: In this work, we utilize crisprSQL, an off-target database, to analyze the effect of 19 epigenetic descriptors on CRISPR-Cas9 off-target activity. Termed as 19 epigenetic features/scores, they consist of 6 experimental epigenetic and 13 computed nucleosome organization-related features. In terms of novel features, 15 of the epigenetic scores are newly considered. The 15 newly considered scores consist of 13 freshly computed nucleosome occupancy/positioning scores and 2 experimental features (MNase and DRIP). The other 4 existing scores are experimental features (CTCF, DNase I, H3K4me3, RRBS) commonly used in deep learning models for off-target activity prediction. For data curation, MNase was aggregated from existing experimental nucleosome occupancy data. Based on the sequence context information available in crisprSQL, we also computed nucleosome occupancy/positioning scores for off-target sites.
RESULTS: To investigate the relationship between the 19 epigenetic features and off-target activity, we first conducted Spearman and Pearson correlation analysis. Such analysis shows that some computed scores derived from training-based models and training-free algorithms outperform all experimental epigenetic features. Next, we evaluated the contribution of all epigenetic features in two successful machine/deep learning models which predict off-target activity. We found that some computed scores, unlike all 6 experimental features, significantly contribute to the predictions of both models. As a practical research contribution, we make the off-target dataset containing all 19 epigenetic features available to the research community.
CONCLUSIONS: Our comprehensive computational analysis helps the CRISPR-Cas9 community better understand the relationship between epigenetic features and CRISPR-Cas9 off-target activity.
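The Spearman and Pearson correlation step named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the toy numbers are hypothetical, and no crisprSQL data or schema is assumed. The sketch only shows how the two coefficients differ, with Pearson measuring linear association and Spearman applying the same formula to ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: one epigenetic score per off-target site and the
# measured off-target activity at that site (illustrative values only).
feature_score = [0.10, 0.35, 0.40, 0.80, 0.95]
off_target_activity = [0.01, 0.02, 0.05, 0.30, 0.90]

print("Pearson :", pearson(feature_score, off_target_activity))
print("Spearman:", spearman(feature_score, off_target_activity))
```

Reporting both coefficients is useful when a feature relates to activity monotonically but not linearly, since only the rank-based coefficient is insensitive to the shape of that relationship.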
11
Zuiddam M, Shakiba B, Schiessel H. Multiplexing mechanical and translational cues on genes. Biophys J 2022; 121:4311-4324. [PMID: 36230003 PMCID: PMC9703045 DOI: 10.1016/j.bpj.2022.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 07/06/2022] [Accepted: 10/07/2022] [Indexed: 12/14/2022] Open
Abstract
The genetic code gives precise instructions on how to translate codons into amino acids. Due to the degeneracy of the genetic code (18 out of 20 amino acids are encoded by more than one codon), more information can be stored in a base-pair sequence. Indeed, various types of additional information have been discussed in the literature, e.g., the positioning of nucleosomes along eukaryotic genomes and the modulation of the translation efficiency in ribosomes to influence cotranslational protein folding. The purpose of this study is to show that it is indeed possible to carry more than one additional layer of information on top of a gene. In particular, we show how much translation efficiency and nucleosome positioning can be adjusted simultaneously without changing the encoded protein. We achieve this by mapping genes on weighted graphs that contain all synonymous genes, and then finding shortest paths through these graphs. This enables us, for example, to readjust the disrupted translational efficiency profile after a gene has been introduced from one organism (e.g., human) into another (e.g., yeast) without greatly changing the nucleosome landscape intrinsically encoded by the DNA molecule.
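The graph idea in this abstract can be sketched as a shortest-path search over a layered graph in which layer i holds the synonymous codons of amino acid i. This is only an illustration under stated assumptions, not the authors' model: the `SPEED` values, the `junction_cost` rule and the weights `w_t` and `w_m` are hypothetical placeholders for a translation-efficiency profile and a mechanical (nucleosome-related) energy, and `SYNONYMS` lists only a few amino acids from the standard genetic code.

```python
# Partial standard codon table (enough for the toy example below).
SYNONYMS = {
    "M": ["ATG"],
    "W": ["TGG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "E": ["GAA", "GAG"],
}

# Hypothetical relative translation speeds per codon (illustrative values).
SPEED = {
    "ATG": 1.0, "TGG": 1.0,
    "AAA": 1.0, "AAG": 0.5,
    "TTT": 0.5, "TTC": 1.0,
    "GAA": 1.0, "GAG": 0.5,
}


def junction_cost(prev_codon, next_codon):
    """Hypothetical mechanical penalty for an AT-only step across a codon boundary."""
    return 1.0 if prev_codon[-1] in "AT" and next_codon[0] in "AT" else 0.0


def best_synonymous_gene(protein, target_speed, w_t=1.0, w_m=1.0):
    """Cheapest codon path through the layered graph, found layer by layer.

    Node cost: w_t * |SPEED[codon] - target_speed[i]| (translation layer).
    Edge cost: w_m * junction_cost(previous codon, codon) (mechanical layer).
    """
    # best[codon] = (cost of the cheapest path ending in codon, that path)
    best = {
        c: (w_t * abs(SPEED[c] - target_speed[0]), [c])
        for c in SYNONYMS[protein[0]]
    }
    for i, amino_acid in enumerate(protein[1:], start=1):
        layer = {}
        for c in SYNONYMS[amino_acid]:
            node = w_t * abs(SPEED[c] - target_speed[i])
            # Pick the predecessor codon that minimises path cost plus edge cost.
            cost, path = min(
                ((pc + w_m * junction_cost(p, c), pp) for p, (pc, pp) in best.items()),
                key=lambda item: item[0],
            )
            layer[c] = (cost + node, path + [c])
        best = layer
    cost, path = min(best.values(), key=lambda item: item[0])
    return "".join(path), cost


# Toy usage: keep translation fast everywhere while avoiding penalised junctions.
gene, cost = best_synonymous_gene("MKF", [1.0, 1.0, 1.0])
print(gene, cost)
```

Because every edge runs from one layer to the next, the graph is acyclic, so a single forward pass over the layers finds the shortest path without a priority queue. Changing `w_t` and `w_m` shifts the trade-off between the two layers of information while the encoded protein stays fixed.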
Affiliation(s)
- Martijn Zuiddam
  - Institute Lorentz for Theoretical Physics, Leiden University, Leiden, the Netherlands
- Bahareh Shakiba
  - Institute Lorentz for Theoretical Physics, Leiden University, Leiden, the Netherlands
- Helmut Schiessel
  - Cluster of Excellence Physics of Life, TU Dresden, Dresden, Germany
12
Liu G, Sun Y, Jia L, Li R, Zuo Y. Chromatin accessibility shapes meiotic recombination in mouse primordial germ cells through assisting double-strand breaks and loop formation. Biochim Biophys Acta Gene Regul Mech 2022; 1865:194844. [PMID: 35870788 DOI: 10.1016/j.bbagrm.2022.194844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 06/27/2022] [Accepted: 07/09/2022] [Indexed: 06/15/2023]
Abstract
Meiotic recombination is a driver of evolution, and aberrant recombination is a major contributor to aneuploidy in mammals. The mechanism of recombination remains elusive. Here, we present a computational analysis to explore recombination-related dynamics of chromatin accessibility in mouse primordial germ cells (PGCs). Our data reveal that: (1) recombination hotspots that become accessible at meiosis-specific DNase I-hypersensitive sites (DHSs) only when PGCs enter meiosis are located preferentially in intronic and distal intergenic regions; (2) stable DHSs maintained across PGC differentiation are enriched in CTCF motifs and CTCF binding and mediate chromatin loop formation; (3) compared with the specific DHSs arising at the meiotic stage, stable DHSs are largely encoded in DNA sequence and also enriched in epigenetic marks; (4) PRDM9 is likely to target nucleosome-occupied hotspot regions and remodel local chromatin structure to make them accessible to the recombination machinery; and (5) cells undergoing meiotic recombination are deficient in TAD structure, and chromatin loop arrays are organized regularly along the axis formed between homologous chromosomes. Taken together, by analyzing DHS-related DNA features, epigenetic marks and 3D genome structure, we revealed some specific roles of chromatin accessibility in recombination, which should expand our understanding of the recombination mechanism.
Affiliation(s)
- Guoqing Liu
  - School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, China
  - Inner Mongolia Key Laboratory of Functional Genomics and Bioinformatics, Inner Mongolia University of Science and Technology, Baotou, China
  - School of Life Sciences, Peking University, Beijing, China
- Yu Sun
  - School of Life Sciences, Inner Mongolia University, Hohhot, China
- Lumeng Jia
  - School of Life Sciences, Peking University, Beijing, China
- Ruifeng Li
  - School of Life Sciences, Peking University, Beijing, China
- Yongchun Zuo
  - School of Life Sciences, Inner Mongolia University, Hohhot, China
13
Lima ARJ, Silva HGD, Poubel S, Rosón JN, de Lima LPO, Costa-Silva HM, Gonçalves CS, Galante PAF, Holetz F, Motta MCMM, Silber AM, Elias MC, da Cunha JPC. Open chromatin analysis in Trypanosoma cruzi life forms highlights critical differences in genomic compartments and developmental regulation at tDNA loci. Epigenetics Chromatin 2022; 15:22. [PMID: 35650626 PMCID: PMC9158160 DOI: 10.1186/s13072-022-00450-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/25/2021] [Accepted: 04/18/2022] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND: Genomic organization and gene expression regulation in trypanosomes are remarkable because protein-coding genes are organized into codirectional gene clusters with unrelated functions. Moreover, there is no dedicated promoter for each gene, resulting in polycistronic gene transcription, with posttranscriptional control playing a major role. Nonetheless, these parasites harbor epigenetic modifications at critical regulatory genome features that dynamically change among parasite stages, which are not fully understood.
RESULTS: Here, we investigated the impact of chromatin changes in a scenario commanded by posttranscriptional control, exploring the parasite Trypanosoma cruzi and its differentiation program using a FAIRE-seq approach supported by transmission electron microscopy. We identified differences in T. cruzi genome compartments, putative transcriptional start regions, and virulence factors. In addition, we detected developmental chromatin regulation at tRNA loci (tDNA), which could be linked to the intense chromatin remodeling and/or the translation regulatory mechanism required for parasite differentiation. We further integrated the open chromatin profile with public transcriptomic and MNase-seq datasets. Strikingly, a positive correlation was observed between active chromatin and steady-state transcription levels.
CONCLUSION: Taken together, our results indicate that chromatin changes reflect the unusual gene expression regulation of trypanosomes and the differences among parasite developmental stages, even in the context of a lack of canonical transcriptional control of protein-coding genes.
Affiliation(s)
- Alex Ranieri Jerônimo Lima
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
- Herbert Guimarães de Sousa Silva
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
  - Departamento de Microbiologia, Universidade Federal de São Paulo, Escola Paulista de Medicina, Imunologia E Parasitologia, São Paulo, SP, Brazil
- Saloe Poubel
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
- Juliana Nunes Rosón
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
  - Departamento de Microbiologia, Universidade Federal de São Paulo, Escola Paulista de Medicina, Imunologia E Parasitologia, São Paulo, SP, Brazil
- Loyze Paola Oliveira de Lima
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
- Héllida Marina Costa-Silva
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
- Camila Silva Gonçalves
  - Laboratório de Ultraestrutura Celular Hertha Meyer, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal Do Rio de Janeiro, IBCCF, CCS, UFRJ, Cidade Universitária, Rio de Janeiro, RJ, Brazil
  - Centro Nacional de Biologia Estrutural E Bioimagem, Rio de Janeiro, RJ, Brazil
- Pedro A. F. Galante
  - Centro de Oncologia Molecular, Hospital Sírio Libanês, São Paulo, SP, Brazil
- Fabiola Holetz
  - Instituto Carlos Chagas, Fiocruz, Curitiba, PR, Brazil
- Maria Cristina Machado M. Motta
  - Laboratório de Ultraestrutura Celular Hertha Meyer, Instituto de Biofísica Carlos Chagas Filho, Universidade Federal Do Rio de Janeiro, IBCCF, CCS, UFRJ, Cidade Universitária, Rio de Janeiro, RJ, Brazil
  - Centro Nacional de Biologia Estrutural E Bioimagem, Rio de Janeiro, RJ, Brazil
- Ariel M. Silber
  - Universidade de São Paulo, São Paulo, SP, Brazil
- M. Carolina Elias
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
- Julia Pinheiro Chagas da Cunha
  - Laboratório de Ciclo Celular, Instituto Butantan, São Paulo, SP, Brazil
  - Centro de Toxinas, Resposta Imune E Sinalização Celular (CeTICS), Instituto Butantan, São Paulo, Brazil
14
Trotta E. GC content strongly influences the role of poly(dA) in the intrinsic nucleosome positioning in Saccharomyces cerevisiae. Yeast 2022; 39:262-271. [PMID: 35348238 PMCID: PMC9541940 DOI: 10.1002/yea.3701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2021] [Revised: 02/21/2022] [Accepted: 02/22/2022] [Indexed: 11/08/2022]
Abstract
The nucleosome is the basic structural element of genomic DNA packaging and plays a role in transcription, replication, and recombination. Poly(dA) tracts are considered major sequence determinants of nucleosome positioning, although their role is not well understood. Here, we show that the homopolymeric character and the low GC content of poly(dA) tracts play different roles in nucleosome formation. We found that the inherent low GC content of poly(dA) alone can account for the deep and anisotropic nucleosome depletion at structurally and functionally important regions of promoters and origins of replication. We also show that the level of nucleosome occupancy at poly(dA) is strongly related to the local nucleotide background, and that the high frequency of poly(dA) in Saccharomyces cerevisiae does not appear to be associated merely with its intrinsic nucleosome-excluding properties. In addition, we show that GC content alone can predict more than 60% of the in vitro nucleosome map, providing further evidence that intrinsic nucleosome positioning is determined more by GC content than by poly(dA) stretches. Our results are consistent with a model in which poly(dA) stretches act at two distinct levels: first, through their low GC content, which intrinsically hinders nucleosome formation, and second, through their contiguous runs of dA, which selectively drive the recruitment of non-histone proteins with structural and functional roles.
Affiliation(s)
- Edoardo Trotta
  - Institute of Translational Pharmacology, Consiglio Nazionale delle Ricerche (CNR), Rome, Italy
15
Broad domains of histone marks in the highly compact Paramecium macronuclear genome. Genome Res 2022; 32:710-725. [PMID: 35264449 PMCID: PMC8997361 DOI: 10.1101/gr.276126.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/20/2021] [Accepted: 03/04/2022] [Indexed: 11/25/2022]
Abstract
The unicellular ciliate Paramecium contains a large vegetative macronucleus with several unusual characteristics, including an extremely high coding density and high polyploidy. As macronuclear chromatin is devoid of heterochromatin, our study characterizes the functional epigenomic organization necessary for gene regulation and proper Pol II activity. Histone marks (H3K4me3, H3K9ac, H3K27me3) reveal no narrow peaks but broad domains along gene bodies, whereas intergenic regions are devoid of nucleosomes. Our data implicate H3K4me3 levels inside ORFs to be the main factor associated with gene expression, and H3K27me3 appears in association with H3K4me3 in plastic genes. Silent and lowly expressed genes show low nucleosome occupancy, suggesting that gene inactivation does not involve increased nucleosome occupancy and chromatin condensation. Because of a high occupancy of Pol II along highly expressed ORFs, transcriptional elongation appears to be quite different from that of other species. This is supported by missing heptameric repeats in the C-terminal domain of Pol II and a divergent elongation system. Our data imply that unoccupied DNA is the default state, whereas gene activation requires nucleosome recruitment together with broad domains of H3K4me3. In summary, gene activation and silencing in Paramecium run counter to the current understanding of chromatin biology.
16
Gnan S, Matelot M, Weiman M, Arnaiz O, Guérin F, Sperling L, Bétermier M, Thermes C, Chen CL, Duharcourt S. GC content, but not nucleosome positioning, directly contributes to intron splicing efficiency in Paramecium. Genome Res 2022; 32:699-709. [PMID: 35264448 PMCID: PMC8997360 DOI: 10.1101/gr.276125.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/20/2021] [Accepted: 02/14/2022] [Indexed: 11/24/2022]
Abstract
Eukaryotic genes are interrupted by introns that must be accurately spliced from mRNA precursors. With an average length of 25 nt, the more than 90,000 introns of Paramecium tetraurelia stand among the shortest introns reported in eukaryotes. The mechanisms specifying the correct recognition of these tiny introns remain poorly understood. Splicing can occur cotranscriptionally, and it has been proposed that chromatin structure might influence splice site recognition. To investigate the roles of nucleosome positioning in intron recognition, we determined the nucleosome occupancy along the P. tetraurelia genome. We show that P. tetraurelia displays a regular nucleosome array with a nucleosome repeat length of ∼151 bp, among the smallest periodicities reported. Our analysis has revealed that introns are frequently associated with inter-nucleosomal DNA, pointing to an evolutionary constraint favoring introns at the AT-rich nucleosome edge sequences. Using accurate splicing efficiency data from cells depleted for nonsense-mediated decay effectors, we show that introns located at the edge of nucleosomes display higher splicing efficiency than those at the center. However, multiple regression analysis indicates that the low GC content of introns, rather than nucleosome positioning, is associated with high splicing efficiency. Our data reveal a complex link between GC content, nucleosome positioning, and intron evolution in Paramecium.
Affiliation(s)
- Stefano Gnan
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, Paris, 75005 France
- Mélody Matelot
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
- Marion Weiman
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Arnaiz
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Frédéric Guérin
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
- Linda Sperling
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Mireille Bétermier
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Claude Thermes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Chun-Long Chen
  - Institut Curie, Université PSL, Sorbonne Université, CNRS UMR3244, Dynamics of Genetic Information, Paris, 75005 France
- Sandra Duharcourt
  - Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
17
Li W, Almirantis Y, Provata A. Revisiting the neutral dynamics derived limiting guanine-cytosine content using human de novo point mutation data. Meta Gene 2022. [DOI: 10.1016/j.mgene.2021.100994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
18
Xu B, Li X, Gao X, Jia Y, Liu J, Li F, Zhang Z. DeNOPA: decoding nucleosome positions sensitively with sparse ATAC-seq data. Brief Bioinform 2021; 23:6454261. [PMID: 34875002 DOI: 10.1093/bib/bbab469] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/12/2021] [Revised: 10/09/2021] [Accepted: 10/13/2021] [Indexed: 12/25/2022]
Abstract
Nucleosomes are the basic building blocks of chromatin, and their dynamics and arrangement fundamentally shape its higher-order architecture, thereby affecting almost all nuclear processes. Thanks to its rather simple protocol, assay for transposase-accessible chromatin using sequencing (ATAC-seq) has been rapidly adopted as a major tool for chromatin accessibility profiling at both bulk and single-cell levels; however, picturing the arrangement of nucleosomes themselves remains a challenge with ATAC-seq. In the present work, we introduce a novel ATAC-seq analysis toolkit, named decoding nucleosome organization profile based on ATAC-seq data (deNOPA), to predict nucleosome positions. Assessments showed that deNOPA outperformed state-of-the-art tools with ultra-sparse ATAC-seq data, e.g. no more than 0.5 fragment per base pair. The remarkable performance of deNOPA was fueled by the short fragment reads, which compose nearly half of sequenced reads in the ATAC-seq libraries and are commonly discarded by state-of-the-art nucleosome positioning tools. However, we found that the short fragment reads enrich information on nucleosome positions and that the linker regions were predicted by reads from both short and long fragments using Gaussian smoothing. Last, using deNOPA, we showed that the dynamics of nucleosome organization may not directly couple with chromatin accessibility in the cis-regulatory regions when human cells respond to heat shock stimulation. deNOPA provides a powerful tool with which to analyze the dynamics of chromatin at the level of nucleosome positions with ultra-sparse ATAC-seq data.
Affiliation(s)
- Bingxiang Xu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
  - School of Kinesiology, Shanghai University of Sport, Shanghai, China
- Xiaoli Li
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
- Xiaomeng Gao
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
- Yan Jia
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- Jing Liu
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, P.R. China
- Feifei Li
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- Zhihua Zhang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
  - School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
  - School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, P.R. China
19
Han S, Lee H, Lee AJ, Kim SK, Jung I, Koh GY, Kim TK, Lee D. CHD4 Conceals Aberrant CTCF-Binding Sites at TAD Interiors by Regulating Chromatin Accessibility in Mouse Embryonic Stem Cells. Mol Cells 2021; 44:805-829. [PMID: 34764232 PMCID: PMC8627837 DOI: 10.14348/molcells.2021.0224] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/31/2021] [Accepted: 09/06/2021] [Indexed: 11/27/2022]
Abstract
CCCTC-binding factor (CTCF) critically contributes to 3D chromatin organization by determining topologically associated domain (TAD) borders. Although CTCF primarily binds at TAD borders, there also exist putative CTCF-binding sites within TADs, which are spread throughout the genome by retrotransposition. However, the detailed mechanism responsible for masking the putative CTCF-binding sites remains largely elusive. Here, we show that the ATP-dependent chromatin remodeler, chromodomain helicase DNA-binding 4 (CHD4), regulates chromatin accessibility to conceal aberrant CTCF-binding sites embedded in H3K9me3-enriched heterochromatic B2 short interspersed nuclear elements (SINEs) in mouse embryonic stem cells (mESCs). Upon CHD4 depletion, these aberrant CTCF-binding sites become accessible and aberrant CTCF recruitment occurs within TADs, resulting in disorganization of local TADs. RNA-binding intrinsically disordered domains (IDRs) of CHD4 are required to prevent this aberrant CTCF binding, and CHD4 is critical for the repression of B2 SINE transcripts. These results collectively reveal that a CHD4-mediated mechanism ensures appropriate CTCF binding and associated TAD organization in mESCs.
Affiliation(s)
- Sungwook Han
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
- Hosuk Lee
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
  - Center for Vascular Research, Institute for Basic Sciences, Daejeon 34141, Korea
- Andrew J. Lee
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
- Seung-Kyoon Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
- Inkyung Jung
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
- Gou Young Koh
  - Center for Vascular Research, Institute for Basic Sciences, Daejeon 34141, Korea
- Tae-Kyung Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, Korea
- Daeyoup Lee
  - Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
20
Yoo J, Park S, Maffeo C, Ha T, Aksimentiev A. DNA sequence and methylation prescribe the inside-out conformational dynamics and bending energetics of DNA minicircles. Nucleic Acids Res 2021; 49:11459-11475. [PMID: 34718725 PMCID: PMC8599915 DOI: 10.1093/nar/gkab967] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/23/2020] [Revised: 09/27/2021] [Accepted: 10/11/2021] [Indexed: 11/13/2022]
Abstract
The eukaryotic genome and methylome encode DNA fragments' propensity to form nucleosome particles. Although the mechanical properties of DNA possibly orchestrate such encoding, the definite link between 'omics' and DNA energetics has remained elusive. Here, we bridge the divide by examining the sequence-dependent energetics of highly bent DNA. Molecular dynamics simulations of 42 intact DNA minicircles reveal that each DNA minicircle undergoes inside-out conformational transitions with the most likely configuration uniquely prescribed by the nucleotide sequence and methylation of DNA. The minicircles' local geometry consists of straight segments connected by sharp bends compressing the DNA's inward-facing major groove. Such an uneven distribution of the bending stress favors minimum free energy configurations that avoid stiff base pair sequences at inward-facing major grooves. Analysis of the minicircles' inside-out free energy landscapes yields a discrete worm-like chain model of bent DNA energetics that accurately accounts for its nucleotide sequence and methylation. Experimentally measuring the dependence of the DNA looping time on the DNA sequence validates the model. When applied to a nucleosome-like DNA configuration, the model quantitatively reproduces yeast and human genomes' nucleosome occupancy. Further analyses of the genome-wide chromatin structure data suggest that DNA bending energetics is a fundamental determinant of genome architecture.
Affiliation(s)
- Jejoong Yoo
  - Department of Physics, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Sangwoo Park
  - Department of Biophysics and Biophysical Chemistry, Johns Hopkins University, Baltimore, MD 21205, USA
- Christopher Maffeo
  - Department of Physics and the Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Taekjip Ha
  - Department of Biophysics and Biophysical Chemistry, Johns Hopkins University, Baltimore, MD 21205, USA
  - Howard Hughes Medical Institute, Baltimore, MD 21218, USA
- Aleksei Aksimentiev
  - Department of Physics and the Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
  - Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
21
Campos TL, Korhonen PK, Hofmann A, Gasser RB, Young ND. Harnessing model organism genomics to underpin the machine learning-based prediction of essential genes in eukaryotes - Biotechnological implications. Biotechnol Adv 2021; 54:107822. [PMID: 34461202 DOI: 10.1016/j.biotechadv.2021.107822] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/09/2021] [Revised: 08/17/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]
Abstract
The availability of high-quality genomes and advances in functional genomics have enabled large-scale studies of essential genes in model eukaryotes, including the 'elegant worm' (Caenorhabditis elegans; Nematoda) and the 'vinegar fly' (Drosophila melanogaster; Arthropoda). However, this is not the case for other, much less-studied organisms, such as socioeconomically important parasites, for which functional genomic platforms usually do not exist. Thus, there is a need to develop innovative techniques or approaches for the prediction, identification and investigation of essential genes. A key approach that could enable the prediction of such genes is machine learning (ML). Here, we undertake an historical review of experimental and computational approaches employed for the characterisation of essential genes in eukaryotes, with a particular focus on model ecdysozoans (C. elegans and D. melanogaster), and discuss the possible applicability of ML-approaches to organisms such as socioeconomically important parasites. We highlight some recent results showing that high-performance ML, combined with feature engineering, allows a reliable prediction of essential genes from extensive, publicly available 'omic data sets, with major potential to prioritise such genes (with statistical confidence) for subsequent functional genomic validation. These findings could 'open the door' to fundamental and applied research areas. Evidence of some commonality in the essential gene-complement between these two organisms indicates that an ML-engineering approach could find broader applicability to ecdysozoans such as parasitic nematodes or arthropods, provided that suitably large and informative data sets become/are available for proper feature engineering, and for the robust training and validation of algorithms. 
This area warrants detailed exploration to, for example, facilitate the identification and characterisation of essential molecules as novel targets for drugs and vaccines against parasitic diseases. This focus is particularly important, given the substantial impact that such diseases have worldwide, and the current challenges associated with their prevention and control and with drug resistance in parasite populations.
Affiliation(s)
- Tulio L Campos
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
  - Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife, Pernambuco, Brazil
- Pasi K Korhonen
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Andreas Hofmann
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Robin B Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Neil D Young
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
22
Abstract
The same gene is often regulated differently in response to stress in even closely related plant species. Directly measuring stress-responsive gene expression can be financially and logistically challenging in nonmodel species. Here, we show that models trained using data on which genes respond to cold in one species can predict which genes will respond to cold in related species, even when the training and target species vary in their degree of tolerance to cold. The prediction models we used require only genomic sequence and gene models. As a result, data from well-studied model species may be used to predict which genes will respond to stress in less-studied species with sequenced genomes. Although genome-sequence assemblies are available for a growing number of plant species, gene-expression responses to stimuli have been cataloged for only a subset of these species. Many genes show altered transcription patterns in response to abiotic stresses. However, orthologous genes in related species often exhibit different responses to a given stress. Accordingly, data on the regulation of gene expression in one species are not reliable predictors of orthologous gene responses in a related species. Here, we trained a supervised classification model to identify genes that transcriptionally respond to cold stress. A model trained with only features calculated directly from genome assemblies exhibited only modest decreases in performance relative to models trained by using genomic, chromatin, and evolution/diversity features. Models trained with data from one species successfully predicted which genes would respond to cold stress in other related species. Cross-species predictions remained accurate when training was performed in cold-sensitive species and predictions were performed in cold-tolerant species and vice versa. 
Models trained with data on gene expression in multiple species provided at least equivalent performance to models trained and tested in a single species and outperformed single-species models in cross-species prediction. These results suggest that classifiers trained on stress data from well-studied species may suffice for predicting gene-expression patterns in related, less-studied species with sequenced genomes.
23
Barbier J, Vaillant C, Volff JN, Brunet FG, Audit B. Coupling between Sequence-Mediated Nucleosome Organization and Genome Evolution. Genes (Basel) 2021; 12:genes12060851. [PMID: 34205881 PMCID: PMC8228248 DOI: 10.3390/genes12060851] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/01/2021] [Revised: 05/27/2021] [Accepted: 05/27/2021] [Indexed: 12/12/2022]
Abstract
The nucleosome is a major modulator of DNA accessibility to other cellular factors. Nucleosome positioning is critically important in regulating cell processes such as transcription, replication, recombination or DNA repair. The DNA sequence influences the position of nucleosomes on genomes, although other factors are also implicated, such as ATP-dependent remodelers or competition of the nucleosome with DNA binding proteins. Different sequence motifs can promote or inhibit nucleosome formation, thus influencing the accessibility of the DNA. Sequence-encoded nucleosome positioning that has functional consequences for cell processes can then be selected or counter-selected during evolution. We review the interplay between sequence evolution and nucleosome positioning evolution. We first focus on the different ways to encode nucleosome positions in the DNA sequence, and to what extent these mechanisms are responsible for genome-wide nucleosome positioning in vivo. Then, we discuss the findings about selection of sequences for their nucleosomal properties. Finally, we illustrate how the nucleosome can directly influence sequence evolution through its interactions with DNA damage and repair mechanisms. This review aims to provide an overview of the mutual influence of sequence evolution and nucleosome positioning evolution, possibly leading to complex evolutionary dynamics.
Affiliation(s)
- Jérémy Barbier
  - Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, F-69364 Lyon, France
  - Laboratoire de Physique, Univ Lyon, ENS de Lyon, CNRS, F-69342 Lyon, France
- Cédric Vaillant
  - Laboratoire de Physique, Univ Lyon, ENS de Lyon, CNRS, F-69342 Lyon, France
- Jean-Nicolas Volff
  - Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, F-69364 Lyon, France
  - Correspondence: (J.-N.V.); (B.A.)
- Frédéric G. Brunet
  - Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, F-69364 Lyon, France
- Benjamin Audit
  - Laboratoire de Physique, Univ Lyon, ENS de Lyon, CNRS, F-69342 Lyon, France
  - Correspondence: (J.-N.V.); (B.A.)
24
Peculiarities of Plasmodium falciparum Gene Regulation and Chromatin Structure. Int J Mol Sci 2021; 22:ijms22105168. [PMID: 34068393 PMCID: PMC8153576 DOI: 10.3390/ijms22105168] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/16/2021] [Revised: 05/10/2021] [Accepted: 05/10/2021] [Indexed: 12/14/2022]
Abstract
The highly complex life cycle of the human malaria parasite, Plasmodium falciparum, is based on an orchestrated and tightly regulated gene expression program. In general, eukaryotic transcription regulation is determined by a combination of sequence-specific transcription factors binding to regulatory DNA elements and the packaging of DNA into chromatin as an additional layer. The accessibility of regulatory DNA elements is controlled by the nucleosome occupancy and changes of their positions by an active process called nucleosome remodeling. These epigenetic mechanisms are poorly explored in P. falciparum. The parasite genome is characterized by an extraordinarily high AT-content and the distinct architecture of functional elements, and chromatin-related proteins also exhibit high sequence divergence compared to other eukaryotes. Together with the distinct biochemical properties of nucleosomes, these features suggest substantial differences in chromatin-dependent regulation. Here, we highlight the peculiarities of epigenetic mechanisms in P. falciparum, addressing chromatin structure and dynamics with respect to their impact on transcriptional control. We focus on the specialized chromatin remodeling enzymes and discuss their essential function in P. falciparum gene regulation.
25
Liu Y, Yang Q, Zhao F. Synonymous but Not Silent: The Codon Usage Code for Gene Expression and Protein Folding. Annu Rev Biochem 2021; 90:375-401. [PMID: 33441035 DOI: 10.1146/annurev-biochem-071320-112701] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Indexed: 12/11/2022]
Abstract
Codon usage bias, the preference for certain synonymous codons, is found in all genomes. Although synonymous mutations were previously thought to be silent, a large body of evidence has demonstrated that codon usage can play major roles in determining gene expression levels and protein structures. Codon usage influences translation elongation speed and regulates translation efficiency and accuracy. Adaptation of codon usage to tRNA expression determines the proteome landscape. In addition, codon usage biases result in nonuniform ribosome decoding rates on mRNAs, which in turn influence the cotranslational protein folding process that is critical for protein function in diverse biological processes. Conserved genome-wide correlations have also been found between codon usage and protein structures. Furthermore, codon usage is a major determinant of mRNA levels through translation-dependent effects on mRNA decay and translation-independent effects on transcriptional and posttranscriptional processes. Here, we discuss the multifaceted roles and mechanisms of codon usage in different gene regulatory processes.
Affiliation(s)
- Yi Liu
  - Department of Physiology, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9040, USA
- Qian Yang
  - Department of Physiology, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9040, USA
- Fangzhou Zhao
  - Department of Physiology, University of Texas Southwestern Medical Center, Dallas, Texas 75390-9040, USA
26
Lyubitelev AV, Kirpichnikov MP, Studitsky VM. The Role of Linker Histones in Carcinogenesis. Russian Journal of Bioorganic Chemistry 2021. [DOI: 10.1134/s1068162021010143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
27
Krützfeldt LM, Schubach M, Kircher M. The impact of different negative training data on regulatory sequence predictions. PLoS One 2020; 15:e0237412. [PMID: 33259518 PMCID: PMC7707526 DOI: 10.1371/journal.pone.0237412] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/24/2020] [Accepted: 11/12/2020] [Indexed: 01/08/2023] Open
Abstract
Regulatory regions, like promoters and enhancers, cover an estimated 5–15% of the human genome. Changes to these sequences are thought to underlie much of human phenotypic variation and a substantial proportion of genetic causes of disease. However, our understanding of their functional encoding in DNA is still very limited. Applying machine or deep learning methods can shed light on this encoding, and gapped k-mer support vector machines (gkm-SVMs) or convolutional neural networks (CNNs) are commonly trained on putative regulatory sequences. Here, we investigate the impact of negative sequence selection on model performance. By training gkm-SVM and CNN models on open chromatin data and corresponding negative training datasets, both learners and two approaches for negative training data are compared. Negative sets use either genomic background sequences or sequence shuffles of the positive sequences. Model performance was evaluated on three different tasks: predicting elements active in a cell-type, predicting cell-type specific elements, and predicting elements' relative activity as measured from independent experimental data. Our results indicate strong effects of the negative training data, with genomic backgrounds showing overall best results. Specifically, models trained on highly shuffled sequences perform worse on the complex tasks of tissue-specific activity and quantitative activity prediction, and seem to learn features of artificial sequences rather than regulatory activity. Further, we observe that insufficient matching of genomic background sequences results in model biases. While CNNs achieved and exceeded the performance of gkm-SVMs for larger training datasets, gkm-SVMs gave robust and best results for typical training dataset sizes without the need for hyperparameter optimization.
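The two negative-set strategies named in this abstract (sequence shuffles of the positives versus genomic background sequences) can be illustrated with a minimal sketch. It assumes the simplest possible variants: a mononucleotide shuffle that preserves base composition, and background selection by closest GC fraction. The function names and toy sequences are hypothetical, and the procedures actually used in the paper may differ.

```python
import random

def gc_fraction(seq):
    """Fraction of G or C bases in seq (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)

def shuffle_negative(seq, rng):
    """Return a shuffled copy of seq; base composition is preserved."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)

def background_negative(positive, candidates):
    """Pick the candidate window whose GC fraction is closest to the positive's."""
    target = gc_fraction(positive)
    return min(candidates, key=lambda c: abs(gc_fraction(c) - target))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
positive = "GCGCGATATAGCGC"  # toy positive sequence (hypothetical)
background = [               # toy genomic windows (hypothetical)
    "ATATATATATATAT",
    "GCGCATATGCGCAT",
    "GGGGCCCCGGGGCC",
]

shuffled = shuffle_negative(positive, rng)
matched = background_negative(positive, background)
```

A shuffled negative keeps the positive's composition but destroys its motif order, whereas a matched background window is a real sequence with similar GC fraction; the abstract reports that poorly matched backgrounds bias the models.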
Affiliation(s)
- Louisa-Marie Krützfeldt
  - Charité–Universitätsmedizin Berlin, Berlin, Germany
  - Berlin Institute of Health (BIH), Berlin, Germany
- Max Schubach
  - Charité–Universitätsmedizin Berlin, Berlin, Germany
  - Berlin Institute of Health (BIH), Berlin, Germany
- Martin Kircher
  - Charité–Universitätsmedizin Berlin, Berlin, Germany
  - Berlin Institute of Health (BIH), Berlin, Germany
28
Lee W, Kim J, Yun JM, Ohn T, Gong Q. MeCP2 regulates gene expression through recognition of H3K27me3. Nat Commun 2020; 11:3140. [PMID: 32561780 PMCID: PMC7305159 DOI: 10.1038/s41467-020-16907-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 01/28/2019] [Accepted: 05/27/2020] [Indexed: 02/08/2023] Open
Abstract
MeCP2 plays a multifaceted role in gene expression regulation and chromatin organization. Interaction between MeCP2 and methylated DNA in the regulation of gene expression is well established. However, the widespread distribution of MeCP2 suggests it has additional interactions with chromatin. Here we demonstrate, by both biochemical and genomic analyses, that MeCP2 directly interacts with nucleosomes and its genomic distribution correlates with that of H3K27me3. In particular, the methyl-CpG-binding domain of MeCP2 shows preferential interactions with H3K27me3. We further observe that the impact of MeCP2 on transcriptional changes correlates with histone post-translational modification patterns. Our findings indicate that MeCP2 interacts with genomic loci via binding to DNA as well as histones, and that interaction between MeCP2 and histone proteins plays a key role in gene expression regulation.
Affiliation(s)
- Wooje Lee
  - Department of Cellular & Molecular Medicine, College of Medicine, Chosun University, Gwangju, 61452, South Korea
- Jeeho Kim
  - Department of Cellular & Molecular Medicine, College of Medicine, Chosun University, Gwangju, 61452, South Korea
- Jung-Mi Yun
  - Department of Food and Nutrition, Chonnam National University, Gwangju, 61186, South Korea
- Takbum Ohn
  - Department of Cellular & Molecular Medicine, College of Medicine, Chosun University, Gwangju, 61452, South Korea
- Qizhi Gong
  - Department of Cell Biology and Human Anatomy, University of California at Davis, School of Medicine, Davis, CA, 95616, USA
29
Neipel J, Brandani G, Schiessel H. Translational nucleosome positioning: A computational study. Phys Rev E 2020; 101:022405. [PMID: 32168683 DOI: 10.1103/physreve.101.022405] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/19/2019] [Accepted: 11/25/2019] [Indexed: 01/26/2023]
Abstract
About three-quarters of eukaryotic DNA is wrapped into nucleosomes: DNA spools with a protein core. The affinity of a given DNA stretch to be incorporated into a nucleosome is known to depend on the base-pair sequence-dependent geometry and elasticity of the DNA double helix. This causes the rotational and translational positioning of nucleosomes. In this study we ask whether the latter can be predicted by a simple coarse-grained DNA model with sequence-dependent elasticity, the rigid base-pair model. Whereas this model is known to be rather robust in predicting rotational nucleosome positioning, we show that the translational positioning is a rather subtle effect that is dominated by the guanine-cytosine content dependence of entropy rather than energy. A correct qualitative prediction within the rigid base-pair framework can only be achieved by assuming that DNA elasticity effectively changes on complexation into the nucleosome complex. With that extra assumption we arrive at a model which gives excellent quantitative agreement with experimental in vitro nucleosome maps, under the additional assumption that nucleosomes equilibrate their positions only locally.
Affiliation(s)
- J Neipel
  - Max Planck Institute for the Physics of Complex Systems, 01187 Dresden, Germany; Faculty of Physics, Ludwig-Maximilians-Universität München, 80333 München, Germany; Instituut-Lorentz, Universiteit Leiden, Postbus 9506, 2300 RA Leiden, The Netherlands
- G Brandani
  - Department of Biophysics, Graduate School of Science, Kyoto University, Kyoto 606-8502, Japan
- H Schiessel
  - Instituut-Lorentz, Universiteit Leiden, Postbus 9506, 2300 RA Leiden, The Netherlands
30
Schnepf M, Ludwig C, Bandilla P, Ceolin S, Unnerstall U, Jung C, Gaul U. Sensitive Automated Measurement of Histone-DNA Affinities in Nucleosomes. iScience 2020; 23:100824. [PMID: 31982782 PMCID: PMC6994541 DOI: 10.1016/j.isci.2020.100824] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/03/2019] [Revised: 12/12/2019] [Accepted: 01/06/2020] [Indexed: 11/06/2022] Open
Abstract
The DNA of eukaryotes is wrapped around histone octamers to form nucleosomes. Although it is well established that the DNA sequence significantly influences nucleosome formation, its precise contribution has remained controversial, partially owing to the lack of quantitative affinity data. Here, we present a method to measure DNA-histone binding free energies at medium throughput and with high sensitivity. Competitive nucleosome formation is achieved through automation, and a modified epifluorescence microscope is used to rapidly and accurately measure the fractions of bound/unbound DNA based on fluorescence anisotropy. The procedure allows us to obtain full titration curves with high reproducibility. We applied this technique to measure the histone-DNA affinities for 47 DNA sequences and analyzed how the affinities correlate with relevant DNA sequence features. We found that the GC content has a significant impact on nucleosome-forming preferences, but 10 bp dinucleotide periodicities and the presence of poly(dA:dT) stretches do not.
Highlights
- Robotics permits full titration series to measure histone-DNA binding affinities
- Fluorescence anisotropy used as a fast, sensitive readout of bound/unbound DNA
- Free energies span three orders of magnitude, less for naturally occurring sequences
- GC content is a major determinant of measured binding free energies
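The readout described in this abstract, a bound/unbound fraction derived from fluorescence anisotropy and then related to GC content, can be sketched as follows. This is only an illustration under stated assumptions: a two-state mixture of free and bound labeled DNA, no change in fluorescence intensity on binding (so anisotropy is a linear mix of the two states), and hypothetical function names and example values. It is not the authors' analysis pipeline.

```python
def fraction_bound(r_obs, r_free, r_bound):
    """Estimate the bound fraction of labeled DNA from anisotropy values.

    Assumes a two-state mixture and equal brightness of free and bound DNA,
    so the observed anisotropy is a linear combination of the two states.
    """
    if r_bound == r_free:
        raise ValueError("r_bound and r_free must differ")
    return (r_obs - r_free) / (r_bound - r_free)

def gc_content(seq):
    """Fraction of G or C bases in a DNA sequence (0.0 if empty)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)

# Hypothetical anisotropy readings: free DNA, fully bound DNA, one sample.
f_bound = fraction_bound(r_obs=0.15, r_free=0.10, r_bound=0.20)
gc = gc_content("GGCCAT")  # toy sequence, not one of the 47 measured
```

Repeating the bound-fraction estimate across a titration series gives the curve from which an affinity could be fitted; the GC fraction is the sequence feature the abstract identifies as a major determinant.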
Affiliation(s)
- Max Schnepf
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Claudia Ludwig
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Peter Bandilla
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Stefano Ceolin
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Ulrich Unnerstall
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Christophe Jung
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Ulrike Gaul
  - Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
31
Zenil H, Minary P. Training-free measures based on algorithmic probability identify high nucleosome occupancy in DNA sequences. Nucleic Acids Res 2019; 47:e129. [PMID: 31511887 PMCID: PMC6846163 DOI: 10.1093/nar/gkz750] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/10/2019] [Revised: 07/10/2019] [Accepted: 08/27/2019] [Indexed: 01/01/2023] Open
Abstract
We introduce and study a set of training-free methods of an information-theoretic and algorithmic complexity nature that we apply to DNA sequences to assess their potential to identify nucleosomal binding sites. We test the measures on well-studied genomic sequences of different sizes drawn from different sources. The measures reveal the known in vivo versus in vitro predictive discrepancies and uncover their potential to pinpoint high and low nucleosome occupancy. We explore different possible signals within and beyond the nucleosome length and find that the complexity indices are informative of nucleosome occupancy. We found that, while it is clear that the gold standard Kaplan model is driven by GC content (by design) and by k-mer training, for high occupancy, entropy and complexity-based scores are also informative and can complement the Kaplan model.
Affiliation(s)
- Hector Zenil
  - Oxford Immune Algorithmics, Oxford University Innovation, Oxford, UK
  - Algorithmic Dynamics Lab, Unit of Computational Medicine, SciLifeLab, Center for Molecular Medicine, Karolinska Institute, Stockholm, Sweden
  - Algorithmic Nature Group, LABORES for the Natural and Digital Sciences, Paris, France
  - Department of Computer Science, University of Oxford, Oxford, UK
- Peter Minary
  - Department of Computer Science, University of Oxford, Oxford, UK
32
Hocher A, Rojec M, Swadling JB, Esin A, Warnecke T. The DNA-binding protein HTa from Thermoplasma acidophilum is an archaeal histone analog. eLife 2019; 8:52542. [PMID: 31710291 PMCID: PMC6877293 DOI: 10.7554/elife.52542] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 10/07/2019] [Accepted: 11/10/2019] [Indexed: 02/06/2023] Open
Abstract
Histones are a principal constituent of chromatin in eukaryotes and fundamental to our understanding of eukaryotic gene regulation. In archaea, histones are widespread but not universal: several lineages have lost histone genes. What prompted or facilitated these losses and how archaea without histones organize their chromatin remains largely unknown. Here, we elucidate primary chromatin architecture in an archaeon without histones, Thermoplasma acidophilum, which harbors a HU family protein (HTa) that protects part of the genome from micrococcal nuclease digestion. Charting HTa-based chromatin architecture in vitro, in vivo and in an HTa-expressing E. coli strain, we present evidence that HTa is an archaeal histone analog. HTa preferentially binds to GC-rich sequences, exhibits invariant positioning throughout the growth cycle, and shows archaeal histone-like oligomerization behavior. Our results suggest that HTa, a DNA-binding protein of bacterial origin, has converged onto an architectural role filled by histones in other archaea.
Affiliation(s)
- Antoine Hocher
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom; Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
- Maria Rojec
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom; Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
- Jacob B Swadling
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom; Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
- Alexander Esin
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom; Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
- Tobias Warnecke
  - MRC London Institute of Medical Sciences (LMS), London, United Kingdom; Institute of Clinical Sciences (ICS), Faculty of Medicine, Imperial College, London, United Kingdom
33
Chereji RV, Bryson TD, Henikoff S. Quantitative MNase-seq accurately maps nucleosome occupancy levels. Genome Biol 2019; 20:198. [PMID: 31519205 PMCID: PMC6743174 DOI: 10.1186/s13059-019-1815-z] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Received: 03/23/2019] [Accepted: 09/04/2019] [Indexed: 12/15/2022] Open
Abstract
Micrococcal nuclease (MNase) is widely used to map nucleosomes. However, its aggressive endo-/exo-nuclease activities make MNase-seq unreliable for determining nucleosome occupancies, because cleavages within linker regions produce oligo- and mono-nucleosomes, whereas cleavages within nucleosomes destroy them. Here, we introduce a theoretical framework for predicting nucleosome occupancies and an experimental protocol with appropriate spike-in normalization that confirms our theory and provides accurate occupancy levels over an MNase digestion time course. As with human cells, we observe no overall differences in nucleosome occupancies between Drosophila euchromatin and heterochromatin, which implies that heterochromatic compaction does not reduce MNase accessibility of linker DNA.
Affiliation(s)
- Răzvan V Chereji
  - Division of Developmental Biology, Eunice Kennedy Shriver National Institute for Child Health and Human Development, National Institutes of Health, Bethesda, MD, 20892, USA
- Terri D Bryson
  - Howard Hughes Medical Institute and Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Steven Henikoff
  - Howard Hughes Medical Institute and Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
34
Chen FX, Smith ER, Shilatifard A. Born to run: control of transcription elongation by RNA polymerase II. Nat Rev Mol Cell Biol 2018; 19:464-478. [PMID: 29740129 DOI: 10.1038/s41580-018-0010-5] [Citation(s) in RCA: 262] [Impact Index Per Article: 52.4] [Indexed: 01/09/2023]
Abstract
The dynamic regulation of transcription elongation by RNA polymerase II (Pol II) is an integral part of the implementation of gene expression programmes during development. In most metazoans, the majority of transcribed genes exhibit transient pausing of Pol II at promoter-proximal regions, and the release of Pol II into gene bodies is controlled by many regulatory factors that respond to environmental and developmental cues. Misregulation of the elongation stage of transcription is implicated in cancer and other human diseases, suggesting that mechanistic understanding of transcription elongation control is therapeutically relevant. In this Review, we discuss the features, establishment and maintenance of Pol II pausing, the transition into productive elongation, the control of transcription elongation by enhancers and by factors of other cellular processes, such as topoisomerases and poly(ADP-ribose) polymerases (PARPs), and the potential of therapeutic targeting of the elongation stage of transcription by Pol II.
Affiliation(s)
- Fei Xavier Chen
  - Simpson Querrey Center for Epigenetics and the Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Edwin R Smith
  - Simpson Querrey Center for Epigenetics and the Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Ali Shilatifard
  - Simpson Querrey Center for Epigenetics and the Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
35
Glaich O, Leader Y, Lev Maor G, Ast G. Histone H1.5 binds over splice sites in chromatin and regulates alternative splicing. Nucleic Acids Res 2019; 47:6145-6159. [PMID: 31076740 PMCID: PMC6614845 DOI: 10.1093/nar/gkz338] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/01/2018] [Revised: 04/17/2019] [Accepted: 04/27/2019] [Indexed: 12/11/2022] Open
Abstract
Chromatin organization and epigenetic markers influence splicing, though the magnitudes of these effects and the mechanisms are largely unknown. Here, we demonstrate that linker histone H1.5 influences mRNA splicing. We observed that linker histone H1.5 binds DNA over splice sites of short exons in human lung fibroblasts (IMR90 cells). We found that association of H1.5 with these splice sites correlated with the level of inclusion of alternatively spliced exons. Exons marked by H1.5 had more RNA polymerase II (RNAP II) stalling near the 3' splice site than did exons not associated with H1.5. In cells depleted of H1.5, we showed that the inclusion of five exons evaluated decreased and that RNAP II levels over these exons were also reduced. Our findings indicate that H1.5 is involved in regulation of splice site selection and alternative splicing, a function not previously demonstrated for linker histones.
Affiliation(s)
- Ohad Glaich
  - Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Yodfat Leader
  - Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Galit Lev Maor
  - Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Gil Ast
  - Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
36
Cakiroglu A, Clapier CR, Ehrensberger AH, Darbo E, Cairns BR, Luscombe NM, Svejstrup JQ. Genome-wide reconstitution of chromatin transactions reveals that RSC preferentially disrupts H2AZ-containing nucleosomes. Genome Res 2019; 29:988-998. [PMID: 31097474 PMCID: PMC6581049 DOI: 10.1101/gr.243139.118] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 08/17/2018] [Accepted: 05/08/2019] [Indexed: 12/03/2022]
Abstract
Chromatin transactions are typically studied in vivo, or in vitro using artificial chromatin lacking the epigenetic complexity of the natural material. Attempting to bridge the gap between these approaches, we established a system for isolating the yeast genome as a library of mononucleosomes harboring the natural epigenetic signature, suitable for biochemical manipulation. Combined with deep sequencing, this library was used to investigate the stability of individual nucleosomes and, as proof of principle, the nucleosome preference of the chromatin remodeling complex, RSC. This approach uncovered a distinct preference of RSC for nucleosomes derived from regions with a high density of histone variant H2AZ, and this preference is indeed markedly diminished using nucleosomes from cells lacking H2AZ. The preference for H2AZ remodeling/nucleosome ejection can also be reconstituted with recombinant nucleosome arrays. Together, our data indicate that, despite being separated from their genomic context, individual nucleosomes can retain their original identity as promoter- or transcription start site (TSS)-nucleosomes. Besides shedding new light on substrate preference of the chromatin remodeler RSC, the simple experimental system outlined here should be generally applicable to the study of chromatin transactions.
Affiliation(s)
- Aylin Cakiroglu
  - Bioinformatics and Computational Biology Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
- Cedric R Clapier
  - Department of Oncological Sciences, Huntsman Cancer Institute, and Howard Hughes Medical Institute, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- Andreas H Ehrensberger
  - Bioinformatics and Computational Biology Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
- Elodie Darbo
  - Bioinformatics and Computational Biology Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
- Bradley R Cairns
  - Department of Oncological Sciences, Huntsman Cancer Institute, and Howard Hughes Medical Institute, University of Utah School of Medicine, Salt Lake City, Utah 84112, USA
- Nicholas M Luscombe
  - Bioinformatics and Computational Biology Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
  - UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, United Kingdom
- Jesper Q Svejstrup
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, London NW1 1AT, United Kingdom
37
Veil M, Yampolsky LY, Grüning B, Onichtchouk D. Pou5f3, SoxB1, and Nanog remodel chromatin on high nucleosome affinity regions at zygotic genome activation. Genome Res 2019; 29:383-395. [PMID: 30674556 PMCID: PMC6396415 DOI: 10.1101/gr.240572.118] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 06/11/2018] [Accepted: 01/16/2019] [Indexed: 12/16/2022]
Abstract
The zebrafish embryo is transcriptionally mostly quiescent during the first 10 cell cycles, until the main wave of zygotic genome activation (ZGA) occurs, accompanied by fast chromatin remodeling. At ZGA, homologs of the mammalian stem cell transcription factors (TFs) Pou5f3, Nanog, and Sox19b bind to thousands of developmental enhancers to initiate transcription. So far, how these TFs influence chromatin dynamics at ZGA has remained unresolved. To address this question, we analyzed nucleosome positions in wild-type and maternal-zygotic (MZ) mutants for pou5f3 and nanog by MNase-seq. We show that Nanog, Sox19b, and Pou5f3 bind to the high nucleosome affinity regions (HNARs). HNARs span over 600 bp and feature high in vivo and predicted in vitro nucleosome occupancy and a high predicted propeller twist DNA shape value. We suggest a two-step nucleosome destabilization-depletion model, in which the same intrinsic DNA properties of HNARs promote both high nucleosome occupancy and differential binding of TFs. In the first step, already before ZGA, Pou5f3 and Nanog destabilize nucleosomes at HNAR centers genome-wide. In the second step, post-ZGA, Nanog, Pou5f3, and SoxB1 maintain an open chromatin state on a subset of HNARs, acting synergistically. Nanog binds to the HNAR center, whereas Pou5f3 stabilizes the flanks. The HNAR model will provide a useful tool for genome regulatory studies in a variety of biological systems.
Affiliation(s)
- Marina Veil
  - Department of Developmental Biology, Institute of Biology I, Faculty of Biology, Albert Ludwigs University of Freiburg, 79104, Freiburg, Germany
- Lev Y Yampolsky
  - Department of Biological Sciences, East Tennessee State University, Johnson City, Tennessee 37614-1710, USA; Zoological Institute, Basel University, Basel, CH-4051 Switzerland
- Björn Grüning
  - Department of Computer Science, Albert Ludwigs University of Freiburg, 79110, Freiburg, Germany; Center for Biological Systems Analysis (ZBSA), University of Freiburg, 79104, Freiburg, Germany
- Daria Onichtchouk
  - Department of Developmental Biology, Institute of Biology I, Faculty of Biology, Albert Ludwigs University of Freiburg, 79104, Freiburg, Germany; Signalling Research centers BIOSS and CIBSS, 79104, Freiburg, Germany; Institute of Developmental Biology RAS, 119991 Moscow, Russia
38
Winogradoff D, Aksimentiev A. Molecular Mechanism of Spontaneous Nucleosome Unraveling. J Mol Biol 2019; 431:323-335. [PMID: 30468737 PMCID: PMC6331254 DOI: 10.1016/j.jmb.2018.11.013] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 09/15/2018] [Revised: 10/26/2018] [Accepted: 11/12/2018] [Indexed: 11/18/2022]
Abstract
Meters of DNA wrap around histone proteins to form nucleosomes and fit inside the micron-diameter nucleus. For the genetic information encoded in the DNA to become available for transcription, replication, and repair, the DNA-histone assembly must be disrupted. Experiments have indicated that the outer stretches of nucleosomal DNA "breathe" by spontaneously detaching from and reattaching to the histone core. Here, we report direct observation of spontaneous DNA breathing in atomistic molecular dynamics simulations, detailing a microscopic mechanism of the DNA breathing process. According to our simulations, the outer stretches of nucleosomal DNA detach in discrete steps involving 5 or 10 base pairs, with the detachment process being orchestrated by the motion of several conserved histone residues. The inner stretches of nucleosomal DNA are found to be more stably associated with the histone core by more abundant nonspecific DNA-protein contacts, providing a microscopic interpretation of nucleosome unraveling experiments. The CG content of nucleosomal DNA is found to anticorrelate with the extent of unwrapping, supporting the possibility that AT-rich segments may signal the start of transcription by forming less stable nucleosomes.
Affiliation(s)
- David Winogradoff
  - Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Aleksei Aksimentiev
  - Center for the Physics of Living Cells, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA; Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
39
Ohno M, Ando T, Priest DG, Kumar V, Yoshida Y, Taniguchi Y. Sub-nucleosomal Genome Structure Reveals Distinct Nucleosome Folding Motifs. Cell 2019; 176:520-534.e25. [DOI: 10.1016/j.cell.2018.12.014] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 12/26/2017] [Revised: 10/16/2018] [Accepted: 12/09/2018] [Indexed: 12/11/2022]
40
Malkowska M, Zubek J, Plewczynski D, Wyrwicz LS. ShapeGTB: the role of local DNA shape in prioritization of functional variants in human promoters with machine learning. PeerJ 2018; 6:e5742. [PMID: 30519505 PMCID: PMC6275119 DOI: 10.7717/peerj.5742] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/10/2017] [Accepted: 09/13/2018] [Indexed: 02/01/2023] Open
Abstract
Motivation: The identification of functional sequence variations in regulatory DNA regions is one of the major challenges of modern genetics. Here, we report results of a combined multifactor analysis of properties characterizing functional sequence variants located in promoter regions of genes. Results: We demonstrate that GC-content of the local sequence fragments and local DNA shape features play a significant role in prioritization of functional variants and outscore features related to histone modifications, transcription factor binding sites, or evolutionary conservation descriptors. Those observations allowed us to build a specialized machine learning classifier, ShapeGTB, that identifies functional single nucleotide polymorphisms within promoter regions. We compared our method with more general tools predicting pathogenicity of all non-coding variants. ShapeGTB outperformed them by a wide margin (average precision 0.93 vs. 0.47–0.55). On the external validation set based on the ClinVar database it displayed worse performance but was still competitive with other methods (average precision 0.47 vs. 0.23–0.42). Such results suggest unique characteristics of mutations located within promoter regions and are a promising signal for the development of more accurate variant prioritization tools in the future.
Affiliation(s)
- Maja Malkowska
- Laboratory of Bioinformatics and Biostatistics, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Warsaw, Poland
- Julian Zubek
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Dariusz Plewczynski
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Lucjan S Wyrwicz
- Laboratory of Bioinformatics and Biostatistics, Maria Sklodowska-Curie Memorial Cancer Centre and Institute of Oncology, Warsaw, Poland
|
41
|
Doris SM, Chuang J, Viktorovskaya O, Murawska M, Spatt D, Churchman LS, Winston F. Spt6 Is Required for the Fidelity of Promoter Selection. Mol Cell 2018; 72:687-699.e6. [PMID: 30318445 DOI: 10.1016/j.molcel.2018.09.005] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 06/14/2018] [Revised: 08/20/2018] [Accepted: 08/31/2018] [Indexed: 01/06/2023]
Abstract
Spt6 is a conserved factor that controls transcription and chromatin structure across the genome. Although Spt6 is viewed as an elongation factor, spt6 mutations in Saccharomyces cerevisiae allow elevated levels of transcripts from within coding regions, suggesting that Spt6 also controls initiation. To address the requirements for Spt6 in transcription and chromatin structure, we have combined four genome-wide approaches. Our results demonstrate that Spt6 represses transcription initiation at thousands of intragenic promoters. We characterize these intragenic promoters and find sequence features conserved with genic promoters. Finally, we show that Spt6 also regulates transcription initiation at most genic promoters and propose a model of initiation site competition to account for this. Together, our results demonstrate that Spt6 controls the fidelity of transcription initiation throughout the genome.
Affiliation(s)
- Stephen M Doris
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- James Chuang
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Dan Spatt
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
- Fred Winston
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
|
42
|
Abstract
Nucleosomes form the fundamental building blocks of eukaryotic chromatin, and previous attempts to understand the principles governing their genome-wide distribution have spurred much interest and debate in biology. In particular, the precise role of DNA sequence in shaping local chromatin structure has been controversial. This paper rigorously quantifies the contribution of hitherto-debated sequence features-including G+C content, 10.5 bp periodicity, and poly(dA:dT) tracts-to three distinct aspects of genome-wide nucleosome landscape: occupancy, translational positioning and rotational positioning. Our computational framework simultaneously learns nucleosome number and nucleosome-positioning energy from genome-wide nucleosome maps. In contrast to other previous studies, our model can predict both in vitro and in vivo nucleosome maps in Saccharomyces cerevisiae. We find that although G+C content is the primary determinant of MNase-derived nucleosome occupancy, MNase digestion biases may substantially influence this GC dependence. By contrast, poly(dA:dT) tracts are seen to deter nucleosome formation, regardless of the experimental method used. We further show that the 10.5 bp nucleotide periodicity facilitates rotational but not translational positioning. Applying our method to in vivo nucleosome maps demonstrates that, for a subset of genes, the regularly-spaced nucleosome arrays observed around transcription start sites can be partially recapitulated by DNA sequence alone. Finally, in vivo nucleosome occupancy derived from MNase-seq experiments around transcription termination sites can be mostly explained by the genomic sequence. Implications of these results and potential extensions of the proposed computational framework are discussed.
Affiliation(s)
- Hu Jin
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Alex I. Finnegan
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Jun S. Song
- Department of Physics, University of Illinois, Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, IL 61801
|
43
|
Lecellier CH, Wasserman WW, Mathelier A. Human Enhancers Harboring Specific Sequence Composition, Activity, and Genome Organization Are Linked to the Immune Response. Genetics 2018; 209:1055-1071. [PMID: 29871881 PMCID: PMC6063234 DOI: 10.1534/genetics.118.301116] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 12/18/2018] [Accepted: 06/01/2018] [Indexed: 12/15/2022]
Abstract
The FANTOM5 consortium recently characterized 65,423 human enhancers from 1829 cell and tissue samples using the Cap Analysis of Gene Expression technology. We showed that the guanine and cytosine content at enhancer regions distinguishes two classes of enhancers harboring distinct DNA structural properties at flanking regions. A functional analysis of their predicted gene targets highlighted one class of enhancers as significantly enriched for associations with immune response genes. Moreover, these enhancers were specifically enriched for regulatory motifs recognized by transcription factors involved in immune response. We observed that enhancers enriched for links to immune response genes were more cell-type specific, preferentially activated upon bacterial infection, and with specific response activity. Looking at chromatin capture data, we found that the two classes of enhancers were lying in distinct topologically associating domains and chromatin loops. Our results suggest that specific nucleotide compositions encode for classes of enhancers that are functionally distinct and specifically organized in the human genome.
Affiliation(s)
- Charles-Henri Lecellier
- Institut de Génétique Moléculaire de Montpellier, University of Montpellier, Centre National de la Recherche Scientifique (CNRS), 34293 Montpellier cedex5, France
- Institut de Biologie Computationnelle, 34095 Montpellier, France
- Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, Canada
- Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, Canada
- Anthony Mathelier
- Centre for Molecular Medicine and Therapeutics at the Child and Family Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, V5Z 4H4, Canada
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, Faculty of Medicine, University of Oslo, 0349 Oslo, Norway
- Department of Cancer Genetics, Institute for Cancer Research, Oslo University Hospital Radiumhospitalet, 0372 Oslo, Norway
|
44
|
Cheng X, Hou Y, Nie Y, Zhang Y, Huang H, Liu H, Sun X. Nucleosome Positioning of Intronless Genes in the Human Genome. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1111-1121. [PMID: 26415210 DOI: 10.1109/tcbb.2015.2476811] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/05/2023]
Abstract
Nucleosomes, the basic units of chromatin, are involved in transcription regulation and DNA replication. Intronless genes, which constitute 3 percent of the human genome, differ from intron-containing genes in evolution and function. Our analysis reveals that nucleosome positioning shows a distinct pattern in intronless and intron-containing genes. The nucleosome occupancy upstream of transcription start sites of intronless genes is lower than that of intron-containing genes. In contrast, high occupancy and well positioned nucleosomes are observed along the gene body of intronless genes, which is perfectly consistent with the barrier nucleosome model. Intronless genes have a significantly lower expression level than intron-containing genes and most of them are not expressed in CD4+ T cell lines and GM12878 cell lines, which results from their tissue specificity. However, the highly expressed genes are at the same expression level between the two types of genes. The highly expressed intronless genes require a higher density of RNA Pol II in an elongating state to compensate for the lack of introns. Additionally, 5' and 3' nucleosome depleted regions of highly expressed intronless genes are deeper than those of highly expressed intron-containing genes.
|
45
|
Brunet FG, Audit B, Drillon G, Argoul F, Volff JN, Arneodo A. Evidence for DNA Sequence Encoding of an Accessible Nucleosomal Array across Vertebrates. Biophys J 2018; 114:2308-2316. [PMID: 29580552 PMCID: PMC6028776 DOI: 10.1016/j.bpj.2018.02.025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/18/2017] [Revised: 02/07/2018] [Accepted: 02/20/2018] [Indexed: 12/15/2022]
Abstract
Nucleosome-depleted regions around which nucleosomes order following the "statistical" positioning scenario were recently shown to be encoded in the DNA sequence in human. This intrinsic nucleosomal ordering strongly correlates with oscillations in the local GC content as well as with the interspecies and intraspecies mutation profiles, revealing the existence of both positive and negative selection. In this letter, we show that these predicted nucleosome inhibitory energy barriers (NIEBs) with compacted neighboring nucleosomes are indeed ubiquitous to all vertebrates tested. These 1 kb-sized chromatin patterns are widely distributed along vertebrate chromosomes, overall covering more than a third of the genome. We have previously observed in human deviations from neutral evolution at these genome-wide distributed regions, which we interpreted as a possible indication of the selection of an open, accessible, and dynamic nucleosomal array to constitutively facilitate the epigenetic regulation of nuclear functions in a cell-type-specific manner. As a first, very appealing observation supporting this hypothesis, we report evidence of a strong association between NIEB borders and the poly(A) tails of Alu sequences in human. These results suggest that NIEBs provide adequate chromatin patterns favorable to the integration of Alu retrotransposons and, more generally to various transposable elements in the genomes of primates and other vertebrates.
Affiliation(s)
- Frédéric G Brunet
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
- Benjamin Audit
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
- Guénola Drillon
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France
- Françoise Argoul
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
- Jean-Nicolas Volff
- Institut de Génomique Fonctionnelle de Lyon, Univ Lyon, CNRS UMR 5242, Ecole Normale Supérieure de Lyon, Univ Claude Bernard Lyon 1, Lyon, France
- Alain Arneodo
- Univ Lyon, ENS de Lyon, Univ Claude Bernard Lyon 1, CNRS Laboratoire de Physique, Lyon, France; LOMA, Université de Bordeaux, CNRS UMR 5798, Talence, France
|
46
|
Viral proteins as a potential driver of histone depletion in dinoflagellates. Nat Commun 2018; 9:1535. [PMID: 29670105 PMCID: PMC5906630 DOI: 10.1038/s41467-018-03993-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 09/19/2017] [Accepted: 03/26/2018] [Indexed: 12/22/2022]
Abstract
Within canonical eukaryotic nuclei, DNA is packaged with highly conserved histone proteins into nucleosomes, which facilitate DNA condensation and contribute to genomic regulation. Yet the dinoflagellates, a group of unicellular algae, are a striking exception to this otherwise universal feature as they have largely abandoned histones and acquired apparently viral-derived substitutes termed DVNPs (dinoflagellate-viral-nucleoproteins). Despite the magnitude of this transition, its evolutionary drivers remain unknown. Here, using Saccharomyces cerevisiae as a model, we show that DVNP impairs growth and antagonizes chromatin by localizing to histone binding sites, displacing nucleosomes, and impairing transcription. Furthermore, DVNP toxicity can be relieved through histone depletion and cells diminish their histones in response to DVNP expression suggesting that histone reduction could have been an adaptive response to these viral proteins. These findings provide insights into eukaryotic chromatin evolution and highlight the potential for horizontal gene transfer to drive the divergence of cellular systems.
|
47
|
Chen S, Li K, Cao W, Wang J, Zhao T, Huan Q, Yang YF, Wu S, Qian W. Codon-Resolution Analysis Reveals a Direct and Context-Dependent Impact of Individual Synonymous Mutations on mRNA Level. Mol Biol Evol 2018; 34:2944-2958. [PMID: 28961875 PMCID: PMC5850819 DOI: 10.1093/molbev/msx229] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 12/16/2022]
Abstract
Codon usage bias (CUB) refers to the observation that synonymous codons are not used equally frequently in a genome. CUB is stronger in more highly expressed genes, a phenomenon commonly explained by stronger natural selection on translational accuracy and/or efficiency among these genes. Nevertheless, this phenomenon could also occur if CUB regulates gene expression at the mRNA level, a hypothesis that has not been tested until recently. Here, we attempt to quantify the impact of synonymous mutations on mRNA level in yeast using 3,556 synonymous variants of a heterologous gene encoding green fluorescent protein (GFP) and 523 synonymous variants of an endogenous gene TDH3. We found that mRNA level was positively correlated with CUB among these synonymous variants, demonstrating a direct role of CUB in regulating transcript concentration, likely via regulating mRNA degradation rate, as our additional experiments suggested. More importantly, we quantified the effects of individual synonymous mutations on mRNA level and found them dependent on 1) CUB and 2) mRNA secondary structure, both in proximal sequence contexts. Our study reveals the pleiotropic effects of synonymous codon usage and provides an additional explanation for the well-known correlation between CUB and gene expression level.
Affiliation(s)
- Siyu Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Ke Li
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Wenqing Cao
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Jia Wang
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Sino-Danish Center for Education and Research, Beijing, China
- Tong Zhao
- Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Qing Huan
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Yu-Fei Yang
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Shaohuan Wu
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Wenfeng Qian
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Sino-Danish Center for Education and Research, Beijing, China
|
48
|
Jia Y, Li H, Wang J, Meng H, Yang Z. Spectrum structures and biological functions of 8-mers in the human genome. Genomics 2018. [PMID: 29522801 DOI: 10.1016/j.ygeno.2018.03.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
The spectra of k-mer frequencies can reveal the structures and evolution of genome sequences. We confirmed that the trimodal spectrum of 8-mers in human genome sequences is distinguished only by the CG2, CG1 and CG0 8-mer sets, containing 2, 1 or 0 CpGs, respectively. This phenomenon is called the independent selection law. The three types of CG 8-mers were considered as different functional elements. We conjectured that (1) nucleosome binding motifs are mainly characterized by CG1 8-mers and (2) the core structural units of CpG island (CGI) sequences are predominantly characterized by CG2 8-mers. To validate our conjectures, nucleosome-occupied sequences and CGI sequences were extracted, and the sequence parameters were then constructed from the information in each of the three CG 8-mer sets. ROC analysis showed that CG1 8-mers are preferred in nucleosome-occupied segments (AUC > 0.7) and CG2 8-mers are preferred in CGI sequences (AUC > 0.99). This validates our conjectures in principle.
Affiliation(s)
- Yun Jia
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China; College of Science, Inner Mongolia University of Technology, Hohhot 010051, China
- Hong Li
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Jingfeng Wang
- College of Science, Inner Mongolia University of Technology, Hohhot 010051, China
- Hu Meng
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Zhenhua Yang
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
|
49
|
Zhang Q, Oh DH, DiTusa SF, RamanaRao MV, Baisakh N, Dassanayake M, Smith AP. Rice nucleosome patterns undergo remodeling coincident with stress-induced gene expression. BMC Genomics 2018; 19:97. [PMID: 29373953 PMCID: PMC5787291 DOI: 10.1186/s12864-017-4397-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 10/02/2017] [Accepted: 12/19/2017] [Indexed: 02/05/2023]
Abstract
BACKGROUND: Formation of nucleosomes along eukaryotic DNA has an impact on transcription. Major transcriptional changes occur in response to low external phosphate (Pi) in plants, but the involvement of chromatin-level mechanisms in Pi starvation responses has not been investigated. RESULTS: We mapped nucleosomes along with transcriptional changes after 24 h of Pi starvation in rice (Oryza sativa) by deep sequencing of micrococcal nuclease digested chromatin and ribosome-depleted RNA. We demonstrated that nucleosome patterns at rice genes were affected by both cis- and trans-determinants, including GC content and transcription. Also, categorizing rice genes by nucleosome patterns across the transcription start site (TSS) revealed nucleosome patterns that correlated with distinct functional categories of genes. We further demonstrated that Pi starvation resulted in numerous dynamic nucleosomes, which were enhanced at genes differentially expressed in response to Pi starvation. CONCLUSIONS: We demonstrate that rice nucleosome patterns are suggestive of gene functions, and reveal a link between chromatin remodeling and transcriptional changes in response to deficiency of a major macronutrient. Our findings help to enhance the understanding of eukaryotic gene regulation at the chromatin level.
Affiliation(s)
- Qi Zhang
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Dong-Ha Oh
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Sandra Feuer DiTusa
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Mangu V RamanaRao
- School of Plant, Environmental, and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, LA, 70803, USA
- Niranjan Baisakh
- School of Plant, Environmental, and Soil Sciences, Louisiana State University Agricultural Center, Baton Rouge, LA, 70803, USA
- Maheshi Dassanayake
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Aaron P Smith
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
|
50
|
High-Resolution Genome-Wide Mapping of Nucleosome Positioning and Occupancy Level Using Paired-End Sequencing Technology. Methods Mol Biol 2018; 1528:229-243. [PMID: 27854025 DOI: 10.1007/978-1-4939-6630-1_14] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 02/23/2023]
Abstract
Because of its profound influence on DNA accessibility for protein binding and thus on the regulation of diverse biological processes, nucleosome positioning has been studied for many years. In the past decade, high-throughput sequencing technologies have opened new perspectives in this research field by allowing the study of nucleosome positioning and occupancy on a genome-wide scale, therefore providing understanding on important aspects of chromatin packaging, as well as on various chromatin-template processes like transcription. In this chapter, we provide the protocol of MNase sequencing for the genome-wide mapping of nucleosomes using MNase to generate mononucleosomal DNA fragments and next-generation sequencing technology to identify their individual location.
|