1
Sahrhage M, Paul NB, Beißbarth T, Haubrock M. The importance of DNA sequence for nucleosome positioning in transcriptional regulation. Life Sci Alliance 2024; 7:e202302380. [PMID: 38830772 PMCID: PMC11147951 DOI: 10.26508/lsa.202302380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 06/05/2024]
Abstract
Nucleosome positioning is a key factor in transcriptional regulation. Nucleosomes regulate the dynamic accessibility of chromatin and interact with the transcription machinery at every stage. The influences that steer nucleosome positioning are diverse, and the relative importance of DNA sequence versus active chromatin remodeling has long been debated. In this study, we evaluate the functional role of DNA sequence for all major elements along the process of transcription. We developed a random forest classifier based on local DNA structure that assesses the sequence-intrinsic support for nucleosome positioning. On this basis, we created a simple data resource that we applied genome-wide to the human genome. In our comprehensive analysis, we found a special role of DNA in mediating the competition of nucleosomes with cis-regulatory elements, in enabling steady transcription, in positioning stable nucleosomes in exons, and in repelling nucleosomes during transcription termination. We contrast these findings with concurrent processes that generate strongly positioned nucleosomes in vivo without being mediated by sequence, such as energy-dependent remodeling of chromatin.
Affiliation(s)
- Malte Sahrhage
  - Department of Medical Bioinformatics, University Medical Center, Göttingen, Germany
- Niels Benjamin Paul
  - Department of Medical Bioinformatics, University Medical Center, Göttingen, Germany
  - Department of Cardiology and Pneumology, University Medical Center, Göttingen, Germany
- Tim Beißbarth
  - Department of Medical Bioinformatics, University Medical Center, Göttingen, Germany
- Martin Haubrock
  - Department of Medical Bioinformatics, University Medical Center, Göttingen, Germany
2
Routhier E, Joubert A, Westbrook A, Pierre E, Lancrey A, Cariou M, Boulé JB, Mozziconacci J. In silico design of DNA sequences for in vivo nucleosome positioning. Nucleic Acids Res 2024; 52:6802-6810. [PMID: 38828788 PMCID: PMC11229325 DOI: 10.1093/nar/gkae468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 04/24/2024] [Accepted: 05/29/2024] [Indexed: 06/05/2024]
Abstract
The computational design of synthetic DNA sequences with designer in vivo properties is gaining traction in the field of synthetic genomics. We propose here a computational method which combines a kinetic Monte Carlo framework with a deep mutational screening based on deep learning predictions. We apply our method to build regular nucleosome arrays with tailored nucleosomal repeat lengths (NRL) in yeast. Our design was validated in vivo by successfully engineering and integrating thousands of kilobases long tandem arrays of computationally optimized sequences which could accommodate NRLs much larger than the yeast natural NRL (namely 197 and 237 bp, compared to the natural NRL of ∼165 bp). RNA-seq results show that transcription of the arrays can occur but is not driven by the NRL. The computational method proposed here delineates the key sequence rules for nucleosome positioning in yeast and should be easily applicable to other sequence properties and other genomes.
Affiliation(s)
- Etienne Routhier
  - Laboratoire de Physique Théorique de la Matière Condensée, CNRS, Sorbonne Université, Paris, France
- Alexandra Joubert
  - Structure et Instabilité des Génomes, Museum National d’Histoire Naturelle, CNRS, INSERM, Paris, France
- Alex Westbrook
  - Structure et Instabilité des Génomes, Museum National d’Histoire Naturelle, CNRS, INSERM, Paris, France
- Edgard Pierre
  - Laboratoire de Physique Théorique de la Matière Condensée, CNRS, Sorbonne Université, Paris, France
- Astrid Lancrey
  - Structure et Instabilité des Génomes, Museum National d’Histoire Naturelle, CNRS, INSERM, Paris, France
- Marie Cariou
  - Acquisition et Analyse de données pour l’histoire naturelle, Museum National d’Histoire Naturelle, CNRS, Paris, France
- Jean-Baptiste Boulé
  - Structure et Instabilité des Génomes, Museum National d’Histoire Naturelle, CNRS, INSERM, Paris, France
- Julien Mozziconacci
  - Laboratoire de Physique Théorique de la Matière Condensée, CNRS, Sorbonne Université, Paris, France
  - Structure et Instabilité des Génomes, Museum National d’Histoire Naturelle, CNRS, INSERM, Paris, France
  - Acquisition et Analyse de données pour l’histoire naturelle, Museum National d’Histoire Naturelle, CNRS, Paris, France
  - Institut Universitaire de France, Paris, France
3
Abstract
The tremendous amount of biological sequence data available, combined with the recent methodological breakthrough in deep learning in domains such as computer vision or natural language processing, is leading today to the transformation of bioinformatics through the emergence of deep genomics, the application of deep learning to genomic sequences. We review here the new applications that the use of deep learning enables in the field, focusing on three aspects: the functional annotation of genomes, the sequence determinants of the genome functions and the possibility to write synthetic genomic sequences.
4
Nucleosome positioning on large tandem DNA repeats of the ’601’ sequence engineered in Saccharomyces cerevisiae. J Mol Biol 2022; 434:167497. [DOI: 10.1016/j.jmb.2022.167497] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/31/2021] [Revised: 02/04/2022] [Accepted: 02/04/2022] [Indexed: 12/13/2022]
5
Li K, Carroll M, Vafabakhsh R, Wang XA, Wang JP. DNAcycP: a deep learning tool for DNA cyclizability prediction. Nucleic Acids Res 2022; 50:3142-3154. [PMID: 35288750 PMCID: PMC8989542 DOI: 10.1093/nar/gkac162] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/23/2021] [Revised: 02/16/2022] [Accepted: 02/23/2022] [Indexed: 11/16/2022]
Abstract
DNA mechanical properties play a critical role in every aspect of DNA-dependent biological processes. Recently a high throughput assay named loop-seq has been developed to quantify the intrinsic bendability of a massive number of DNA fragments simultaneously. Using the loop-seq data, we develop a software tool, DNAcycP, based on a deep-learning approach for intrinsic DNA cyclizability prediction. We demonstrate DNAcycP predicts intrinsic DNA cyclizability with high fidelity compared to the experimental data. Using an independent dataset from in vitro selection for enrichment of loopable sequences, we further verified the predicted cyclizability score, termed C-score, can well distinguish DNA fragments with different loopability. We applied DNAcycP to multiple species and compared the C-scores with available high-resolution chemical nucleosome maps. Our analyses showed that both yeast and mouse genomes share a conserved feature of high DNA bendability spanning nucleosome dyads. Additionally, we extended our analysis to transcription factor binding sites and surprisingly found that the cyclizability is substantially elevated at CTCF binding sites in the mouse genome. We further demonstrate this distinct mechanical property is conserved across mammalian species and is inherent to CTCF binding DNA motif.
Affiliation(s)
- Keren Li
  - Department of Statistics, Northwestern University, 633 Clark Street, Evanston, IL 60208, USA
  - NSF-Simons Center for Quantitative Biology, Northwestern University, Evanston, IL 60208, USA
- Matthew Carroll
  - Weinberg College IT Solutions (WITS), Northwestern University, 633 Clark Street, Evanston, IL 60208, USA
- Reza Vafabakhsh
  - Department of Molecular Biosciences, Northwestern University, Evanston, IL 60208, USA
- Xiaozhong A Wang
  - Correspondence may also be addressed to Xiaozhong A. Wang. Tel: +1 847 467 4897
- Ji-Ping Wang
  - To whom correspondence should be addressed. Tel: +1 847 467 6896