1. Yao YM, Miodownik I, O'Hagan MP, Jbara M, Afek A. Deciphering the dynamic code: DNA recognition by transcription factors in the ever-changing genome. Transcription 2024:1-25. [PMID: 39033307 DOI: 10.1080/21541264.2024.2379161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 07/03/2024] [Indexed: 07/23/2024]
Abstract
Transcription factors (TFs) intricately navigate the vast genomic landscape to locate and bind specific DNA sequences for the regulation of gene expression programs. These interactions occur within a dynamic cellular environment, where both DNA and TF proteins experience continual chemical and structural perturbations, including epigenetic modifications, DNA damage, mechanical stress, and post-translational modifications (PTMs). While many of these factors impact TF-DNA binding interactions, understanding their effects remains challenging and incomplete. This review explores the existing literature on these dynamic changes and their potential impact on TF-DNA interactions.
Affiliation(s)
- Yumi Minyi Yao
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Irina Miodownik
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Michael P O'Hagan
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Muhammad Jbara
  - School of Chemistry, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, Israel
- Ariel Afek
  - Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
2. Yang M, Zhang S, Zheng Z, Zhang P, Liang Y, Tang S. Employing bimodal representations to predict DNA bendability within a self-supervised pre-trained framework. Nucleic Acids Res 2024; 52:e33. [PMID: 38375921 PMCID: PMC11014357 DOI: 10.1093/nar/gkae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/10/2024] [Accepted: 02/01/2024] [Indexed: 02/21/2024]
Abstract
The bendability of genomic DNA, which measures the DNA looping rate, is crucial for numerous biological processes of DNA. Recently, an advanced high-throughput technique known as 'loop-seq' has made it possible to measure the inherent cyclizability of DNA fragments. However, quantifying the bendability of large-scale DNA is costly, laborious, and time-consuming. To close the gap between rapidly evolving large language models and expanding genomic sequence information, and to elucidate the DNA bendability's impact on critical regulatory sequence motifs such as super-enhancers in the human genome, we introduce an innovative computational model, named MIXBend, to forecast the DNA bendability utilizing both nucleotide sequences and physicochemical properties. In MIXBend, a pre-trained language model DNABERT and convolutional neural network with attention mechanism are utilized to construct both sequence- and physicochemical-based extractors for the sophisticated refinement of DNA sequence representations. These bimodal DNA representations are then fed to a k-mer sequence-physicochemistry matching module to minimize the semantic gap between each modality. Lastly, a self-attention fusion layer is employed for the prediction of DNA bendability. In conclusion, the experimental results validate MIXBend's superior performance relative to other state-of-the-art methods. Additionally, MIXBend reveals both novel and known motifs from the yeast. Moreover, MIXBend discovers significant bendability fluctuations within super-enhancer regions and transcription factors binding sites in the human genome.
Affiliation(s)
- Minghao Yang
  - Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Shichen Zhang
  - Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Zhihang Zheng
  - Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Pengfei Zhang
  - Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Yan Liang
  - School of Artificial Intelligence, South China Normal University, Foshan 528225, China
- Shaojun Tang
  - Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
  - Division of Life Science, Hong Kong University of Science and Technology, Hong Kong SAR 999077, China
3. Back G, Walther D. Predictions of DNA mechanical properties at a genomic scale reveal potentially new functional roles of DNA flexibility. NAR Genom Bioinform 2023; 5:lqad097. [PMID: 37954573 PMCID: PMC10632188 DOI: 10.1093/nargab/lqad097] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/15/2023] [Revised: 09/28/2023] [Accepted: 10/25/2023] [Indexed: 11/14/2023]
Abstract
Mechanical properties of DNA have been implied to influence many of its biological functions. Recently, a new high-throughput method, called loop-seq, which allows measuring the intrinsic bendability of DNA fragments, has been developed. Using loop-seq data, we created a deep learning model to explore the biological significance of local DNA flexibility in a range of different species from different kingdoms. Consistently, we observed a characteristic and largely dinucleotide-composition-driven change of local flexibility near transcription start sites. In the presence of a TATA-box, a pronounced peak of high flexibility can be observed. Furthermore, depending on the transcription factor investigated, flanking-sequence-dependent DNA flexibility was identified as a potential factor influencing DNA binding. Compared to randomized genomic sequences, depending on species and taxa, actual genomic sequences were observed both with increased and lowered flexibility. Furthermore, in Arabidopsis thaliana, mutation rates, both de novo and fixed, were found to be associated with relatively rigid sequence regions. Our study presents a range of significant correlations between characteristic DNA mechanical properties and genomic features, the significance of which with regard to detailed molecular relevance awaits further theoretical and experimental exploration.
Affiliation(s)
- Georg Back
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
- Dirk Walther
  - Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
4. Natarajan AK, Ryssy J, Kuzyk A. A DNA origami-based device for investigating DNA bending proteins by transmission electron microscopy. Nanoscale 2023; 15:3212-3218. [PMID: 36722916 DOI: 10.1039/d2nr05366g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
The DNA origami technique offers precise positioning of nanoscale objects with high accuracy. This has facilitated the development of DNA origami-based functional nanomechanical devices that enable the investigation of DNA-protein interactions at the single particle level. Herein, we used the DNA origami technique to fabricate a nanoscale device for studying DNA bending proteins. For a proof of concept, we used TATA-box binding protein (TBP) to evaluate our approach. Upon binding to the TATA box, TBP causes a bend to DNA of ∼90°. Our device translates this bending into an angular change that is readily observable with a conventional transmission electron microscope (TEM). Furthermore, we investigated the roles of transcription factor II A (TF(II)A) and transcription factor II B (TF(II)B). Our results indicate that TF(II)A introduces additional bending, whereas TF(II)B does not significantly alter the TBP-DNA structure. Our approach can be readily adopted to a wide range of DNA-bending proteins and will aid the development of DNA-origami-based devices tailored for the investigation of DNA-protein interactions.
Affiliation(s)
- Ashwin Karthick Natarajan
  - Department of Neuroscience and Biomedical Engineering, Aalto University, School of Science, P.O. Box 12200, FI-00076 Aalto, Finland
- Joonas Ryssy
  - Department of Neuroscience and Biomedical Engineering, Aalto University, School of Science, P.O. Box 12200, FI-00076 Aalto, Finland
- Anton Kuzyk
  - Department of Neuroscience and Biomedical Engineering, Aalto University, School of Science, P.O. Box 12200, FI-00076 Aalto, Finland
5. Khan SR, Sakib S, Rahman MS, Samee MAH. DeepBend: An interpretable model of DNA bendability. iScience 2023; 26:105945. [PMID: 36866046 PMCID: PMC9971889 DOI: 10.1016/j.isci.2023.105945] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/19/2022] [Revised: 12/05/2022] [Accepted: 01/05/2023] [Indexed: 01/09/2023]
Abstract
The bendability of genomic DNA impacts chromatin packaging and protein-DNA binding. However, we do not have a comprehensive understanding of the motifs influencing DNA bendability. Recent high-throughput technologies such as Loop-Seq offer an opportunity to address this gap but the lack of accurate and interpretable machine learning models still remains. Here we introduce DeepBend, a convolutional neural network model with convolutions designed to directly capture the motifs underlying DNA bendability and their periodic occurrences or relative arrangements that modulate bendability. DeepBend consistently performs on par with alternative models while giving an extra edge through mechanistic interpretations. Besides confirming the known motifs of DNA bendability, DeepBend also revealed several novel motifs and showed how the spatial patterns of motif occurrences influence bendability. DeepBend's genome-wide prediction of bendability further showed how bendability is linked to chromatin conformation and revealed the motifs controlling the bendability of topologically associated domains and their boundaries.
Affiliation(s)
- Samin Rahman Khan
  - Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Sadman Sakib
  - Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- M. Sohel Rahman
  - Bangladesh University of Engineering and Technology, Dhaka, Bangladesh (corresponding author)
6. Regulation of Transcription Factor NF-κB in Its Natural Habitat: The Nucleus. Cells 2021; 10:cells10040753. [PMID: 33805563 PMCID: PMC8066257 DOI: 10.3390/cells10040753] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/08/2021] [Revised: 03/24/2021] [Accepted: 03/24/2021] [Indexed: 01/11/2023]
Abstract
Activation of the transcription factor NF-κB elicits an individually tailored transcriptional response in order to meet the particular requirements of specific cell types, tissues, or organs. Control of the induction kinetics, amplitude, and termination of gene expression involves multiple layers of NF-κB regulation in the nucleus. Here we discuss some recent advances in our understanding of the mutual relations between NF-κB and chromatin regulators also in the context of different levels of genome organization. Changes in the 3D folding of the genome, as they occur during senescence or in cancer cells, can causally contribute to sustained increases in NF-κB activity. We also highlight the participation of NF-κB in the formation of hierarchically organized super enhancers, which enable the coordinated expression of co-regulated sets of NF-κB target genes. The identification of mechanisms allowing the specific regulation of NF-κB target gene clusters could potentially enable targeted therapeutic interventions, allowing selective interference with subsets of the NF-κB response without a complete inactivation of this key signaling system.
7. Klancher CA, Minasov G, Podicheti R, Rusch DB, Dalia TN, Satchell KJF, Neiditch MB, Dalia AB. The ChiS-Family DNA-Binding Domain Contains a Cryptic Helix-Turn-Helix Variant. mBio 2021; 12:e03287-20. [PMID: 33727356 PMCID: PMC8092284 DOI: 10.1128/mbio.03287-20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2020] [Accepted: 02/10/2021] [Indexed: 11/20/2022]
Abstract
Sequence-specific DNA-binding domains (DBDs) are conserved in all domains of life. These proteins carry out a variety of cellular functions, and there are a number of distinct structural domains already described that allow for sequence-specific DNA binding, including the ubiquitous helix-turn-helix (HTH) domain. In the facultative pathogen Vibrio cholerae, the chitin sensor ChiS is a transcriptional regulator that is critical for the survival of this organism in its marine reservoir. We recently showed that ChiS contains a cryptic DBD in its C terminus. This domain is not homologous to any known DBD, but it is a conserved domain present in other bacterial proteins. Here, we present the crystal structure of the ChiS DBD at a resolution of 1.28 Å. We find that the ChiS DBD contains an HTH domain that is structurally similar to those found in other DNA-binding proteins, like the LacI repressor. However, one striking difference observed in the ChiS DBD is that the canonical tight turn of the HTH is replaced with an insertion containing a β-sheet, a variant which we term the helix-sheet-helix. Through systematic mutagenesis of all positively charged residues within the ChiS DBD, we show that residues within and proximal to the ChiS helix-sheet-helix are critical for DNA binding. Finally, through phylogenetic analyses we show that the ChiS DBD is found in diverse proteobacterial proteins that exhibit distinct domain architectures. Together, these results suggest that the structure described here represents the prototypical member of the ChiS-family of DBDs.
IMPORTANCE: Regulating gene expression is essential in all domains of life. This process is commonly facilitated by the activity of DNA-binding transcription factors. There are diverse structural domains that allow proteins to bind to specific DNA sequences. The structural basis underlying how some proteins bind to DNA, however, remains unclear. Previously, we showed that in the major human pathogen Vibrio cholerae, the transcription factor ChiS directly regulates gene expression through a cryptic DNA-binding domain. This domain lacked homology to any known DNA-binding protein. In the current study, we determined the structure of the ChiS DNA-binding domain (DBD) and found that the ChiS-family DBD is a cryptic variant of the ubiquitous helix-turn-helix (HTH) domain. We further demonstrate that this domain is conserved in diverse proteins that may represent a novel group of transcriptional regulators.
Affiliation(s)
- George Minasov
  - Center for Structural Genomics of Infectious Diseases, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Ram Podicheti
  - Center for Genomics and Bioinformatics, Indiana University, Bloomington, Indiana, USA
- Douglas B Rusch
  - Center for Genomics and Bioinformatics, Indiana University, Bloomington, Indiana, USA
- Triana N Dalia
  - Department of Biology, Indiana University, Bloomington, Indiana, USA
- Karla J F Satchell
  - Center for Structural Genomics of Infectious Diseases, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Matthew B Neiditch
  - Department of Microbiology, Biochemistry, and Molecular Genetics, New Jersey Medical School, Rutgers Biomedical Health Sciences, Newark, New Jersey, USA
- Ankur B Dalia
  - Department of Biology, Indiana University, Bloomington, Indiana, USA