1
Zhang Y, Wang Z, Zeng Y, Zhou J, Zou Q. High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method. Brief Bioinform 2021; 22:6322761. [PMID: 34272562 DOI: 10.1093/bib/bbab273] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/30/2021] [Revised: 06/19/2021] [Accepted: 06/25/2021] [Indexed: 11/14/2022]
Abstract
Transcription factors (TFs) are essential proteins in regulating the spatiotemporal expression of genes. It is crucial to infer potential transcription factor binding sites (TFBSs) with high resolution to advance biology and realize precision medicine. Recently, deep learning-based models have shown exemplary performance in the prediction of TFBSs at the base-pair level. However, previous models fail to integrate nucleotide position information and semantic information without noisy responses, so there is still room for improvement. Moreover, both the inner mechanism and the prediction results of these models are challenging to interpret. To this end, the Deep Attentive Encoder-Decoder Neural Network (D-AEDNet) is developed to identify the location of TF-DNA binding sites in DNA sequences. In particular, our model adopts a Skip Architecture to leverage the nucleotide position information in the encoder and removes noisy responses in the information fusion process with an Attention Gate. In addition, Transcription Factor Motif Discovery based on Sliding Window (TF-MoDSW), an approach that discovers TF-DNA binding motifs from the output of neural networks, is proposed to understand the biological meaning of the predicted results. On ChIP-exo datasets, experimental results show that D-AEDNet performs better than competing methods. We also verify, by way of visualization analysis, that the Attention Gate can improve the interpretability of our model. Furthermore, by conducting experiments on ChIP-seq datasets, we confirm that D-AEDNet's ability to learn TF-DNA binding motifs outperforms that of state-of-the-art methods and that TF-MoDSW is able to discover biological sequence motifs in TF-DNA interactions.
Affiliation(s)
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Zixuan Wang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Yuanqi Zeng
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Jiliu Zhou
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610054, Chengdu, China
2
Lin Y, Wu K, Jia F, Chen L, Wang Z, Zhang Y, Luo Q, Liu S, Qi L, Li N, Dong P, Gao F, Zheng W, Fang X, Zhao Y, Wang F. Single cell imaging reveals cisplatin regulating interactions between transcription (co)factors and DNA. Chem Sci 2021; 12:5419-5429. [PMID: 34163767 PMCID: PMC8179581 DOI: 10.1039/d0sc06760a] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/10/2020] [Accepted: 02/24/2021] [Indexed: 12/21/2022]
Abstract
Cisplatin is an extremely successful anticancer drug, and is commonly thought to target DNA. However, the way in which cisplatin-induced DNA lesions regulate interactions between transcription factors/cofactors and genomic DNA remains unclear. Herein, we developed a dual-modal microscopy imaging strategy to investigate, in situ, the formation of ternary binding complexes of the transcription cofactor HMGB1 and transcription factor Smad3 with cisplatin-crosslinked DNA in single cells. We utilized confocal microscopy imaging to map EYFP-fused HMGB1 and fluorescent dye-stained DNA in single cells, followed by the visualization of cisplatin using high spatial resolution (200-350 nm) time-of-flight secondary ion mass spectrometry (ToF-SIMS) imaging of the same cells. The superposition of the fluorescence and the mass spectrometry (MS) signals indicates the formation of HMGB1-Pt-DNA ternary complexes in the cells. More significantly, for the first time, similar integrated imaging revealed that the cisplatin lesions at Smad-binding elements, for example GGC(GC)/(CG) and AGAC, disrupted the interactions of Smad3 with DNA, which was evidenced by the remarkable reduction in the expression of Smad-specific luciferase reporters subjected to cisplatin treatment. This finding suggests that Smad3 and its related signalling pathway are most likely involved in the intracellular response to cisplatin-induced DNA damage.
Affiliation(s)
- Yu Lin
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Kui Wu
  - Key Laboratory of Hubei Province for Coal Conversion and New Carbon Materials, School of Chemistry and Chemical Engineering, Wuhan University of Science and Technology Wuhan 430081 People's Republic of China
- Feifei Jia
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Ling Chen
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Zhaoying Wang
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Yanyan Zhang
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Qun Luo
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
  - University of Chinese Academy of Sciences Beijing 100049 People's Republic of China
- Suyan Liu
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Luyu Qi
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
  - University of Chinese Academy of Sciences Beijing 100049 People's Republic of China
- Nan Li
  - University of Chinese Academy of Sciences Beijing 100049 People's Republic of China
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, Key Laboratory of Molecular Nanostructures and Nanotechnology, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 P. R. China
- Pu Dong
  - China Telecom Corporation Limited Beijing Research Institute Beijing 100035 People's Republic of China
- Fei Gao
  - China Telecom Corporation Limited Beijing Research Institute Beijing 100035 People's Republic of China
- Wei Zheng
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Xiaohong Fang
  - University of Chinese Academy of Sciences Beijing 100049 People's Republic of China
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, Key Laboratory of Molecular Nanostructures and Nanotechnology, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 P. R. China
- Yao Zhao
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
- Fuyi Wang
  - Beijing National Laboratory for Molecular Sciences, CAS Research/Education Center for Excellence in Molecular Sciences, National Centre for Mass Spectrometry in Beijing, CAS Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 People's Republic of China
  - University of Chinese Academy of Sciences Beijing 100049 People's Republic of China
  - College of Traditional Chinese Medicine, Shandong University of Traditional Chinese Medicine Jinan 250355 People's Republic of China
3
Biochemical characteristics of the chondrocyte-enriched SNORC protein and its transcriptional regulation by SOX9. Sci Rep 2020; 10:7790. [PMID: 32385306 PMCID: PMC7210984 DOI: 10.1038/s41598-020-64640-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/27/2020] [Accepted: 04/16/2020] [Indexed: 11/08/2022]
Abstract
Snorc (Small NOvel Rich in Cartilage) has been identified as a chondrocyte-specific gene in the mouse. Yet little is known about the biochemical properties of the SNORC protein, or about how the gene is transcriptionally regulated in a tissue-specific manner. The goal of the present study was to shed light on these aspects. The chondrocyte-specific nature of Snorc expression was confirmed in mouse and rat tissues, in differentiated (day 7) ATDC5 cells, and in RCS cells, where it was constitutive. Topological mapping and biochemical analysis provided experimental evidence that SNORC is a type I protein carrying a chondroitin sulfate (CS) chain attached to serine 44. The anomalous migration of SNORC on SDS-PAGE was due to features of its primary polypeptide, suggesting no additional post-translational modifications apart from the CS glycosaminoglycan. A highly conserved SOX9-binding enhancer located in intron 1 was necessary to drive transcription of Snorc in the mouse, rat, and human. The enhancer was active independently of orientation and whether located in a heterologous promoter or intron. CRISPR-mediated inactivation of the enhancer in RCS cells caused a reduction in Snorc expression. In transgenic mice, the multimerized intronic enhancer drove high expression of a βGeo reporter in chondrocytes, but not in the hypertrophic zone. Altogether, these data confirmed the chondrocyte-specific nature of Snorc and revealed that its transcription depends on SOX9 binding to the intronic enhancer.
4
Sharma V, Majumdar S. Comparative analysis of ChIP-exo peak-callers: impact of data quality, read duplication and binding subtypes. BMC Bioinformatics 2020; 21:65. [PMID: 32085702 PMCID: PMC7035708 DOI: 10.1186/s12859-020-3403-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2019] [Accepted: 02/10/2020] [Indexed: 01/26/2023]
Abstract
Background: ChIP (chromatin immunoprecipitation)-exo has emerged as an important and versatile improvement over conventional ChIP-seq, as it reduces the level of noise, maps the transcription factor (TF) binding location very precisely, up to single base-pair resolution, and enables binding mode prediction. The availability of numerous peak-callers for analyzing ChIP-exo reads has motivated the need to assess their performance and report which tools perform reasonably well for the task. Results: This study has focussed on comparing peak-callers that report direct binding events with those that report indirect binding events. The effect of strandedness of reads and duplication of data on the performance of peak-callers has been investigated. The number of peaks reported by each peak-caller is compared, followed by a comparison of the annotated motifs present in the reported peaks. The significance of peaks is assessed based on the presence of a motif in top peaks. Indirect binding tools have been compared on the basis of their ability to identify annotated motifs and predict the mode of protein-DNA interaction. Conclusion: By studying the output of the peak-callers investigated in this study, it is concluded that the tools that use self-learning algorithms, i.e. the tools that estimate all the essential parameters from the aligned reads, perform better than the algorithms which require formation of peak-pairs. The latest tools that account for indirect binding of TFs appear to be an upgrade over the available tools, as they are able to reveal valuable information about the mode of binding in addition to direct binding. Furthermore, the quality of ChIP-exo reads has important consequences for the output of data analysis.
Affiliation(s)
- Vasudha Sharma
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Palaj, Gujarat, 382355, India
- Sharmistha Majumdar
  - Discipline of Biological Engineering, Indian Institute of Technology Gandhinagar, Palaj, Gujarat, 382355, India
5
Perreault AA, Sprunger DM, Venters BJ. Epigenetic and transcriptional profiling of triple negative breast cancer. Sci Data 2019; 6:190033. [PMID: 30835260 PMCID: PMC6400101 DOI: 10.1038/sdata.2019.33] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/28/2018] [Accepted: 01/22/2019] [Indexed: 12/16/2022]
Abstract
The human HCC1806 cell line is frequently used as a preclinical model for triple negative breast cancer (TNBC). Given that dysregulated epigenetic mechanisms are involved in cancer pathogenesis, emerging therapeutic strategies target chromatin regulators, such as histone deacetylases. A comprehensive understanding of the epigenome and transcription profiling in HCC1806 provides the framework for evaluating efficacy and molecular mechanisms of epigenetic therapies. Thus, to study the interplay of transcription and chromatin in the HCC1806 preclinical model, we performed nascent transcription profiling using Precision Run-On coupled to sequencing (PRO-seq). Additionally, we mapped the genome-wide locations for RNA polymerase II (Pol II), the histone variant H2A.Z, seven histone modifications, and CTCF using ChIP-exo. ChIP-exonuclease (ChIP-exo) is a refined version of ChIP-seq with near base pair precision mapping of protein-DNA interactions. In this Data Descriptor, we present detailed information on experimental design, data generation, quality control analysis, and data validation. We discuss how these data lay the foundation for future analysis to understand the relationship between the nascent transcription and chromatin.
Affiliation(s)
- Andrea A. Perreault
  - Chemical and Physical Biology Program at Vanderbilt University, Nashville, TN, USA
- Danielle M. Sprunger
  - Department of Molecular Physiology and Biophysics, Vanderbilt Genetics Institute, Vanderbilt Ingram Cancer Center, Vanderbilt University, Nashville, TN, USA
- Bryan J. Venters
  - Department of Molecular Physiology and Biophysics, Vanderbilt Genetics Institute, Vanderbilt Ingram Cancer Center, Vanderbilt University, Nashville, TN, USA