1. Wang S, Wang W. Interpretable prediction of mRNA abundance from promoter sequence using contextual regression models. NAR Genom Bioinform 2024; 6:lqae055. [PMID: 38807713 PMCID: PMC11131020 DOI: 10.1093/nargab/lqae055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 04/08/2024] [Accepted: 05/12/2024] [Indexed: 05/30/2024]
Abstract
While machine learning models have been successfully applied to predicting gene expression from promoter sequences, it remains a great challenge to derive an intuitive interpretation of the model and to reveal DNA motif grammar such as motif cooperation and distance constraints between motif sites. Previous interpretation approaches are often time-consuming or have difficulty learning the combinatorial rules. In this work, we designed interpretable neural network models to predict mRNA expression levels from DNA sequences. By applying the Contextual Regression framework we developed, we extracted weighted features to cluster samples into groups with different gene expression levels. We performed motif analysis in each cluster and found motifs with activating or repressive effects on gene expression. By comparing the co-occurrence locations of discovered motifs, we also uncovered multiple grammars of motif combination, including communities of cooperative motifs and distance constraints between motif pairs. These results revealed new insights into the regulatory architecture of promoter sequences.
Affiliation(s)
- Song Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, USA
- Wei Wang
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, USA
- Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, USA

2. Perevalova AM, Gulyaeva LF, Pustylnyak VO. Roles of Interferon Regulatory Factor 1 in Tumor Progression and Regression: Two Sides of a Coin. Int J Mol Sci 2024; 25:2153. [PMID: 38396830 PMCID: PMC10889282 DOI: 10.3390/ijms25042153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Revised: 02/05/2024] [Accepted: 02/07/2024] [Indexed: 02/25/2024]
Abstract
IRF1 is a transcription factor well known for its role in IFN signaling. Although IRF1 was initially identified for its involvement in inflammatory processes, there is now evidence that it provides a function in carcinogenesis as well. IRF1 has been shown to affect several important antitumor mechanisms, such as induction of apoptosis, cell cycle arrest, remodeling of tumor immune microenvironment, suppression of telomerase activity, suppression of angiogenesis and others. Nevertheless, the opposite effects of IRF1 on tumor growth have also been demonstrated. In particular, the "immune checkpoint" molecule PD-L1, which is responsible for tumor immune evasion, has IRF1 as a major transcriptional regulator. These and several other properties of IRF1, including its proposed association with response and resistance to immunotherapy and several chemotherapeutic drugs, make it a promising object for further research. Numerous mechanisms of IRF1 regulation in cancer have been identified, including genetic, epigenetic, transcriptional, post-transcriptional, and post-translational mechanisms, although their significance for tumor progression remains to be explored. This review will focus on the established tumor-suppressive and tumor-promoting functions of IRF1, as well as the molecular mechanisms of IRF1 regulation identified in various cancers.
Affiliation(s)
- Alina M. Perevalova
- Zelman Institute for the Medicine and Psychology, Novosibirsk State University, Pirogova Street, 1, Novosibirsk 630090, Russia
- Federal Research Center of Fundamental and Translational Medicine, Timakova Street, 2/12, Novosibirsk 630117, Russia
- Lyudmila F. Gulyaeva
- Zelman Institute for the Medicine and Psychology, Novosibirsk State University, Pirogova Street, 1, Novosibirsk 630090, Russia
- Federal Research Center of Fundamental and Translational Medicine, Timakova Street, 2/12, Novosibirsk 630117, Russia
- Vladimir O. Pustylnyak
- Zelman Institute for the Medicine and Psychology, Novosibirsk State University, Pirogova Street, 1, Novosibirsk 630090, Russia
- Federal Research Center of Fundamental and Translational Medicine, Timakova Street, 2/12, Novosibirsk 630117, Russia

3. Zhang M, Huang H, Li J, Wu Q. ZNF143 deletion alters enhancer/promoter looping and CTCF/cohesin geometry. Cell Rep 2024; 43:113663. [PMID: 38206813 DOI: 10.1016/j.celrep.2023.113663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 11/28/2023] [Accepted: 12/22/2023] [Indexed: 01/13/2024]
Abstract
The transcription factor ZNF143 contains a central domain of seven zinc fingers in a tandem array and is involved in 3D genome construction. However, the mechanism by which ZNF143 functions in chromatin looping remains unclear. Here, we show that ZNF143 directionally recognizes a diverse range of genomic sites directly within enhancers and promoters and is required for chromatin looping between these sites. In addition, ZNF143 is located between CTCF and cohesin at numerous CTCF sites, and ZNF143 removal narrows the space between CTCF and cohesin. Moreover, genetic deletion of ZNF143, in conjunction with acute CTCF degradation, reveals that ZNF143 and CTCF collaborate to regulate higher-order topological chromatin organization. Finally, CTCF depletion enlarges direct ZNF143 chromatin looping. Thus, ZNF143 is recruited by CTCF to the CTCF sites to regulate CTCF/cohesin configuration and TAD (topologically associating domain) formation, whereas directional recognition of genomic DNA motifs directly by ZNF143 itself regulates promoter activity via chromatin looping.
Affiliation(s)
- Mo Zhang
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Medical Genomics, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratories, Shanghai 201203, China
- Haiyan Huang
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Medical Genomics, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratories, Shanghai 201203, China
- Jingwei Li
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Medical Genomics, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratories, Shanghai 201203, China
- Qiang Wu
- Center for Comparative Biomedicine, Ministry of Education Key Laboratory of Systems Biomedicine, State Key Laboratory of Medical Genomics, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China; WLA Laboratories, Shanghai 201203, China.

4. Hudaiberdiev S, Ovcharenko I. Sequence characteristics and an accurate model of abundant hyperactive loci in the human genome. bioRxiv 2023:2023.02.05.527203. [PMID: 36945558 PMCID: PMC10028745 DOI: 10.1101/2023.02.05.527203] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023]
Abstract
Enhancers and promoters are classically considered to be bound by a small set of TFs in a sequence-specific manner. This assumption has come under increasing skepticism as the datasets of ChIP-seq assays of TFs have expanded. In particular, high-occupancy target (HOT) loci attract hundreds of TFs with seemingly no detectable correlation between ChIP-seq peaks and DNA-binding motif presence. Here, we used a set of 1,003 TF ChIP-seq datasets (HepG2, K562, H1) to analyze the patterns of ChIP-seq peak co-occurrence in combination with functional genomics datasets. We identified 43,891 HOT loci forming at the promoter (53%) and enhancer (47%) regions. HOT promoters regulate housekeeping genes, whereas HOT enhancers are involved in tissue-specific process regulation. HOT loci form the foundation of human super-enhancers and evolve under strong negative selection, with some of these loci being located in ultraconserved regions. Sequence-based classification analysis of HOT loci suggested that their formation is driven by the sequence features, and the density of mapped ChIP-seq peaks across TF-bound loci correlates with sequence features and the expression level of flanking genes. Based on the affinities to bind to promoters and enhancers we detected 5 distinct clusters of TFs that form the core of the HOT loci. We report an abundance of HOT loci in the human genome and a commitment of 51% of all TF ChIP-seq binding events to HOT locus formation thus challenging the classical model of enhancer activity and propose a model of HOT locus formation based on the existence of large transcriptional condensates.
Affiliation(s)
- Sanjarbek Hudaiberdiev
- National Institute for Biotechnology and Information, National Library of Medicine, National Institutes of Health. Bethesda, MD
- Ivan Ovcharenko
- National Institute for Biotechnology and Information, National Library of Medicine, National Institutes of Health. Bethesda, MD

5. Tahara S, Tsuchiya T, Matsumoto H, Ozaki H. Transcription factor-binding k-mer analysis clarifies the cell type dependency of binding specificities and cis-regulatory SNPs in humans. BMC Genomics 2023; 24:597. [PMID: 37805453 PMCID: PMC10560430 DOI: 10.1186/s12864-023-09692-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 09/21/2023] [Indexed: 10/09/2023]
Abstract
BACKGROUND Transcription factors (TFs) exhibit heterogeneous DNA-binding specificities in individual cells and whole organisms under natural conditions, and de novo motif discovery usually provides multiple motifs, even from a single chromatin immunoprecipitation-sequencing (ChIP-seq) sample. Despite the accumulation of ChIP-seq data and ChIP-seq-derived motifs, the diversity of DNA-binding specificities across different TFs and cell types remains largely unexplored. RESULTS Here, we applied MOCCS2, our k-mer-based motif discovery method, to a collection of human TF ChIP-seq samples across diverse TFs and cell types, and systematically computed profiles of TF-binding specificity scores for all k-mers. After quality control, we compiled a set of TF-binding specificity score profiles for 2,976 high-quality ChIP-seq samples, comprising 473 TFs and 398 cell types. Using these high-quality samples, we confirmed that the k-mer-based TF-binding specificity profiles reflected TF- or TF-family dependent DNA-binding specificities. We then compared the binding specificity scores of ChIP-seq samples with the same TFs but with different cell type classes and found that half of the analyzed TFs exhibited differences in DNA-binding specificities across cell type classes. Additionally, we devised a method to detect differentially bound k-mers between two ChIP-seq samples and detected k-mers exhibiting statistically significant differences in binding specificity scores. Moreover, we demonstrated that differences in the binding specificity scores between k-mers on the reference and alternative alleles could be used to predict the effect of variants on TF binding, as validated by in vitro and in vivo assay datasets. Finally, we demonstrated that binding specificity score differences can be used to interpret disease-associated non-coding single-nucleotide polymorphisms (SNPs) as TF-affecting SNPs and provide candidates responsible for TFs and cell types. 
CONCLUSIONS Our study provides a basis for investigating the regulation of gene expression in a TF-, TF family-, or cell-type-dependent manner. Furthermore, our differential analysis of binding-specificity scores highlights noncoding disease-associated variants in humans.
Affiliation(s)
- Saeko Tahara
- Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- School of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Takaho Tsuchiya
- Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan
- Hirotaka Matsumoto
- School of Information and Data Sciences, Nagasaki University, 1-14, Bunkyo-Machi, Nagasaki City, Nagasaki, 852-8521, Japan
- Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics, Wako, Saitama, 351-0198, Japan
- Haruka Ozaki
- Bioinformatics Laboratory, Institute of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan.
- Center for Artificial Intelligence Research, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki, 305-8577, Japan.
- Laboratory for Bioinformatics Research, RIKEN Center for Biosystems Dynamics, Wako, Saitama, 351-0198, Japan.

6. Vadnala RN, Hannenhalli S, Narlikar L, Siddharthan R. Transcription factors organize into functional groups on the linear genome and in 3D chromatin. Heliyon 2023; 9:e18211. [PMID: 37520992 PMCID: PMC10382302 DOI: 10.1016/j.heliyon.2023.e18211] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/03/2022] [Revised: 07/11/2023] [Accepted: 07/11/2023] [Indexed: 08/01/2023]
Abstract
Transcription factors (TFs) and their binding sites have evolved to interact cooperatively or competitively with each other. Here we examine in detail, across multiple cell lines, such cooperation or competition among TFs both in sequential and spatial proximity (using chromatin conformation capture assays), considering in vivo binding data as well as TF binding motifs in DNA. We ascertain significantly co-occurring ("attractive") or avoiding ("repulsive") TF pairs using robust randomized models that retain the essential characteristics of the experimental data. Across human cell lines TFs organize into two groups, with intra-group attraction and inter-group repulsion. This is true for both sequential and spatial proximity, and for both in vivo binding and sequence motifs. Attractive TF pairs exhibit significantly more physical interactions, suggesting an underlying mechanism. The two TF groups differ significantly in their genomic and network properties, as well as in their function: while one group regulates housekeeping functions, the other potentially regulates lineage-specific functions that are disrupted in cancer. Weaker binding sites tend to occur in spatially interacting regions of the genome. Our results suggest that a complex pattern of spatial cooperativity of TFs and chromatin has evolved with the genome to support housekeeping and lineage-specific functions.
Affiliation(s)
- Rakesh Netha Vadnala
- The Institute of Mathematical Sciences, Chennai, India
- Homi Bhabha National Institute, Mumbai, India
- Leelavati Narlikar
- Department of Data Science, Indian Institute of Science Education and Research, Pune, India
- Rahul Siddharthan
- The Institute of Mathematical Sciences, Chennai, India
- Homi Bhabha National Institute, Mumbai, India

7. Sekrecka A, Kluzek K, Sekrecki M, Boroujeni ME, Hassani S, Yamauchi S, Sada K, Wesoly J, Bluyssen HAR. Time-dependent recruitment of GAF, ISGF3 and IRF1 complexes shapes IFNα and IFNγ-activated transcriptional responses and explains mechanistic and functional overlap. Cell Mol Life Sci 2023; 80:187. [PMID: 37347298 DOI: 10.1007/s00018-023-04830-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/08/2023] [Revised: 05/09/2023] [Accepted: 06/08/2023] [Indexed: 06/23/2023]
Abstract
To understand in detail the transcriptional and functional overlap of IFN-I- and IFN-II-activated responses, we used an integrative RNAseq-ChIPseq approach in Huh7.5 cells and characterized the genome-wide role of pSTAT1, pSTAT2, IRF9 and IRF1 in time-dependent ISG expression. For the first time, our results provide detailed insight in the timely steps of IFNα- and IFNγ-induced transcription, in which pSTAT1- and pSTAT2-containing ISGF3 and GAF-like complexes and IRF1 are recruited to individual or combined ISRE and GAS composite sites in a phosphorylation- and time-dependent manner. Interestingly, composite genes displayed a more heterogeneous expression pattern, as compared to GAS (early) and ISRE genes (late), with the time- and phosphorylation-dependent recruitment of GAF, ISGF3 and IRF1 after IFNα stimulation and GAF and IRF1 after IFNγ. Moreover, functional composite genes shared features of GAS and ISRE genes through transcription factor co-binding to closely located sites, and were able to sustain IFN responsiveness in STAT1-, STAT2-, IRF9-, IRF1- and IRF9/IRF1-mutant Huh7.5 cells compared to Wt cells. Thus, the ISRE + GAS composite site acted as a molecular switch, depending on the timely available components and transcription factor complexes. Consequently, STAT1, STAT2 and IRF9 were identified as functional composite genes that are part of a positive feedback loop controlling long-term IFNα and IFNγ responses. More important, in the absence of any one of the components, the positive feedback regulation of the ISGF3 and GAF components appeared to be preserved. Together, these findings provide further insight in the existence of a novel ISRE + GAS composite-dependent intracellular amplifier circuit prolonging ISG expression and controlling cellular responsiveness to different types of IFNs and subsequent antiviral activity. It also offers an explanation for the existing molecular and functional overlap between IFN-I- and IFN-II-activated ISG expression.
Affiliation(s)
- Agata Sekrecka
- Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Katarzyna Kluzek
- Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Michal Sekrecki
- Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Mahdi Eskandarian Boroujeni
- Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Sanaz Hassani
- Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Shota Yamauchi
- Department of Genome Science and Microbiology, Faculty of Medical Sciences, University of Fukui, Fukui, Japan
- Kiyonao Sada
- Department of Genome Science and Microbiology, Faculty of Medical Sciences, University of Fukui, Fukui, Japan
- Joanna Wesoly
- High Throughput Technologies Laboratory, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Hans A R Bluyssen
- Human Molecular Genetics Research Unit, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland.

8. Zhao JY, Yuan XK, Luo RZ, Wang LX, Gu W, Yamane D, Feng H. Phospholipase A and acyltransferase 4/retinoic acid receptor responder 3 at the intersection of tumor suppression and pathogen restriction. Front Immunol 2023; 14:1107239. [PMID: 37063830 PMCID: PMC10102619 DOI: 10.3389/fimmu.2023.1107239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Accepted: 03/22/2023] [Indexed: 04/03/2023]
Abstract
Phospholipase A and acyltransferase (PLAAT) 4 is a class II tumor suppressor with phospholipid metabolizing abilities. It was characterized in late 2000s, and has since been referred to as ‘tazarotene-induced gene 3’ (TIG3) or ‘retinoic acid receptor responder 3’ (RARRES3) as a key downstream effector of retinoic acid signaling. Two decades of research have revealed the complexity of its function and regulatory roles in suppressing tumorigenesis. However, more recent findings have also identified PLAAT4 as a key anti-microbial effector enzyme acting downstream of interferon regulatory factor 1 (IRF1) and interferons (IFNs), favoring protection from virus and parasite infections. Unveiling the molecular mechanisms underlying its action may thus open new therapeutic avenues for the treatment of both cancer and infectious diseases. Herein, we aim to summarize a brief history of PLAAT4 discovery, its transcriptional regulation, and the potential mechanisms in tumor prevention and anti-pathogen defense, and discuss potential future directions of PLAAT4 research toward the development of therapeutic approaches targeting this enzyme with pleiotropic functions.
Affiliation(s)
- Jian-Yong Zhao
- Hospital of Integrated Traditional Chinese and Western Medicine, Hebei University of Chinese Medicine, Cangzhou, Hebei, China
- Xiang-Kun Yuan
- Hospital of Integrated Traditional Chinese and Western Medicine, Hebei University of Chinese Medicine, Cangzhou, Hebei, China
- Rui-Zhen Luo
- Hospital of Integrated Traditional Chinese and Western Medicine, Hebei University of Chinese Medicine, Cangzhou, Hebei, China
- Li-Xin Wang
- Hospital of Integrated Traditional Chinese and Western Medicine, Hebei University of Chinese Medicine, Cangzhou, Hebei, China
- Wei Gu
- School of Medicine, Chongqing University, Chongqing, China
- Daisuke Yamane
- Department of Diseases and Infection, Tokyo Metropolitan Institute of Medical Science, Tokyo, Japan
- *Correspondence: Hui Feng; Daisuke Yamane
- Hui Feng
- School of Medicine, Chongqing University, Chongqing, China
- *Correspondence: Hui Feng; Daisuke Yamane

9. Liu S, Cao Y, Cui K, Tang Q, Zhao K. Hi-TrAC reveals division of labor of transcription factors in organizing chromatin loops. Nat Commun 2022; 13:6679. [PMID: 36335136 PMCID: PMC9637178 DOI: 10.1038/s41467-022-34276-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/21/2022] [Accepted: 10/20/2022] [Indexed: 11/08/2022]
Abstract
The three-dimensional genomic structure plays a critical role in gene expression, cellular differentiation, and pathological conditions. It is pivotal to elucidate fine-scale chromatin architectures, especially interactions of regulatory elements, to understand the temporospatial regulation of gene expression. In this study, we report Hi-TrAC as a proximity ligation-free, robust, and sensitive technique to profile genome-wide chromatin interactions at high-resolution among regulatory elements. Hi-TrAC detects chromatin looping among accessible regions at single nucleosome resolution. With almost half-million identified loops, we reveal a comprehensive interaction network of regulatory elements across the genome. After integrating chromatin binding profiles of transcription factors, we discover that cohesin complex and CTCF are responsible for organizing long-range chromatin loops, related to domain formation; whereas ZNF143 and HCFC1 are involved in structuring short-range chromatin loops between regulatory elements, which directly regulate gene expression. Thus, we introduce a methodology to identify a delicate and comprehensive network of cis-regulatory elements, revealing the complexity and a division of labor of transcription factors in organizing chromatin loops for genome organization and gene expression.
Affiliation(s)
- Shuai Liu
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Yaqiang Cao
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Kairong Cui
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Qingsong Tang
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Keji Zhao
- Laboratory of Epigenome Biology, Systems Biology Center, Division of Intramural Research, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, USA

10. Zhao Y. TFSyntax: a database of transcription factors binding syntax in mammalian genomes. Nucleic Acids Res 2022; 51:D306-D314. [PMID: 36200824 PMCID: PMC9825613 DOI: 10.1093/nar/gkac849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Revised: 09/10/2022] [Accepted: 09/21/2022] [Indexed: 01/29/2023]
Abstract
In mammals, transcriptional factors (TFs) drive gene expression by binding to regulatory elements in a cooperative manner. Deciphering the rules of such cooperation is crucial to obtain a full understanding of cellular homeostasis and development. Although this is a long-standing topic, there is no comprehensive database for biologists to access the syntax of TF binding sites. Here we present TFSyntax (https://tfsyntax.zhaopage.com), a database focusing on the arrangement of TF binding sites. TFSyntax maps the binding motif of 1299 human TFs and 890 mouse TFs across 382 cells and tissues, representing the most comprehensive TF binding map to date. In addition to location, TFSyntax defines motif positional preference, density and colocalization within accessible elements. Powered by a series of functional modules based on web interface, users can freely search, browse, analyze, and download data of interest. With comprehensive characterization of TF binding syntax across distinct tissues and cell types, TFSyntax represents a valuable resource and platform for studying the mechanism of transcriptional regulation and exploring how regulatory DNA variants cause disease.
Affiliation(s)
- Yongbing Zhao
- To whom correspondence should be addressed. Tel: +1 301 480 5852;

11. Biswas A, Narlikar L. A universal framework for detecting cis-regulatory diversity in DNA regulatory regions. Genome Res 2021; 31:1646-1662. [PMID: 34285090 PMCID: PMC8415372 DOI: 10.1101/gr.274563.120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/22/2020] [Accepted: 07/09/2021] [Indexed: 12/02/2022]
Abstract
High-throughput sequencing-based assays measure different biochemical activities pertaining to gene regulation, genome-wide. These activities include transcription factor (TF)–DNA binding, enhancer activity, open chromatin, and more. A major goal is to understand underlying sequence components, or motifs, that can explain the measured activity. It is usually not one motif but a combination of motifs bound by cooperatively acting proteins that confers activity to such regions. Furthermore, regions can be diverse, governed by different combinations of TFs/motifs. Current approaches do not take into account this issue of combinatorial diversity. We present a new statistical framework, cisDIVERSITY, which models regions as diverse modules characterized by combinations of motifs while simultaneously learning the motifs themselves. Because cisDIVERSITY does not rely on knowledge of motifs, modules, cell type, or organism, it is general enough to be applied to regions reported by most high-throughput assays. For example, in enhancer predictions resulting from different assays—GRO-cap, STARR-seq, and those measuring chromatin structure—cisDIVERSITY discovers distinct modules and combinations of TF binding sites, some specific to the assay. From protein–DNA binding data, cisDIVERSITY identifies potential cofactors of the profiled TF, whereas from ATAC-seq data, it identifies tissue-specific regulatory modules. Finally, analysis of single-cell ATAC-seq data suggests that regions open in one cell-state encode information about future states, with certain modules staying open and others closing down in the next time point.
Affiliation(s)
- Anushua Biswas
- CSIR-National Chemical Laboratory, Academy of Scientific and Innovative Research
- Leelavati Narlikar
- CSIR-National Chemical Laboratory, Academy of Scientific and Innovative Research

12. NF-Y Subunits Overexpression in HNSCC. Cancers (Basel) 2021; 13:cancers13123019. [PMID: 34208636 PMCID: PMC8234210 DOI: 10.3390/cancers13123019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/06/2021] [Revised: 05/31/2021] [Accepted: 06/06/2021] [Indexed: 12/14/2022]
Abstract
Simple Summary: Cancer cells have altered gene expression profiles. This is ultimately elicited by altered structure, expression or binding of transcription factors to regulatory regions of genomes. The CCAAT-binding trimer is a pioneer transcription factor involved in the activation of “cancer” genes. We and others have shown that the regulatory NF-YA subunit is overexpressed in epithelial cancers. Here, we examined large datasets of bulk gene expression profiles, as well as single-cell data, in head and neck squamous cell carcinomas by bioinformatic methods. We partitioned tumors according to molecular subtypes, mutations and positivity for HPV. We came to the conclusion that high levels of the histone-like subunits and the “short” NF-YAs isoform are protective in HPV-positive tumors. On the other hand, high levels of the “long” NF-YAl were found in the recently identified aggressive and metastasis-prone cell population undergoing partial epithelial to mesenchymal transition, p-EMT.
Abstract: NF-Y is the CCAAT-binding trimer formed by the histone fold domain (HFD), NF-YB/NF-YC and NF-YA. The CCAAT box is generally prevalent in promoters of “cancer” genes. We reported the overexpression of NF-YA in BRCA, LUAD and LUSC, and of all subunits in HCC. Altered splicing of NF-YA was found in breast and lung cancer. We analyzed RNA-seq datasets of TCGA and cell lines of head and neck squamous cell carcinomas (HNSCC). We partitioned all TCGA data into four subtypes, deconvoluted single-cell RNA-seq of tumors and derived survival curves. The CCAAT box was enriched in the promoters of overexpressed genes. The “short” NF-YAs was overexpressed in all subtypes and the “long” NF-YAl in Mesenchymal. The HFD subunits are overexpressed, except Basal (NF-YB) and Atypical (NF-YC); NF-YAl is increased in p53 mutated tumors. In HPV-positive tumors, high levels of NF-YAs, p16 and ΔNp63 correlate with better prognosis. Deconvolution of single cell RNA-seq (scRNA-seq) found a correlation of NF-YAl with Cancer Associated Fibroblasts (CAFs) and p-EMT cells, a population endowed with metastatic potential. We conclude that overexpression of HFD subunits and NF-YAs is protective in HPV-positive tumors; expression of NF-YAl is largely confined to mutp53 tumors and malignant p-EMT cells.
13
Spiegel J, Cuesta SM, Adhikari S, Hänsel-Hertsch R, Tannahill D, Balasubramanian S. G-quadruplexes are transcription factor binding hubs in human chromatin. Genome Biol 2021; 22:117. [PMID: 33892767 PMCID: PMC8063395 DOI: 10.1186/s13059-021-02324-z] [Citation(s) in RCA: 109] [Impact Index Per Article: 36.3] [Received: 09/30/2020] [Accepted: 03/24/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND: The binding of transcription factors (TFs) to genomic targets is critical in the regulation of gene expression. Short, double-stranded DNA sequence motifs are routinely implicated in TF recruitment, but many questions remain on how binding site specificity is governed. RESULTS: Herein, we reveal a previously unappreciated role for DNA secondary structures as key features for TF recruitment. In a systematic, genome-wide study, we discover that endogenous G-quadruplex secondary structures (G4s) are prevalent TF binding sites in human chromatin. Certain TFs bind G4s with affinities comparable to double-stranded DNA targets. We demonstrate that, in a chromatin context, this binding interaction is competed out with a small molecule. Notably, endogenous G4s are prominent binding sites for a large number of TFs, particularly at promoters of highly expressed genes. CONCLUSIONS: Our results reveal a novel non-canonical mechanism for TF binding whereby G4s operate as common binding hubs for many different TFs to promote increased transcription.
Affiliation(s)
- Jochen Spiegel
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
- Sergio Martínez Cuesta
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
  - Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
  - Present Address: Data Sciences and Quantitative Biology, Discovery Sciences, AstraZeneca, Cambridge, UK
- Santosh Adhikari
  - Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
- Robert Hänsel-Hertsch
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
  - Present Address: Center for Molecular Medicine Cologne, University of Cologne, 50931, Cologne, Germany
- David Tannahill
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
- Shankar Balasubramanian
  - Cancer Research UK Cambridge Institute, Li Ka Shing Centre, Robinson Way, Cambridge, CB2 0RE, UK
  - Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
  - School of Clinical Medicine, University of Cambridge, Cambridge, CB2 0SP, UK
14
Seo J, Koçak DD, Bartelt LC, Williams CA, Barrera A, Gersbach CA, Reddy TE. AP-1 subunits converge promiscuously at enhancers to potentiate transcription. Genome Res 2021; 31:538-550. [PMID: 33674350 PMCID: PMC8015846 DOI: 10.1101/gr.267898.120] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/25/2020] [Accepted: 02/17/2021] [Indexed: 12/12/2022]
Abstract
The AP-1 transcription factor (TF) dimer contributes to many biological processes and environmental responses. AP-1 can be composed of many interchangeable subunits. Unambiguously determining the binding locations of these subunits in the human genome is challenging because of variable antibody specificity and affinity. Here, we definitively establish the genome-wide binding patterns of five AP-1 subunits by using CRISPR to introduce a common antibody tag on each subunit. We find limited evidence for strong dimerization preferences between subunits at steady state and find that, under a stimulus, dimerization patterns reflect changes in the transcriptome. Further, our analysis suggests that canonical AP-1 motifs indiscriminately recruit all AP-1 subunits to genomic sites, which we term AP-1 hotspots. We find that AP-1 hotspots are predictive of cell type–specific gene expression and of genomic responses to glucocorticoid signaling (more so than super-enhancers) and are significantly enriched in disease-associated genetic variants. Together, these results support a model where promiscuous binding of many AP-1 subunits to the same genomic location plays a key role in regulating cell type–specific gene expression and environmental responses.
Affiliation(s)
- Jungkyun Seo
  - Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical Center, Durham, North Carolina 27708, USA
  - Computational Biology and Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
- D Dewran Koçak
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
  - Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA
- Luke C Bartelt
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27708, USA
- Courtney A Williams
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
- Alejandro Barrera
  - Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical Center, Durham, North Carolina 27708, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
- Charles A Gersbach
  - Computational Biology and Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
  - Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA
  - University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27708, USA
  - Department of Surgery, Duke University Medical Center, Durham, North Carolina 27708, USA
- Timothy E Reddy
  - Department of Biostatistics and Bioinformatics, Division of Integrative Genomics, Duke University Medical Center, Durham, North Carolina 27708, USA
  - Computational Biology and Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA
  - Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Advanced Genomic Technologies, Duke University, Durham, North Carolina 27708, USA
  - Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA
  - University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27708, USA
  - Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina 27708, USA
15
Feng H, Zhang YB, Gui JF, Lemon SM, Yamane D. Interferon regulatory factor 1 (IRF1) and anti-pathogen innate immune responses. PLoS Pathog 2021; 17:e1009220. [PMID: 33476326 PMCID: PMC7819612 DOI: 10.1371/journal.ppat.1009220] [Citation(s) in RCA: 122] [Impact Index Per Article: 40.7] [Indexed: 12/25/2022] Open
Abstract
The eponymous member of the interferon regulatory factor (IRF) family, IRF1, was originally identified as a nuclear factor that binds and activates the promoters of type I interferon genes. However, subsequent studies using genetic knockouts or RNAi-mediated depletion of IRF1 provide a much broader view, linking IRF1 to a wide range of functions in protection against invading pathogens. Conserved throughout vertebrate evolution, IRF1 has been shown in recent years to mediate constitutive as well as inducible host defenses against a variety of viruses. Fine-tuning of these ancient IRF1-mediated host defenses, and countering strategies by pathogens to disarm IRF1, play crucial roles in pathogenesis and determining the outcome of infection.
Affiliation(s)
- Hui Feng
  - Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Hebei Province Cangzhou Hospital of Integrated Traditional Chinese and Western Medicine, Cangzhou, Hebei, China
- Yi-Bing Zhang
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, China
- Jian-Fang Gui
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, China
- Stanley M. Lemon
  - Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - Department of Microbiology & Immunology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
  - * E-mail: (SML); (DY)
- Daisuke Yamane
  - Department of Diseases and Infection, Tokyo Metropolitan Institute of Medical Science, Setagaya-ku, Tokyo, Japan
  - * E-mail: (SML); (DY)
16
Ronzio M, Bernardini A, Pavesi G, Mantovani R, Dolfini D. On the NF-Y regulome as in ENCODE (2019). PLoS Comput Biol 2020; 16:e1008488. [PMID: 33370256 PMCID: PMC7793273 DOI: 10.1371/journal.pcbi.1008488] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/05/2020] [Revised: 01/08/2021] [Accepted: 11/04/2020] [Indexed: 11/19/2022] Open
Abstract
NF-Y is a trimeric transcription factor (TF) that binds with high selectivity to the conserved CCAAT element. Individual ChIP-seq analyses as well as ENCODE have progressively identified locations shared with other TFs. Here, we have analyzed data introduced by ENCODE over the last five years in K562, HeLa-S3 and GM12878, including several chromatin features, as well as RNA-seq profiling of HeLa cells after NF-Y inactivation. We double the number of sequence-specific TFs and co-factors reported. We catalogue them in 4 classes based on co-association criteria, infer target gene categorizations, and identify positional bias of binding sites and gene expression changes. Larger and novel co-associations emerge, specifically concerning subunits of repressive complexes as well as RNA-binding proteins. On the one hand, these data better define NF-Y association with single members of major classes of TFs; on the other, they suggest that it might have a wider role in the control of mRNA production.
Author summary: The ongoing ENCODE consortium represents a useful compendium of locations of TFs, chromatin marks and gene expression data. In previous reports, we identified modules of CCAAT-binding NF-Y with individual TFs. Here, we analyzed all 363 factors currently present: 68 with enrichment of CCAAT in their locations, 38 with overlap of peaks. New sequence-specific TFs, co-activators and co-repressors are reported. Co-association patterns correspond to specific target gene categorizations and gene expression changes, as assessed by RNA-seq after NF-Y inactivation. These data widen and better define a coherent model of synergy of NF-Y with selected groups of TFs and co-factors.
Affiliation(s)
- Mirko Ronzio
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italy
- Andrea Bernardini
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italy
- Giulio Pavesi
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italy
- Roberto Mantovani
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italy
- Diletta Dolfini
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Milano, Italy
  - * E-mail:
17
Bezzecchi E, Ronzio M, Mantovani R, Dolfini D. NF-Y Overexpression in Liver Hepatocellular Carcinoma (HCC). Int J Mol Sci 2020; 21:E9157. [PMID: 33271832 PMCID: PMC7731131 DOI: 10.3390/ijms21239157] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/06/2020] [Revised: 11/24/2020] [Accepted: 11/24/2020] [Indexed: 12/24/2022] Open
Abstract
NF-Y is a pioneer trimeric transcription factor formed by the Histone Fold Domain (HFD) NF-YB/NF-YC subunits and NF-YA. All three subunits are required for DNA binding. CCAAT-specificity resides in NF-YA and transactivation resides in Q-rich domains of NF-YA and NF-YC. They are involved in alternative splicing (AS). We recently showed that NF-YA is overexpressed in breast and lung carcinomas. We report here on the overexpression of all subunits in the liver hepatocellular carcinoma (HCC) TCGA database, specifically the short NF-YAs and NF-YC2 (37 kDa) isoforms. This is observed at all tumor stages, in viral-infected samples and independently of the inflammatory status. Up-regulation of NF-YAs and NF-YC, but not NF-YB, is associated with tumors with mutant p53. We used a deep-learning-based method (DeepCC) to extend the partitioning of the three molecular clusters to all HCC TCGA tumors. In iCluster3, CCAAT is a primary matrix found in promoters of up-regulated genes, and cell-cycle pathways are enriched. Finally, clinical data indicate that, globally, only NF-YAs, but not HFD subunits, correlate with the worst prognosis; in iCluster1 patients, however, all subunits correlate. The data show a difference with other epithelial cancers, in that global overexpression of the three subunits is reported and clinically relevant in a subset of patients; yet, they further reinstate the regulatory role of the sequence-specific subunit.
Affiliation(s)
- Diletta Dolfini
  - Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy; (E.B.); (M.R.); (R.M.)
18
Yamada N, Rossi MJ, Farrell N, Pugh BF, Mahony S. Alignment and quantification of ChIP-exo crosslinking patterns reveal the spatial organization of protein-DNA complexes. Nucleic Acids Res 2020; 48:11215-11226. [PMID: 32747934 PMCID: PMC7672471 DOI: 10.1093/nar/gkaa618] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/09/2019] [Revised: 06/25/2020] [Accepted: 07/13/2020] [Indexed: 12/12/2022] Open
Abstract
The ChIP-exo assay precisely delineates protein-DNA crosslinking patterns by combining chromatin immunoprecipitation with 5' to 3' exonuclease digestion. Within a regulatory complex, the physical distance of a regulatory protein to DNA affects crosslinking efficiencies. Therefore, the spatial organization of a protein-DNA complex could potentially be inferred by analyzing how crosslinking signatures vary between its subunits. Here, we present a computational framework that aligns ChIP-exo crosslinking patterns from multiple proteins across a set of coordinately bound regulatory regions, and which detects and quantifies protein-DNA crosslinking events within the aligned profiles. By producing consistent measurements of protein-DNA crosslinking strengths across multiple proteins, our approach enables characterization of relative spatial organization within a regulatory complex. Applying our approach to collections of ChIP-exo data, we demonstrate that it can recover aspects of regulatory complex spatial organization at yeast ribosomal protein genes and yeast tRNA genes. We also demonstrate the ability to quantify changes in protein-DNA complex organization across conditions by applying our approach to analyze Drosophila Pol II transcriptional components. Our results suggest that principled analyses of ChIP-exo crosslinking patterns enable inference of spatial organization within protein-DNA complexes.
Affiliation(s)
- Naomi Yamada
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Matthew J Rossi
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Nina Farrell
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- B Franklin Pugh
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Shaun Mahony
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
19
Huning L, Kunkel GR. The ubiquitous transcriptional protein ZNF143 activates a diversity of genes while assisting to organize chromatin structure. Gene 2020; 769:145205. [PMID: 33031894 DOI: 10.1016/j.gene.2020.145205] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/15/2020] [Revised: 09/24/2020] [Accepted: 09/29/2020] [Indexed: 10/23/2022]
Abstract
Zinc Finger Protein 143 (ZNF143) is a pervasive C2H2 zinc-finger transcriptional activator protein regulating the efficiency of eukaryotic promoter regions. ZNF143 is able to activate transcription at both protein-coding genes and small RNA genes transcribed by either RNA polymerase II or RNA polymerase III. Target genes regulated by ZNF143 are involved in an array of different cellular processes including both cancer and development. Although ZNF143 is a key player in regulating eukaryotic genes, the molecular mechanism by which it binds and activates genes transcribed by two different polymerases is still relatively unknown. In addition to its role as a transcriptional regulator, recent genomics experiments have implicated ZNF143 as a potential co-factor involved in chromatin looping and establishing higher order structure within the genome. This review focuses primarily on possible activation mechanisms of promoters by ZNF143, with less emphasis on the role of ZNF143 in cancer and development, and its function in establishing higher order chromatin contacts within the genome.
Affiliation(s)
- Laura Huning
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843-2128, USA
- Gary R Kunkel
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843-2128, USA
20
Hu G, Dong X, Gong S, Song Y, Hutchins AP, Yao H. Systematic screening of CTCF binding partners identifies that BHLHE40 regulates CTCF genome-wide distribution and long-range chromatin interactions. Nucleic Acids Res 2020; 48:9606-9620. [PMID: 32885250 PMCID: PMC7515718 DOI: 10.1093/nar/gkaa705] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 01/01/2020] [Revised: 07/27/2020] [Accepted: 08/14/2020] [Indexed: 11/14/2022] Open
Abstract
CTCF plays a pivotal role in mediating chromatin interactions, but it does not do so alone. A number of factors have been reported to co-localize with CTCF and regulate CTCF loops, but no comprehensive analysis of binding partners has been performed. This prompted us to identify CTCF loop participants and regulators by co-localization analysis with CTCF. We screened all factors that had ChIP-seq data in humans by co-localization analysis with human super conserved CTCF (hscCTCF) binding sites, and identified many new factors that overlapped with hscCTCF binding sites. Combined with CTCF loop information, we observed that clustered factors could promote CTCF loops. After in-depth mining of each factor, we found that many factors might have the potential to promote CTCF loops. Our data further demonstrated that BHLHE40 affected CTCF loops by regulating CTCF binding. Together, this study revealed that many factors have the potential to participate in or regulate CTCF loops, and discovered a new role for BHLHE40 in modulating CTCF loop formation.
Affiliation(s)
- Gongcheng Hu
  - CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou Medical University, Guangzhou 510530, China
  - Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
  - Bioland Laboratory (Guangzhou Regenerative Medicine and Health GuangDong Laboratory), Guangzhou 510005, China
  - Institute of Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaotao Dong
  - CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou Medical University, Guangzhou 510530, China
  - Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
  - Bioland Laboratory (Guangzhou Regenerative Medicine and Health GuangDong Laboratory), Guangzhou 510005, China
  - Institute of Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Shixin Gong
  - CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou Medical University, Guangzhou 510530, China
  - Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
  - Bioland Laboratory (Guangzhou Regenerative Medicine and Health GuangDong Laboratory), Guangzhou 510005, China
  - Institute of Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yawei Song
  - CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou Medical University, Guangzhou 510530, China
  - Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
  - Bioland Laboratory (Guangzhou Regenerative Medicine and Health GuangDong Laboratory), Guangzhou 510005, China
  - Institute of Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China
- Andrew P Hutchins
  - Department of Biology, Southern University of Science and Technology, Shenzhen 518055, China
- Hongjie Yao
  - CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou Medical University, Guangzhou 510530, China
  - Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
  - Bioland Laboratory (Guangzhou Regenerative Medicine and Health GuangDong Laboratory), Guangzhou 510005, China
  - Institute of Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
21
Andreani T, Albrecht S, Fontaine JF, Andrade-Navarro MA. Computational identification of cell-specific variable regions in ChIP-seq data. Nucleic Acids Res 2020; 48:e53. [PMID: 32187374 PMCID: PMC7229859 DOI: 10.1093/nar/gkaa180] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/01/2019] [Revised: 02/04/2020] [Accepted: 03/10/2020] [Indexed: 11/25/2022] Open
Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is used to identify genome-wide DNA regions bound by proteins. Given one ChIP-seq experiment with replicates, binding sites not observed in all the replicates will usually be interpreted as noise and discarded. However, the recent discovery of high-occupancy target (HOT) regions suggests that there are regions where binding of multiple transcription factors can be identified. To investigate ChIP-seq variability, we developed a reproducibility score and a method that identifies cell-specific variable regions in ChIP-seq data by integrating replicated ChIP-seq experiments for multiple protein targets on a particular cell type. Using our method, we found variable regions in human cell lines K562, GM12878, HepG2, MCF-7 and in mouse embryonic stem cells (mESCs). These variable-occupancy target regions (VOTs) are CG dinucleotide rich, and show enrichment at promoters and R-loops. They overlap significantly with HOT regions, but are not blacklisted regions producing non-specific binding ChIP-seq peaks. Furthermore, in mESCs, VOTs are conserved among placental species suggesting that they could have a function important for this taxon. Our method can be useful to point to such regions along the genome in a given cell type of interest, to improve the downstream interpretative analysis before follow-up experiments.
Affiliation(s)
- Tommaso Andreani
  - Faculty of Biology, Johannes Gutenberg University of Mainz, 55128 Mainz, Germany
  - Institute of Molecular Biology (IMB), 55128 Mainz, Germany
- Steffen Albrecht
  - Faculty of Biology, Johannes Gutenberg University of Mainz, 55128 Mainz, Germany
- Jean-Fred Fontaine
  - Faculty of Biology, Johannes Gutenberg University of Mainz, 55128 Mainz, Germany
22
Osmala M, Lähdesmäki H. Enhancer prediction in the human genome by probabilistic modelling of the chromatin feature patterns. BMC Bioinformatics 2020; 21:317. [PMID: 32689977 PMCID: PMC7370432 DOI: 10.1186/s12859-020-03621-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/10/2019] [Accepted: 06/19/2020] [Indexed: 12/11/2022] Open
Abstract
Background: The binding sites of transcription factors (TFs) and the localisation of histone modifications in the human genome can be quantified by the chromatin immunoprecipitation assay coupled with next-generation sequencing (ChIP-seq). The resulting chromatin feature data have been successfully adopted for genome-wide enhancer identification by several unsupervised and supervised machine learning methods. However, the current methods predict different numbers and different sets of enhancers for the same cell type and do not utilise the pattern of the ChIP-seq coverage profiles efficiently. Results: In this work, we propose a PRobabilistic Enhancer PRedictIoN Tool (PREPRINT) that assumes characteristic coverage patterns of chromatin features at enhancers and employs a statistical model to account for their variability. PREPRINT defines probabilistic distance measures to quantify the similarity of the genomic query regions and the characteristic coverage patterns. The probabilistic scores of the enhancer and non-enhancer samples are utilised to train a kernel-based classifier. The performance of the method is demonstrated on ENCODE data for two cell lines. The predicted enhancers are computationally validated based on the transcriptional regulatory protein binding sites and compared to the predictions obtained by state-of-the-art methods. Conclusion: PREPRINT compares favorably with the state-of-the-art methods, especially when the methods are required to predict a larger set of enhancers. PREPRINT generalises successfully to data from cell types not utilised for training, and often performs better than the previous methods. The PREPRINT enhancers are less sensitive to the choice of prediction threshold. PREPRINT identifies biologically validated enhancers not predicted by the competing methods. The enhancers predicted by PREPRINT can aid the genome interpretation in functional genomics and clinical studies.
Affiliation(s)
- Maria Osmala
  - Department of Computer Science, Aalto University, Konemiehentie 2, Espoo, 02150, Finland
- Harri Lähdesmäki
  - Department of Computer Science, Aalto University, Konemiehentie 2, Espoo, 02150, Finland
23
IGAP-integrative genome analysis pipeline reveals new gene regulatory model associated with nonspecific TF-DNA binding affinity. Comput Struct Biotechnol J 2020; 18:1270-1286. [PMID: 32612751 PMCID: PMC7303559 DOI: 10.1016/j.csbj.2020.05.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2020] [Revised: 05/17/2020] [Accepted: 05/19/2020] [Indexed: 11/23/2022] Open
Abstract
The human genome is regulated in a multi-dimensional way. While biophysical factors like Non-specific Transcription factor Binding Affinity (nTBA) act at the DNA sequence level, other factors, such as histone modifications and 3-D chromosomal interactions, act above the sequence level. Because of this multidimensionality, a proper understanding of the regulatory landscape of the human genome requires considering many of these factors together. Here, we propose a new biophysical model for estimating nTBA. Integration of nTBA with chromatin modifications and chromosomal interactions, using a new Integrative Genome Analysis Pipeline (IGAP), reveals additive effects of nTBA to regulatory DNA sequences and identifies three types of genomic zones in the human genome (Inactive Genomic Zones, Poised Genomic Zones, and Active Genomic Zones). It also unveils a novel long-distance gene regulatory model: chromosomal interactions reduce the physical distance between high-occupancy target (HOT) regions, which results in high nTBA to DNA in the area, which in turn attracts TFs to such regions with higher binding potential. These findings will help to elucidate the three-dimensional diffusion process that TFs use during their search for the right targets.
|
24
|
Jiang D, Deng J, Dong C, Ma X, Xiao Q, Zhou B, Yang C, Wei L, Conran C, Zheng SL, Ng IOL, Yu L, Xu J, Sham PC, Qi X, Hou J, Ji Y, Cao G, Li M. Knowledge-based analyses reveal new candidate genes associated with risk of hepatitis B virus related hepatocellular carcinoma. BMC Cancer 2020; 20:403. [PMID: 32393195 PMCID: PMC7216662 DOI: 10.1186/s12885-020-06842-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/31/2019] [Accepted: 04/07/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Recent genome-wide association studies (GWASs) have suggested several susceptibility loci of hepatitis B virus (HBV)-related hepatocellular carcinoma (HCC) by statistical analysis at individual single-nucleotide polymorphisms (SNPs). However, these loci only explain a small fraction of HBV-related HCC heritability. In the present study, we aimed to identify additional susceptibility loci of HBV-related HCC using advanced knowledge-based analysis. METHODS: We performed knowledge-based analysis (including gene- and gene-set-based association tests) on variant-level association p-values from two existing GWASs of HBV-related HCC. Five different types of gene-sets were collected for the association analysis. A number of SNPs within the genes prioritized by the knowledge-based association tests were selected to replicate genetic associations in an independent sample of 965 cases and 923 controls. RESULTS: The gene-based association analysis detected four genes significantly or suggestively associated with HBV-related HCC risk: SLC39A8, GOLGA8M, SMIM31, and WHAMMP2. The gene-set-based association analysis prioritized two promising gene sets for HCC, cell cycle G1/S transition and NOTCH1 intracellular domain regulates transcription. Within the gene sets, three promising candidate genes (CDC45, NCOR1 and KAT2A) were further prioritized for HCC. Among genes of liver-specific expression, multiple genes previously implicated in HCC were also highlighted. However, probably due to small sample size, none of the genes prioritized by the knowledge-based association analyses were successfully replicated by variant-level association test in the independent sample. CONCLUSIONS: This comprehensive knowledge-based association mining study suggested several promising genes and gene-sets associated with HBV-related HCC risks, which would facilitate follow-up functional studies on the pathogenic mechanism of HCC.
Affiliation(s)
- Deke Jiang
- State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institutes of Liver Diseases Research of Guangdong Province, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Jiaen Deng
- Department of Psychiatry, the University of Hong Kong, Pokfulam, Hong Kong
| | | | - Xiaopin Ma
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China
| | - Qianyi Xiao
- Center for Genomic Translational Medicine and Prevention, School of Public Health, Fudan University, Shanghai, China
| | - Bin Zhou
- State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institutes of Liver Diseases Research of Guangdong Province, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Chou Yang
- State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institutes of Liver Diseases Research of Guangdong Province, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Lin Wei
- Program of Computational Genomics & Medicine, NorthShore University HealthSystem, Evanston, IL, USA; Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
| | - Carly Conran
- Program for Personalized Cancer Care, NorthShore University HealthSystem, Pritzker School of Medicine, University of Chicago, Evanston, IL, USA
| | - S Lilly Zheng
- Program of Computational Genomics & Medicine, NorthShore University HealthSystem, Evanston, IL, USA
| | - Irene Oi-Lin Ng
- Department of Pathology, the University of Hong Kong, Pokfulam, Hong Kong; State Key Laboratory of Liver Research, the University of Hong Kong, Pokfulam, Hong Kong
| | - Long Yu
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China
| | - Jianfeng Xu
- Program of Computational Genomics & Medicine, NorthShore University HealthSystem, Evanston, IL, USA
| | - Pak C Sham
- The Centre for Genomic Sciences, the University of Hong Kong, Pokfulam, Hong Kong
| | - Xiaolong Qi
- State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institutes of Liver Diseases Research of Guangdong Province, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Jinlin Hou
- State Key Laboratory of Organ Failure Research, Guangdong Key Laboratory of Viral Hepatitis Research, Institutes of Liver Diseases Research of Guangdong Province, Department of Infectious Diseases and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
| | - Yuan Ji
- Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
| | - Guangwen Cao
- Department of Epidemiology, Second Military Medical University, Shanghai, China.
| | - Miaoxin Li
- Department of Psychiatry, the University of Hong Kong, Pokfulam, Hong Kong; The Centre for Genomic Sciences, the University of Hong Kong, Pokfulam, Hong Kong; State Key Laboratory for Cognitive and Brain Sciences, the University of Hong Kong, Pokfulam, Hong Kong; Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China; Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangzhou, China.
|
25
|
Wei X, Zhu X, Jiang L, Huang X, Zhang Y, Zhao D, Du Y. Recent advances in understanding the role of hypoxia-inducible factor 1α in renal fibrosis. Int Urol Nephrol 2020; 52:1287-1295. [PMID: 32378138 DOI: 10.1007/s11255-020-02474-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/30/2019] [Accepted: 04/15/2020] [Indexed: 12/11/2022]
Abstract
Renal fibrosis is the most common pathological manifestation of chronic kidney disease (CKD), and with numerous influencing factors, its pathogenesis is complex. Epithelial-mesenchymal transition (EMT) is known to promote the progression of renal fibrosis via alterations in the secreted proteome. Moreover, blocking or even reversing EMT can effectively reduce the degree of fibrosis. As such, targeting the key molecules responsible for promoting EMT may be an effective strategy for inhibiting renal fibrosis. Research in recent years has demonstrated that hypoxia-inducible factor 1α (HIF-1α) acts to promote renal fibrosis through regulation of EMT. However, the relationship between HIF-1α and EMT remains incompletely understood. In the present review, the underlying mechanism of the interaction between HIF-1α and EMT is explored to provide novel insight into the pathogenesis of renal fibrosis and new ideas for early targeted intervention.
Affiliation(s)
- Xuejiao Wei
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China
| | - Xiaoyu Zhu
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China
| | - Lili Jiang
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China
| | - Xiu Huang
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China
| | - Yangyang Zhang
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China
| | - Dan Zhao
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China
| | - Yujun Du
- Department of Nephrology, The First Hospital of Jilin University, 71 XinMin Street, Changchun, Jilin, China.
|
26
|
SNP rs17079281 decreases lung cancer risk through creating an YY1-binding site to suppress DCBLD1 expression. Oncogene 2020; 39:4092-4102. [PMID: 32231272 PMCID: PMC7220863 DOI: 10.1038/s41388-020-1278-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/27/2019] [Revised: 03/13/2020] [Accepted: 03/17/2020] [Indexed: 12/24/2022]
Abstract
Genome-wide association studies (GWAS) have identified numerous genetic variants that are associated with lung cancer risk, but the biological mechanisms underlying these associations remain largely unknown. Here we investigated the functional relevance of a genetic region in 6q22.2 which was identified to be associated with lung cancer risk in our previous GWAS. We performed linkage disequilibrium (LD) analysis and bioinformatic prediction to screen functional SNPs linked to a tagSNP in 6q22.2 loci, followed by two case-control studies and a meta-analysis with 4403 cases and 5336 controls to identify if these functional SNPs were associated with lung cancer risk. A novel SNP rs17079281 in the DCBLD1 promoter was identified to be associated with lung cancer risk in Chinese populations. Compared with those with C allele, patients with T allele had lower risk of adenocarcinoma (adjusted OR = 0.86; 95% CI: 0.80–0.92), but not squamous cell carcinoma (adjusted OR = 0.99; 95% CI: 0.91–1.10), and patients with the C/T or T/T genotype had lower levels of DCBLD1 expression than those with C/C genotype in lung adenocarcinoma tissues. We performed functional assays to characterize its biological relevance. The results showed that the T allele of rs17079281 had higher binding affinity to transcription factor YY1 than the C allele, which suppressed DCBLD1 expression. DCBLD1 behaved like an oncogene, promoting tumor growth by influencing cell cycle progression. These findings suggest that the functional variant rs17079281C>T decreased lung adenocarcinoma risk by creating an YY1-binding site to suppress DCBLD1 expression, which may serve as a biomarker for assessing lung cancer susceptibility.
|
27
|
Ronzio M, Zambelli F, Dolfini D, Mantovani R, Pavesi G. Integrating Peak Colocalization and Motif Enrichment Analysis for the Discovery of Genome-Wide Regulatory Modules and Transcription Factor Recruitment Rules. Front Genet 2020; 11:72. [PMID: 32153638 PMCID: PMC7046753 DOI: 10.3389/fgene.2020.00072] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/07/2019] [Accepted: 01/22/2020] [Indexed: 12/14/2022] Open
Abstract
Chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) has opened new avenues of research in the genome-wide characterization of regulatory DNA-protein interactions at the genetic and epigenetic level. As a consequence, it has become the de facto standard for studies on the regulation of transcription, and literally thousands of data sets for transcription factors and cofactors in different conditions and species are now available to the scientific community. However, while pipelines and best practices have been established for the analysis of a single experiment, there is still no consensus on the best way to perform an integrated analysis of multiple datasets in the same condition, in order to identify the most relevant and widespread regulatory modules composed by different transcription factors and cofactors. We present here a computational pipeline for this task, that integrates peak summit colocalization, a novel statistical framework for the evaluation of its significance, and motif enrichment analysis. We show examples of its application to ENCODE data, that led to the identification of relevant regulatory modules composed of different factors, as well as the organization on DNA of the binding motifs responsible for their recruitment.
Affiliation(s)
- Mirko Ronzio
- Dipartimento di Bioscienze, Università di Milano, Milan, Italy
| | | | - Diletta Dolfini
- Dipartimento di Bioscienze, Università di Milano, Milan, Italy
| | | | - Giulio Pavesi
- Dipartimento di Bioscienze, Università di Milano, Milan, Italy
|
28
|
Diwadkar AR, Kan M, Himes BE. Facilitating Analysis of Publicly Available ChIP-Seq Data for Integrative Studies. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2020; 2019:371-379. [PMID: 32308830 PMCID: PMC7153109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
ChIP-Seq, a technique that allows for quantification of DNA sequences bound by transcription factors or histones, has been widely used to characterize genome-wide DNA-protein binding at baseline and induced by specific exposures. Integrating results of multiple ChIP-Seq datasets is a convenient approach to identify robust DNA-protein binding sites and determine their cell-type specificity. We developed brocade, a computational pipeline for reproducible analysis of publicly available ChIP-Seq data that creates R markdown reports containing information on datasets downloaded, quality control metrics, and differential binding results. Glucocorticoids are commonly used anti-inflammatory drugs with tissue-specific effects that are not fully understood. We demonstrate the utility of brocade via the analysis of five ChIP-Seq datasets involving glucocorticoid receptor (GR), a transcription factor that mediates glucocorticoid response, to identify cell type-specific and shared GR binding sites across the five cell types. Our results show that brocade facilitates analysis of individual ChIP-Seq datasets and comparative studies involving multiple datasets.
Affiliation(s)
- Avantika R Diwadkar
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, US
| | - Mengyuan Kan
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, US
| | - Blanca E Himes
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, US
|
29
|
Liu H, Zhao H, Lin H, Li Z, Xue H, Zhang Y, Lu J. Relationship of COL9A1 and SOX9 Genes with Genetic Susceptibility of Postmenopausal Osteoporosis. Calcif Tissue Int 2020; 106:248-255. [PMID: 31732751 DOI: 10.1007/s00223-019-00629-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/06/2019] [Accepted: 10/29/2019] [Indexed: 12/19/2022]
Abstract
As one of the most common types of osteoporosis, postmenopausal osteoporosis (PMOP) is caused by both genetic and environmental factors. Previous studies have indicated that SOX9 activity is tightly regulated to ensure normal bone mineral density (BMD) in the adult skeleton, and the COL9A1 promoter region can be transactivated by SOX9. In this study, we aimed to investigate the potential association between PMOP and the COL9A1 and SOX9 genes. A total of 10,443 postmenopausal women, including 2288 patients and 3557 controls in the discovery stage and 1566 patients and 3032 controls in the validation stage, were recruited. Forty-three tag SNPs (36 in COL9A1 and 7 in SOX9) were selected for genotyping to evaluate the association of the SOX9 gene with PMOP and BMD. Association and bioinformatics analyses were performed for PMOP. BMD and serum level of SOX9 were also utilized as quantitative phenotypes in further analyses. SNP rs73354570 of SOX9 was significantly associated with PMOP in both the discovery stage (OR 1.24 [1.10-1.39], P = 3.56 × 10⁻⁴, χ² = 12.75) and the combined samples (OR 1.25 [1.15-1.37], P = 5.25 × 10⁻⁷, χ² = 25.17). Further analyses showed that the SNP was also significantly associated with BMD and serum levels of the SOX9 protein. Our results provide further supportive evidence for the association of the SOX9 gene with PMOP and of the SOX9 gene with the variation of BMD in postmenopausal Han Chinese women. This study supports a role for SOX9 in the etiology of PMOP, adding to the current understanding of the susceptibility of osteoporosis.
Affiliation(s)
- Hongliang Liu
- Department of Orthopedic, The First Affiliated Hospital of Xi'an Jiaotong University, No.277, Yanta West Road, Xi'an, 710061, Shaanxi, China
- Department of Trauma Orthopedics, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China
| | - Hongmou Zhao
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China
| | - Hua Lin
- Department of Trauma Orthopedics, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China
| | - Zhong Li
- Department of Trauma Orthopedics, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China
| | - Hanzhong Xue
- Department of Trauma Orthopedics, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China
| | - Yunzhi Zhang
- Department of Orthopaedic Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, No.157, Xiwu Road, Xi'an, 710004, Shaanxi, China
| | - Jun Lu
- Department of Internal Medicine, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China.
|
30
|
Bezzecchi E, Ronzio M, Semeghini V, Andrioletti V, Mantovani R, Dolfini D. NF-YA Overexpression in Lung Cancer: LUAD. Genes (Basel) 2020; 11:genes11020198. [PMID: 32075093 PMCID: PMC7074112 DOI: 10.3390/genes11020198] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/04/2020] [Accepted: 02/10/2020] [Indexed: 12/14/2022] Open
Abstract
The trimeric transcription factor (TF) NF-Y regulates the CCAAT box, a DNA element enriched in promoters of genes overexpressed in many types of cancer. The regulatory NF-YA is present in two major isoforms, NF-YAl ("long") and NF-YAs ("short"). There is growing indication that NF-YA levels are increased in tumors. Here, we report interrogation of RNA-Seq TCGA (The Cancer Genome Atlas; all 576 samples) and GEO (Gene Expression Omnibus) datasets of lung adenocarcinoma (LUAD). NF-YAs is overexpressed in the three subtypes, proliferative, inflammatory, and TRU (terminal respiratory unit). CCAAT is enriched in promoters of tumor differently expressed genes (DEG) and in the proliferative/inflammatory intersection, matching with KEGG (Kyoto Encyclopedia of Genes and Genomes) terms cell-cycle and signaling. Increasing levels of NF-YAs are observed from low to high CpG island methylator phenotypes (CIMP). We identified 166 genes overexpressed in LUAD cell lines with low NF-YAs/NF-YAl ratios: applying this centroid to TCGA samples faithfully predicted tumors' isoform ratio. This signature lacks CCAAT in promoters. Finally, progression-free intervals and hazard ratios concurred with the worst prognosis of patients with either a low or high NF-YAs/NF-YAl ratio. In conclusion, global overexpression of NF-YAs is documented in LUAD and is associated with aggressive tumor behavior; however, a similar prognosis is recorded in tumors with high levels of NF-YAl and overexpressed CCAAT-less genes.
Affiliation(s)
- Eugenia Bezzecchi
- Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy
| | - Mirko Ronzio
- Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy
| | - Valentina Semeghini
- Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy
| | - Valentina Andrioletti
- Internal Medicine VIII, University Hospital Tübingen. Otfried-Müller-Str. 14, 72076 Tübingen, Germany
| | - Roberto Mantovani
- Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy
| | - Diletta Dolfini
- Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy
- Correspondence: ; Tel.: +39-02-50315005
|
31
|
Li J, Yin Y, Zhang M, Cui J, Zhang Z, Zhang Z, Sun D. GsmPlot: a web server to visualize epigenome data in NCBI. BMC Bioinformatics 2020; 21:55. [PMID: 32050905 PMCID: PMC7017537 DOI: 10.1186/s12859-020-3386-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/25/2019] [Accepted: 01/24/2020] [Indexed: 12/31/2022] Open
Abstract
Background: Epigenetic regulation is essential in regulating gene expression across a variety of biological processes. Many high-throughput sequencing technologies have been widely used to generate epigenetic data, such as histone modification, transcription factor binding sites, DNA modifications, chromatin accessibility, etc. A large amount of epigenetic data is stored in NCBI Gene Expression Omnibus (GEO). However, it is a great challenge to reanalyze these large-scale and complex data, especially for researchers who do not specialize in bioinformatics or do not have access to expensive computational infrastructure. Results: GsmPlot can simply accept GSM IDs to automatically download NCBI data or can accept a user's private bigwig files as input to plot the data of interest on promoters, exons or any other user-defined genome locations and generate UCSC visualization tracks. By linking public data repositories and private data, GsmPlot can spark data-driven ideas and hence promote epigenetic research. Conclusions: The GsmPlot web server allows convenient visualization and efficient exploration of any NCBI epigenetic data in any genomic region without the need for any bioinformatics skills or special computing resources. GsmPlot is freely available at https://gsmplot.deqiangsun.org/.
Affiliation(s)
- Jia Li
- Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, Texas A&M University College of Medicine, Houston, TX, 77030, USA
| | - Yue Yin
- Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, Texas A&M University College of Medicine, Houston, TX, 77030, USA
| | - Mutian Zhang
- Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, Texas A&M University College of Medicine, Houston, TX, 77030, USA
| | - Jie Cui
- Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
| | - Zhenhai Zhang
- Center for Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
| | - Zhiyong Zhang
- The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China.
| | - Deqiang Sun
- Center for Epigenetics & Disease Prevention, Institute of Biosciences and Technology, Texas A&M University College of Medicine, Houston, TX, 77030, USA.
|
32
|
Wreczycka K, Franke V, Uyar B, Wurmus R, Bulut S, Tursun B, Akalin A. HOT or not: examining the basis of high-occupancy target regions. Nucleic Acids Res 2019; 47:5735-5745. [PMID: 31114922 PMCID: PMC6582337 DOI: 10.1093/nar/gkz460] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 11/12/2018] [Revised: 05/02/2019] [Accepted: 05/13/2019] [Indexed: 01/16/2023] Open
Abstract
High-occupancy target (HOT) regions are segments of the genome with an unusually high number of transcription factor binding sites. These regions are observed in multiple species and thought to have biological importance due to high transcription factor occupancy. Furthermore, they coincide with house-keeping gene promoters and consequently associated genes are stably expressed across multiple cell types. Despite these features, HOT regions are solely defined using ChIP-seq experiments and shown to lack canonical motifs for transcription factors that are thought to be bound there. Although ChIP-seq experiments are the gold standard for finding genome-wide binding sites of a protein, they are not noise free. Here, we show that HOT regions are likely to be ChIP-seq artifacts and they are similar to previously proposed ‘hyper-ChIPable’ regions. Using ChIP-seq data sets for knocked-out transcription factors, we demonstrate the presence of false positive signals on HOT regions. We observe sequence characteristics and genomic features that are discriminatory of HOT regions, such as GC/CpG-rich k-mers, enrichment of RNA–DNA hybrids (R-loops) and DNA tertiary structures (G-quadruplex DNA). The artificial ChIP-seq enrichment on HOT regions could be associated with these discriminatory features. Furthermore, we propose strategies to deal with such artifacts in future ChIP-seq studies.
Affiliation(s)
- Katarzyna Wreczycka
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Vedran Franke
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Bora Uyar
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Ricardo Wurmus
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Selman Bulut
- Gene Regulation and Cell Fate Decision in C. elegans, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Baris Tursun
- Gene Regulation and Cell Fate Decision in C. elegans, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
| | - Altuna Akalin
- Bioinformatics and Omics Data Science Platform, Berlin Institute for Medical Systems Biology, Max-Delbrück Center for Molecular Medicine, Berlin, Germany
|
33
|
Kang X, Tian B, Zhang L, Ge Z, Zhao Y, Zhang Y. Relationship of common variants in MPP7, TIMP2 and CASP8 genes with the risk of chronic achilles tendinopathy. Sci Rep 2019; 9:17627. [PMID: 31772230 PMCID: PMC6879592 DOI: 10.1038/s41598-019-54097-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/09/2019] [Accepted: 11/08/2019] [Indexed: 11/08/2022] Open
Abstract
Previous etiologic studies have indicated that both environmental and genetic factors play important roles in the occurrence and development of chronic Achilles tendinopathy (AT). A recent study documented the results of the largest genome-wide association study for chronic AT to date, indicating that MPP7, TIMP2 and CASP8 may be involved in the occurrence and development of chronic AT. In this study, we aimed to investigate whether MPP7, TIMP2 and CASP8 were associated with susceptibility to chronic AT in a Han Chinese population. A total of 3,680 study subjects, comprising 1,288 chronic AT cases and 2,392 healthy controls, were recruited. Forty-four tag SNPs (7 from CASP8, 20 from MPP7, and 17 from TIMP2) were genotyped in the study. Genetic association analyses were performed at both single marker and haplotype levels. Functional consequences of significant SNPs were examined in the RegulomeDB and GTEx databases. Two SNPs, rs1937810 (OR [95%CI] = 1.20 [1.09-1.32], χ² = 13.50, P = 0.0002) in MPP7 and rs4789932 (OR [95%CI] = 1.24 [1.12-1.37], χ² = 17.98, P = 2.23 × 10⁻⁵) in TIMP2, were significantly associated with chronic AT. Significant eQTL signals for SNP rs4789932 on TIMP2 were identified in human heart and artery tissues. Our results provide further supportive evidence for the association of the TIMP2 and MPP7 genes with chronic AT, which supports important roles for TIMP2 and MPP7 in the etiology of chronic AT, adding to the current understanding of the susceptibility of chronic AT.
Affiliation(s)
- Xin Kang
- Department of Orthopedics, the First Affiliated Hospital of Xi'an Jiao Tong University, Xi'an, Shaanxi, China
- Department of Sports Medicine, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Bin Tian
- Department of Sports Medicine, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Liang Zhang
- Department of Sports Medicine, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Zhaogang Ge
- Department of Sports Medicine, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Yang Zhao
- Department of Sports Medicine, Honghui Hospital, Xi'an Jiaotong University, Xi'an, Shaanxi, China
| | - Yingang Zhang
- Department of Orthopedics, the First Affiliated Hospital of Xi'an Jiao Tong University, Xi'an, Shaanxi, China.
|
34
|
Mo D, Li J, Peng L, Liu Z, Wang J, Yuan J. Genetic Polymorphisms on 4q21.1 Contributed to the Risk of Hashimoto's Thyroiditis. Genet Test Mol Biomarkers 2019; 23:837-842. [PMID: 31750736 DOI: 10.1089/gtmb.2019.0125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Background: Hashimoto's thyroiditis (HT) is a common autoimmune disease characterized by lymphoid infiltration of the thyroid gland, including both T- and B-cells. Early studies have shown that HT is a complex disorder affected by both environmental and genetic factors. Recently, the single nucleotide polymorphism (SNP) rs2276886 associated with the CXCL9 gene was identified as associated with autoimmune thyroid disease susceptibility in Japanese populations. The aim of the present study was to validate this result for HT in a Chinese Han population. Methods: Study subjects, including 688 HT cases and 1456 healthy controls, were recruited, and 10 SNPs located within the CXCL9 gene were genotyped. Genetic association analyses were performed by fitting logistic models. Bioinformatics tools, including RegulomeDB and GTEx, were utilized to investigate the functional consequences of the SNPs found to be significantly associated with HT. Results: SNP rs2276886 was identified as significantly associated with the risk of HT (odds ratio [OR] = 1.25, p = 0.0006). No significant expression quantitative trait loci (eQTL) signals could be identified for CXCL9. Significant eQTL signals were found for other genes, including ART3, CXCL10, CXCL11, NAAA, PPEF2, and SCARB2. This SNP physically maps to the CXCL9 gene region; however, further bioinformatic analyses indicated that this SNP might be associated with the gene NAAA. Conclusions: The rs2276886 SNP was found to be significantly associated with HT susceptibility. However, our findings suggest that this SNP, which maps to the chromosomal region 4q21.1, likely affects the NAAA gene (as opposed to the CXCL9 gene), but still contributes to the susceptibility to HT in Han Chinese populations.
Affiliation(s)
- Dachao Mo
- Department of General Surgery, Dongguan Tungwah Hospital, Dongguan, China
- Junjiu Li
- Department of General Surgery, Dongguan Tungwah Hospital, Dongguan, China
- Liang Peng
- Department of General Surgery, Dongguan Tungwah Hospital, Dongguan, China
- Zhiyuan Liu
- Department of General Surgery, Dongguan Tungwah Hospital, Dongguan, China
- Jieyun Wang
- Department of General Surgery, Dongguan Tungwah Hospital, Dongguan, China
- Jiru Yuan
- Department of General Surgery, Dongguan Tungwah Hospital, Dongguan, China
35
Bezzecchi E, Ronzio M, Dolfini D, Mantovani R. NF-YA Overexpression in Lung Cancer: LUSC. Genes (Basel) 2019; 10:genes10110937. [PMID: 31744190 PMCID: PMC6895822 DOI: 10.3390/genes10110937] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 09/12/2019] [Revised: 11/04/2019] [Accepted: 11/13/2019] [Indexed: 12/12/2022]
Abstract
The CCAAT box is recognized by the trimeric transcription factor NF-Y, whose NF-YA subunit is present in two major splicing isoforms, NF-YAl ("long") and NF-YAs ("short"). Little is known about the expression levels of NF-Y subunits in tumors, and nothing in lung cancer. By interrogating RNA-seq TCGA and GEO datasets, we found that, unlike NF-YB/NF-YC, NF-YAs is overexpressed in lung squamous cell carcinomas (LUSC). The ratio of the two isoforms changes from normal to cancer cells, with NF-YAs becoming predominant in the latter. Increased NF-YA expression correlates with common proliferation markers. We partitioned all 501 TCGA LUSC tumors into the four molecular cohorts and verified that NF-YAs is similarly overexpressed. We analyzed global and subtype-specific RNA-seq data and found that CCAAT is the most abundant DNA matrix in promoters of genes overexpressed in all subtypes. Enriched Gene Ontology terms are cell-cycle and signaling. Survival curves indicate a worse clinical outcome for patients with increasing global amounts of NF-YA; same with hazard ratios with very high and, surprisingly, very low NF-YAs/NF-YAl ratios. We then analyzed gene expression in this latter cohort and identified a different, pro-migration signature devoid of CCAAT. We conclude that overexpression of the NF-Y regulatory subunit in LUSC serves to increase expression of CCAAT-dependent, proliferative (NF-YAs-high) or CCAAT-less, pro-migration (NF-YAl-high) genes. The data further reinstate the importance of analysis of single isoforms of TFs involved in tumor development.
36
Gheorghe M, Sandve GK, Khan A, Chèneby J, Ballester B, Mathelier A. A map of direct TF-DNA interactions in the human genome. Nucleic Acids Res 2019; 47:e21. [PMID: 30517703 PMCID: PMC6393237 DOI: 10.1093/nar/gky1210] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 08/18/2018] [Revised: 10/31/2018] [Accepted: 11/20/2018] [Indexed: 12/11/2022]
Abstract
Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the most popular assay to identify genomic regions, called ChIP-seq peaks, that are bound in vivo by transcription factors (TFs). These regions are derived from direct TF-DNA interactions, indirect binding of the TF to the DNA (through a co-binding partner), nonspecific binding to the DNA, and noise/bias/artifacts. Delineating the bona fide direct TF-DNA interactions within the ChIP-seq peaks remains challenging. We developed a dedicated software, ChIP-eat, that combines computational TF binding models and ChIP-seq peaks to automatically predict direct TF-DNA interactions. Our work culminated with predicted interactions covering >4% of the human genome, obtained by uniformly processing 1983 ChIP-seq peak data sets from the ReMap database for 232 unique TFs. The predictions were a posteriori assessed using protein binding microarray and ChIP-exo data, and were predominantly found in high quality ChIP-seq peaks. The set of predicted direct TF-DNA interactions suggested that high-occupancy target regions are likely not derived from direct binding of the TFs to the DNA. Our predictions derived co-binding TFs supported by protein-protein interaction data and defined cis-regulatory modules enriched for disease- and trait-associated SNPs. We provide this collection of direct TF-DNA interactions and cis-regulatory modules through the UniBind web-interface (http://unibind.uio.no).
Affiliation(s)
- Marius Gheorghe
- Centre for Molecular Medicine Norway (NCMM), University of Oslo, Oslo, Norway
- Aziz Khan
- Centre for Molecular Medicine Norway (NCMM), University of Oslo, Oslo, Norway
- Jeanne Chèneby
- Aix Marseille Université, INSERM, TAGC, Marseille, France
- Anthony Mathelier
- Centre for Molecular Medicine Norway (NCMM), University of Oslo, Oslo, Norway; Department of Cancer Genetics, Institute for Cancer Research, Radiumhospitalet, Oslo, Norway
37
Diehl AG, Boyle AP. CGIMP: Real-time exploration and covariate projection for self-organizing map datasets. Journal of Open Source Software 2019; 4:1520. [PMID: 32500114 PMCID: PMC7272009 DOI: 10.21105/joss.01520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Affiliation(s)
- Adam G Diehl
- Department of Computational Medicine and Bioinformatics, University of Michigan
- Alan P Boyle
- Department of Computational Medicine and Bioinformatics, University of Michigan
- Department of Human Genetics, University of Michigan
38
Xiao R, Chen JY, Liang Z, Luo D, Chen G, Lu ZJ, Chen Y, Zhou B, Li H, Du X, Yang Y, San M, Wei X, Liu W, Lécuyer E, Graveley BR, Yeo GW, Burge CB, Zhang MQ, Zhou Y, Fu XD. Pervasive Chromatin-RNA Binding Protein Interactions Enable RNA-Based Regulation of Transcription. Cell 2019; 178:107-121.e18. [PMID: 31251911 PMCID: PMC6760001 DOI: 10.1016/j.cell.2019.06.001] [Citation(s) in RCA: 197] [Impact Index Per Article: 39.4] [Received: 08/01/2018] [Revised: 03/21/2019] [Accepted: 05/31/2019] [Indexed: 01/03/2023]
Abstract
Increasing evidence suggests that transcriptional control and chromatin activities at large involve regulatory RNAs, which likely enlist specific RNA-binding proteins (RBPs). Although multiple RBPs have been implicated in transcription control, it has remained unclear how extensively RBPs directly act on chromatin. We embarked on a large-scale RBP ChIP-seq analysis, revealing widespread RBP presence in active chromatin regions in the human genome. Like transcription factors (TFs), RBPs also show strong preference for hotspots in the genome, particularly gene promoters, where their association is frequently linked to transcriptional output. Unsupervised clustering reveals extensive co-association between TFs and RBPs, as exemplified by YY1, a known RNA-dependent TF, and RBM25, an RBP involved in splicing regulation. Remarkably, RBM25 depletion attenuates all YY1-dependent activities, including chromatin binding, DNA looping, and transcription. We propose that various RBPs may enhance network interaction through harnessing regulatory RNAs to control transcription.
Affiliation(s)
- Rui Xiao
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; Medical Research Institute, Wuhan University, Wuhan, Hubei 430071, China
- Jia-Yu Chen
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Zhengyu Liang
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
- Daji Luo
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA; School of Basic Medical Sciences, Wuhan University, Wuhan, Hubei 430071, China
- Geng Chen
- College of Life Sciences and Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Zhi John Lu
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
- Yang Chen
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
- Bing Zhou
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Hairi Li
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Xian Du
- College of Life Sciences and Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Yang Yang
- College of Life Sciences and Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Mingkui San
- Medical Research Institute, Wuhan University, Wuhan, Hubei 430071, China
- Xintao Wei
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health Science Center, Farmington, CT 06030, USA
- Wen Liu
- School of Pharmaceutical Sciences, Xiamen University, Xiamen, Fujian 361102, China
- Eric Lécuyer
- Institut de Recherches Cliniques de Montréal, Département de Biochimie and Médecine Moléculaire, Université de Montréal, Montréal, QC H2W 1R7, Canada
- Brenton R Graveley
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health Science Center, Farmington, CT 06030, USA
- Gene W Yeo
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Christopher B Burge
- Program in Computational and Systems Biology, Department of Biology, MIT, Cambridge, MA 02139, USA
- Michael Q Zhang
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China; Department of Biological Sciences, Center for Systems Biology, University of Texas, Dallas, TX 75080, USA
- Yu Zhou
- College of Life Sciences and Institute for Advanced Studies, Wuhan University, Wuhan, Hubei 430072, China
- Xiang-Dong Fu
- Department of Cellular and Molecular Medicine, Institute of Genomic Medicine, University of California, San Diego, La Jolla, CA 92093, USA
39
Nie G, Wen X, Liang X, Zhao H, Li Y, Lu J. Additional evidence supports association of common genetic variants in MMP3 and TIMP2 with increased risk of chronic Achilles tendinopathy susceptibility. J Sci Med Sport 2019; 22:1074-1078. [PMID: 31208828 DOI: 10.1016/j.jsams.2019.05.021] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/27/2018] [Revised: 04/13/2019] [Accepted: 05/30/2019] [Indexed: 12/25/2022]
Abstract
OBJECTIVES: To systematically evaluate the effects of matrix metalloproteinase-3 (MMP3) and tissue inhibitor of metalloproteinase-2 (TIMP2) on chronic Achilles tendinopathy (AT) susceptibility. Chronic AT is one of the most prevalent and severe injuries in athletes. Early studies suggested that tendon extracellular matrix (ECM) may be involved in the pathogenesis of chronic AT. MMP3 is an important member of the MMP family and is important to ECM integrity. In addition, TIMP2 can indirectly limit MMP3 activity. DESIGN: Case-control genetic association study. METHODS: A total of 1084 chronic AT patients and 2188 controls with Chinese Han ancestry were recruited. Twenty-one SNPs, 4 mapped to MMP3 and 17 mapped to TIMP2, were selected and genotyped. Genetic association analyses and eQTL analyses were performed. We also examined the potential effects of epistasis using a case-only study design. RESULTS: Two SNPs, rs679620 (OR=0.82, P=0.0006, MMP3) and rs4789932 (OR=1.2, P=0.0002, TIMP2), were identified to be significantly associated with chronic AT risk. No significant results were obtained from epistasis analyses. SNP rs4789932 was identified to be strongly associated with the gene expression level of TIMP2 in two types of human tissues: atrial appendage (P=0.0003) and tibial artery (P=0.0009). CONCLUSIONS: We have identified genetic polymorphisms in MMP3 and TIMP2 to be significantly associated with chronic AT risk. Further eQTL analyses indicated that SNP rs4789932 of TIMP2 was related to the gene expression levels of TIMP2. These results suggest important roles for MMP3 and TIMP2 in the pathophysiology of chronic AT.
Affiliation(s)
- Guanghua Nie
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, China
- Xiaodong Wen
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, China
- Xiaojun Liang
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, China
- Hongmou Zhao
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, China
- Yi Li
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, China
- Jun Lu
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, China
40
Yang J, Wang J, Liang X, Zhao H, Lu J, Ma Q, Tian F. Relationship Between Genetic Polymorphisms of the TNF Gene and Hallux Valgus Susceptibility. Genet Test Mol Biomarkers 2019; 23:380-386. [PMID: 31063409 DOI: 10.1089/gtmb.2018.0269] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Abstract
Background: Hallux valgus (HV) is a type of forefoot deformity affecting ∼23% of adults. Previous studies have shown that HV is highly heritable. Tumor necrosis factor (TNF) is an important proinflammatory cytokine involved in bone remodeling and plays essential roles in osteoarthritis and chronic inflammatory bone diseases, including HV. Methods: A total of 1,788 Chinese women comprising 637 HV subjects and 1,151 controls were recruited. Twelve single nucleotide polymorphisms (SNPs) located in TNF and its promoter regions were selected and genotyped. Genetic association analyses were performed to investigate potential susceptibility SNPs. Bioinformatic and expression quantitative trait loci (eQTL) analyses were conducted to examine the functional consequences of the SNPs identified as being significantly associated with HV. Results: SNP rs1800629, which is located at the 5' end of the promoter region of TNF, was identified as significantly associated with HV status in Chinese women (OR = 0.56, p = 2.12 × 10^-6). Bioinformatic analyses using RegulomeDB indicated that this SNP has important functional significance, but subsequent eQTL analyses did not identify a significant association between rs1800629 and TNF gene expression. In addition, 26 genes with cis-eQTL for rs1800629 were identified. Conclusions: This study identified a susceptibility SNP for HV located within the promoter region of the TNF gene. Bioinformatic and eQTL analyses linked this SNP to 26 genes but not to TNF. Functional studies are needed to more fully characterize the effects of this SNP.
Affiliation(s)
- Jie Yang
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Junhu Wang
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Xiaojun Liang
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Hongmou Zhao
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Jun Lu
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Qiang Ma
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Feng Tian
- Department of Foot and Ankle Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
41
Jiang LG, Li B, Liu SX, Wang HW, Li CP, Song SH, Beatty M, Zastrow-Hayes G, Yang XH, Qin F, He Y. Characterization of Proteome Variation During Modern Maize Breeding. Mol Cell Proteomics 2019; 18:263-276. [PMID: 30409858 PMCID: PMC6356080 DOI: 10.1074/mcp.ra118.001021] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 08/07/2018] [Revised: 11/06/2018] [Indexed: 12/21/2022]
Abstract
The success of modern maize breeding has been demonstrated by remarkable increases in productivity with tremendous modification of agricultural phenotypes over the last century. Although the underlying genetic changes of the maize adaptation from tropical to temperate regions have been extensively studied, our knowledge is limited regarding the accordance of protein and mRNA expression levels accompanying such adaptation. Here we conducted an integrative analysis of proteomic and transcriptomic changes in a maize association panel. The minimum extent of correlation between protein and RNA levels suggests that variation in mRNA expression is often not indicative of protein expression at a population scale. This is corroborated by the observation that mRNA- and protein-based coexpression networks are relatively independent of each other, and many pQTLs arise without the presence of corresponding eQTLs. Importantly, compared with transcriptome, the subtypes categorized by the proteome show a markedly high accuracy to resemble the genomic subpopulation. These findings suggest that proteome evolved under a greater evolutionary constraint than transcriptome during maize adaptation from tropical to temperate regions. Overall, the integrated multi-omics analysis provides a functional context to interpret gene expression variation during modern maize breeding.
Affiliation(s)
- Lu-Guang Jiang
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center of China, China Agricultural University, Beijing 100094, China
- Bo Li
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center of China, China Agricultural University, Beijing 100094, China
- Sheng-Xue Liu
- College of Biological Sciences, China Agricultural University, Beijing 100094, China
- Hong-Wei Wang
- Agricultural College, Hubei Collaborative Innovation Center for Grain Industry, Yangtze University, Hubei 434000, China
- Cui-Ping Li
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Shu-Hui Song
- BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Xiao-Hong Yang
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center of China, China Agricultural University, Beijing 100094, China
- Feng Qin
- College of Biological Sciences, China Agricultural University, Beijing 100094, China
- Yan He
- MOE Key Laboratory of Crop Heterosis and Utilization, National Maize Improvement Center of China, China Agricultural University, Beijing 100094, China
42
Keilwagen J, Posch S, Grau J. Accurate prediction of cell type-specific transcription factor binding. Genome Biol 2019; 20:9. [PMID: 30630522 PMCID: PMC6327544 DOI: 10.1186/s13059-018-1614-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 07/25/2018] [Accepted: 12/18/2018] [Indexed: 01/11/2023]
Abstract
Prediction of cell type-specific, in vivo transcription factor binding sites is one of the central challenges in regulatory genomics. Here, we present our approach that earned a shared first rank in the "ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge" in 2017. In post-challenge analyses, we benchmark the influence of different feature sets and find that chromatin accessibility and binding motifs are sufficient to yield state-of-the-art performance. Finally, we provide 682 lists of predicted peaks for a total of 31 transcription factors in 22 primary cell types and tissues and a user-friendly version of our approach, Catchitt, for download.
Affiliation(s)
- Jens Keilwagen
- Institute for Biosafety in Plant Biotechnology, Julius Kühn-Institut (JKI) - Federal Research Centre for Cultivated Plants, Erwin-Baur-Straße 27, Quedlinburg, 06484 Germany
- Stefan Posch
- Institute of Computer Science, Martin Luther University Halle–Wittenberg, Von-Seckendorff-Platz 1, Halle (Saale), 06120 Germany
- Jan Grau
- Institute of Computer Science, Martin Luther University Halle–Wittenberg, Von-Seckendorff-Platz 1, Halle (Saale), 06120 Germany
43
Zhang L, Xue G, Liu J, Li Q, Wang Y. Revealing transcription factor and histone modification co-localization and dynamics across cell lines by integrating ChIP-seq and RNA-seq data. BMC Genomics 2018; 19:914. [PMID: 30598100 PMCID: PMC6311957 DOI: 10.1186/s12864-018-5278-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 02/06/2023]
Abstract
Background: Interactions among transcription factors (TFs) and histone modifications (HMs) play an important role in the precise regulation of gene expression. The context specificity of those interactions and their dynamics in normal and disease states remain largely unknown. Recent developments in genomics technology enable transcription profiling by RNA-seq and protein-binding profiling by ChIP-seq. Integrative analysis of the two types of data allows us to investigate TF and HM interactions from both genome co-localization and downstream target gene expression. Results: We propose an integrative pipeline to explore the co-localization of 55 TFs and 11 HMs and its dynamics in human GM12878 and K562 by matched ChIP-seq and RNA-seq data from ENCODE. We classify TFs and HMs into three types based on their binding enrichment around the transcription start site (TSS). Then a set of statistical indexes is proposed to characterize the TF-TF and TF-HM co-localizations. We found that RAD21, SMC3, and CTCF co-localized across five cell lines. High-resolution Hi-C data in GM12878 show that they associate most of the Hi-C peak loci with a specific CTCF-motif "anchor" and support that CTCF, SMC3, and RAD21 co-localization plays an important role in 3D chromatin structure. Meanwhile, 17 TF-TF pairs are highly dynamic between GM12878 and K562. We then build SVM models to correlate high and low expression levels of target genes with TF binding and HM strength. We found that H3K9ac, H3K27ac, and three TFs (ELF1, TAF1, and POL2) are predictive, with an accuracy of about 85-92%. Conclusion: We propose a pipeline to analyze the co-localization of TFs and HMs and their dynamics across cell lines from ChIP-seq, and investigate their regulatory potency by RNA-seq. The integrative analysis of the two levels of data reveals new insight into the cooperation of TFs and HMs and is helpful in understanding cell line specificity of TF/HM interactions.
Electronic supplementary material The online version of this article (10.1186/s12864-018-5278-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lirong Zhang
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Gaogao Xue
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Junjie Liu
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Qianzhong Li
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Yong Wang
- CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China
44
Völkel S, Stielow B, Finkernagel F, Berger D, Stiewe T, Nist A, Suske G. Transcription factor Sp2 potentiates binding of the TALE homeoproteins Pbx1:Prep1 and the histone-fold domain protein Nf-y to composite genomic sites. J Biol Chem 2018; 293:19250-19262. [PMID: 30337366 DOI: 10.1074/jbc.ra118.005341] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/13/2018] [Revised: 10/17/2018] [Indexed: 11/06/2022]
Abstract
Different transcription factors operate together at promoters and enhancers to regulate gene expression. Transcription factors either bind directly to their target DNA or are tethered to it by other proteins. The transcription factor Sp2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. Hence, Sp2 is strikingly different from its closely related paralogs Sp1 and Sp3, but how Sp2 recognizes its targets is unknown. Here, we sought to gain more detailed insights into the genomic targeting mechanism of Sp2. ChIP-exo sequencing in mouse embryonic fibroblasts revealed genomic binding of Sp2 to a composite motif where a recognition sequence for TALE homeoproteins and a recognition sequence for the trimeric histone-fold domain protein nuclear transcription factor Y (Nf-y) are separated by 11 bp. We identified a complex consisting of the TALE homeobox protein Prep1, its partner PBX homeobox 1 (Pbx1), and Nf-y as the major partners in Sp2-promoter interactions. We found that the Pbx1:Prep1 complex together with Nf-y recruits Sp2 to co-occupied regulatory elements. In turn, Sp2 potentiates binding of Pbx1:Prep1 and Nf-y. We also found that the Sp-box, a short sequence motif close to the Sp2 N terminus, is crucial for Sp2's cofactor function. Our findings reveal a mechanism by which the DNA binding-independent activity of Sp2 potentiates genomic loading of Pbx1:Prep1 and Nf-y to composite motifs present in many promoters of highly expressed genes.
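The composite motif described above is two recognition sequences separated by a fixed 11 bp spacer. A minimal sketch of scanning a sequence for such a spacing-constrained pair follows. The TALE-like string `TGATTGAT` is a placeholder chosen for illustration and is not taken from the paper, `CCAAT` stands in for the Nf-y site, and the function name is hypothetical. Only exact matches on the forward strand are considered; a realistic scan would use position weight matrices and both strands.

```python
import re

def find_composite_sites(seq, left_site, right_site, spacer_len):
    """Return 0-based start positions where left_site is followed by
    exactly spacer_len arbitrary bases and then right_site."""
    # A zero-width lookahead lets overlapping occurrences be reported too.
    pattern = re.compile(
        f"(?=({re.escape(left_site)}[ACGT]{{{spacer_len}}}{re.escape(right_site)}))"
    )
    return [match.start() for match in pattern.finditer(seq.upper())]

# Placeholder motif strings (assumptions, see the note above).
TALE_LIKE = "TGATTGAT"
NFY_SITE = "CCAAT"

# Toy sequence: 2 flanking bases, left site, 11 bp spacer, right site, 2 flanking bases.
toy_sequence = "GG" + TALE_LIKE + "ACGTACGTACG" + NFY_SITE + "GG"
print(find_composite_sites(toy_sequence, TALE_LIKE, NFY_SITE, 11))
```

Changing `spacer_len` to any other value returns no hit for the toy sequence, which is the sense in which the spacing acts as a constraint.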
Affiliation(s)
- Sara Völkel
- From the Institute of Molecular Biology and Tumor Research (IMT) and
- Bastian Stielow
- From the Institute of Molecular Biology and Tumor Research (IMT) and
- Dana Berger
- From the Institute of Molecular Biology and Tumor Research (IMT) and
- Thorsten Stiewe
- the Genomics Core Facility, Center for Tumor Biology and Immunology (ZTI), Philipps-University of Marburg, 35043 Marburg, Germany
- Andrea Nist
- the Genomics Core Facility, Center for Tumor Biology and Immunology (ZTI), Philipps-University of Marburg, 35043 Marburg, Germany
- Guntram Suske
- From the Institute of Molecular Biology and Tumor Research (IMT) and
45
Zhang T, Zhu L, Ni T, Liu D, Chen G, Yan Z, Lin H, Guan F, Rice JP. Voltage-gated calcium channel activity and complex related genes and schizophrenia: A systematic investigation based on Han Chinese population. J Psychiatr Res 2018; 106:99-105. [PMID: 30308413 DOI: 10.1016/j.jpsychires.2018.09.020] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 06/19/2018] [Revised: 09/27/2018] [Accepted: 09/28/2018] [Indexed: 12/19/2022]
Abstract
Schizophrenia (SCZ) is a devastating mental disorder affecting approximately 1% of the worldwide population. Early studies have indicated that genetics plays an important role in the onset and development of SCZ. Accumulating evidence supports that SCZ is linked to abnormalities of synapse transmission and synaptic plasticity. Voltage-gated calcium channel (VGCC) subunits are critical for mediating intracellular Ca2+ influx and therefore are responsible for changing neuronal excitability and synaptic plasticity. To systematically investigate the role of calcium signaling genes in SCZ susceptibility, we conducted a case-control study that included 2518 SCZ patients and 7521 healthy controls with Chinese Han ancestry. Thirty-seven VGCC genes, including 363 tag single nucleotide polymorphisms (SNPs), were examined. Our study replicated the following previously identified susceptible loci: CACNA1C, CACNB2, OPRM1, GRM7 and PDE4B. In addition, several novel loci including CACNA2D1, PDE4D, NALCN, and CACNA2D3 were also identified to be associated with SCZ in our Han Chinese sample. Combined with GTEx eQTL data, we have shown that CASQ2, ITGAV, and TMC2 can be also added into the prioritization list of SCZ susceptible genes. Two-way interaction analyses identified widespread gene-by-gene interactions among VGCC activity and complex-related genes for the susceptibility of SCZ. Further sequencing based studies are still needed to unravel potential contributions of schizophrenia risk from rare or low frequency variants of these candidate genes.
Collapse
Affiliation(s)
- Tianxiao Zhang
- Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China
| | - Li Zhu
- Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China
| | - Tong Ni
- Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China
| | - Dan Liu
- Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China
| | - Gang Chen
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China; Department of Forensic Pathology, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China
- Zhilan Yan
- Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China
- Huali Lin
- Xi'an Mental Health Center, 15 Yanyin Road, Xi'an, Shaanxi, 710086, China
- Fanglin Guan
- Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Xi'an Jiaotong University Health Science Center, 76 Yanta West Road, Xi'an, Shaanxi, 710061, China.
- John P Rice
- Department of Psychiatry, School of Medicine, Washington University in St. Louis, 63124, USA
|
46
|
Ng FSL, Ruau D, Wernisch L, Göttgens B. A graphical model approach visualizes regulatory relationships between genome-wide transcription factor binding profiles. Brief Bioinform 2018; 19:162-173. [PMID: 27780826 PMCID: PMC5496675 DOI: 10.1093/bib/bbw102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/08/2015] [Indexed: 11/16/2022]
Abstract
Integrated analysis of multiple genome-wide transcription factor (TF)-binding profiles will be vital to advance our understanding of the global impact of TF binding. However, existing methods for measuring similarity in large numbers of chromatin immunoprecipitation assays with sequencing (ChIP-seq), such as correlation, mutual information or enrichment analysis, are limited in their ability to display functionally relevant TF relationships. In this study, we propose the use of graphical models to determine conditional independence between TFs and show that network visualization provides a promising way to distinguish ‘direct’ from ‘indirect’ TF interactions. We applied four algorithms that measure ‘direct’ dependence to a compendium of 367 mouse haematopoietic TF ChIP-seq samples and obtained a consensus network, termed a ‘TF association network’, in which edges correspond to likely causal pairwise relationships between TFs. The ‘TF association network’ illustrates the role of TFs in developmental pathways, is reminiscent of combinatorial TF regulation, corresponds to known protein–protein interactions and indicates substantial TF-binding reorganization in leukemic cell types. With the rapid increase in TF ChIP-seq data sets, the approach presented here will be a powerful tool to study transcriptional programmes across a wide range of biological systems.
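The distinction between ‘direct’ and ‘indirect’ association drawn in this abstract can be illustrated with a first-order partial correlation. This is only a minimal sketch of the general idea of conditional independence, not one of the four algorithms used in the study (which the abstract does not name); the function name `partial_corr` and the example correlation values are hypothetical.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between X and Y after conditioning on a third variable Z.

    Uses the standard first-order partial correlation formula:
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical case 1: X and Y correlate only because both follow Z.
# The raw correlation is 0.64, but the partial correlation is close to 0,
# so a graphical model would draw no X-Y edge ('indirect' association).
indirect = partial_corr(r_xy=0.64, r_xz=0.8, r_yz=0.8)

# Hypothetical case 2: X and Y correlate strongly and Z explains little.
# The partial correlation stays high, so the X-Y edge is kept ('direct').
direct = partial_corr(r_xy=0.9, r_xz=0.1, r_yz=0.1)

print(round(indirect, 6), round(direct, 3))
```

In a plain correlation analysis both pairs would appear related; conditioning on the third variable is what separates the two cases.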
Affiliation(s)
- Felicia S L Ng
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute & Cambridge Institute for Medical Research, Hills Road, Cambridge, UK
- David Ruau
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute & Cambridge Institute for Medical Research, Hills Road, Cambridge, UK
- Lorenz Wernisch
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute & Cambridge Institute for Medical Research, Hills Road, Cambridge, UK
- Berthold Göttgens
- Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute & Cambridge Institute for Medical Research, Hills Road, Cambridge, UK
- Corresponding author: Berthold Gottgens, Department of Haematology, Wellcome Trust and MRC Cambridge Stem Cell Institute & Cambridge Institute for Medical Research, Hills Road, Cambridge CB2 0XY, UK. Tel: 01223-336829; Fax: 01223-762670; E-mail:
|
47
|
Specificity landscapes unmask submaximal binding site preferences of transcription factors. Proc Natl Acad Sci U S A 2018; 115:E10586-E10595. [PMID: 30341220 DOI: 10.1073/pnas.1811431115] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023]
Abstract
We have developed Differential Specificity and Energy Landscape (DiSEL) analysis to comprehensively compare DNA-protein interactomes (DPIs) obtained by high-throughput experimental platforms and cutting-edge computational methods. While high-affinity DNA binding sites are identified by most methods, DiSEL uncovered nuanced sequence preferences displayed by homologous transcription factors. Pairwise analysis of 726 DPIs uncovered homolog-specific differences at moderate- to low-affinity binding sites (submaximal sites). DiSEL analysis of variants of 41 transcription factors revealed that many disease-causing mutations result in allele-specific changes in binding site preferences. We focused on a set of highly homologous factors that have different biological roles but "read" DNA using identical amino acid side chains. Rather than direct readout, our results indicate that side chains that do not contact DNA allosterically help to sculpt distinct sequence preferences among closely related members of transcription factor families.
|
48
|
Recurrent Neural Network for Predicting Transcription Factor Binding Sites. Sci Rep 2018; 8:15270. [PMID: 30323198 PMCID: PMC6189047 DOI: 10.1038/s41598-018-33321-1] [Citation(s) in RCA: 99] [Impact Index Per Article: 16.5] [Received: 04/27/2018] [Accepted: 09/25/2018] [Indexed: 12/23/2022]
Abstract
It is well known that DNA sequences contain a number of transcription factor (TF) binding sites, only some of which have been identified through biological experiments. However, these experiments are expensive and time-consuming. To overcome these problems, computational methods based on k-mer features or convolutional neural networks have been proposed to identify TF binding sites from DNA sequences. Although these methods perform well, they still lack the context information that relates to TF binding sites. Research indicates that standard recurrent neural networks (RNNs) and their variants perform better on time-series data than other models. In this study, we propose a model, named KEGRU, to identify TF binding sites by combining a bidirectional gated recurrent unit (GRU) network with k-mer embedding. First, DNA sequences are divided into k-mer sequences with a specified length and stride window. Second, we treat each k-mer as a word and pre-train a word representation model with the word2vec algorithm. Third, we construct a deep bidirectional GRU model for feature learning and classification. Experimental results show that our method outperforms some state-of-the-art methods. Additional experiments on the embedding strategy show that k-mer embedding helps to enhance model performance. The robustness of KEGRU is demonstrated by experiments with different k-mer lengths, stride windows and embedding vector dimensions.
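The first step described above, dividing a DNA sequence into k-mers with a given length and stride, can be sketched as follows. This is an illustrative sketch only, not the KEGRU implementation; the function name `kmer_tokens` and the example sequence are hypothetical, and the handling of trailing bases (dropped when they do not fill a whole k-mer) is an assumption.

```python
def kmer_tokens(seq, k, stride):
    """Split a DNA sequence into k-mers of length k, advancing `stride` bases per step.

    Trailing bases that do not fill a complete k-mer are dropped (an assumption;
    the abstract does not say how sequence ends are handled).
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]

# Each k-mer then plays the role of a "word", so the token list can be fed to a
# word-embedding model in the same way as a tokenized sentence.
tokens = kmer_tokens("ACGTACGTAC", k=4, stride=2)
print(tokens)  # ['ACGT', 'GTAC', 'ACGT', 'GTAC']
```

A stride of 1 gives maximally overlapping k-mers, while a stride equal to k gives non-overlapping ones; the abstract reports testing several such settings.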
|
49
|
Szalaj P, Plewczynski D. Three-dimensional organization and dynamics of the genome. Cell Biol Toxicol 2018; 34:381-404. [PMID: 29568981 PMCID: PMC6133016 DOI: 10.1007/s10565-018-9428-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 11/29/2017] [Accepted: 03/11/2018] [Indexed: 12/30/2022]
Abstract
The genome is a complex hierarchical structure, and its spatial organization plays an important role in its function. Chromatin loops and topological domains form the basic structural units of this multiscale organization and are essential for orchestrating complex regulatory networks and transcription mechanisms. They also form higher-order structures such as chromosomal compartments and chromosome territories. Each level of this intrinsic architecture is governed by principles and mechanisms that we are only beginning to understand. In this review, we summarize the current view of genome architecture on scales ranging from chromatin loops to the whole genome. We describe cell-to-cell variability, links between genome reorganization and various genomic processes, such as chromosome X inactivation and cell differentiation, and the interplay between different experimental techniques.
Affiliation(s)
- Przemyslaw Szalaj
- Centre for Innovative Research, Medical University of Bialystok, Białystok, Poland.
- I-BioStat, Hasselt University, Hasselt, Belgium.
- Centre of New Technologies, University of Warsaw, Warsaw, Poland.
- Dariusz Plewczynski
- Centre for Innovative Research, Medical University of Bialystok, Białystok, Poland
- Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
|
50
|
Cai X, Yi X, Zhang Y, Zhang D, Zhi L, Liu H. Genetic susceptibility of postmenopausal osteoporosis on sulfide quinone reductase-like gene. Osteoporos Int 2018; 29:2041-2047. [PMID: 29855663 DOI: 10.1007/s00198-018-4575-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/05/2018] [Accepted: 05/14/2018] [Indexed: 10/14/2022]
Abstract
Postmenopausal osteoporosis is a major health problem with important genetic factors in postmenopausal women. We explored the relationship between SQRDL and osteoporosis in a cohort of 1006 patients and 2027 controls from Han Chinese postmenopausal women. Our evidence supported the significant role of SQRDL in the etiology of postmenopausal osteoporosis.
INTRODUCTION: Postmenopausal osteoporosis (PMOP) is a metabolic bone disease leading to progressive bone loss and deterioration of the bone microarchitecture. The sulfide-quinone reductase-like protein is an important enzyme regulating cellular hydrogen sulfide levels, and it can regulate the balance of bone metabolism in postmenopausal women. In this study, we aimed to investigate whether SQRDL is associated with susceptibility to PMOP in the Han Chinese population.
METHODS: A total of 3033 postmenopausal women, comprising 1006 cases and 2027 controls, were recruited in the study. Twenty-two SNPs were selected for genotyping to evaluate the association of the SQRDL gene with BMD and PMOP. Association analyses at both the single-marker and haplotype levels were performed for PMOP. Bone mineral density (BMD) was also used as a quantitative phenotype in further analyses. Bioinformatics tools were applied to predict the functional consequences of targeted polymorphisms in SQRDL.
RESULTS: The SNP rs1044032 (P = 6.42 × 10⁻⁵, OR = 0.80) was identified as significantly associated with PMOP. Three SNPs (rs1044032, rs2028589, and rs12913151) were found to be significantly associated with BMD. Although limited functional significance could be obtained for these polymorphisms, significant hits for association with PMOP were found. Moreover, further association analyses with BMD identified three SNPs with significantly independent effects.
CONCLUSIONS: Our evidence supports the significant role of SQRDL in the etiology of PMOP and suggests that it may be a genetic risk factor for BMD and osteoporosis in Han Chinese postmenopausal women.
Affiliation(s)
- X Cai
- Department of Orthopaedic Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, No.157, Xiwu Road, Xi'an, 710004, Shaanxi, China
- X Yi
- Department of Pediatrics, The Second Affiliated Hospital of Xi'an Jiaotong University, No.157, Xiwu Road, Xi'an, 710004, Shaanxi, China
- Y Zhang
- Department of Orthopaedic Surgery, The Second Affiliated Hospital of Xi'an Jiaotong University, No.157, Xiwu Road, Xi'an, 710004, Shaanxi, China
- D Zhang
- Department of Orthopedic, The First Affiliated Hospital of Xi'an Jiaotong University, No.277, Yanta West Road, Xi'an, 710061, Shaanxi, China
- L Zhi
- Department of Joint Surgery, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China
- H Liu
- Department of Trauma, Honghui Hospital, Xi'an Jiaotong University Health Science Center, No.555, Youyi East Road, Xi'an, 710054, Shaanxi, China.
|