1
Deyneko IV. BestCRM: An Exhaustive Search for Optimal Cis-Regulatory Modules in Promoters Accelerated by the Multidimensional Hash Function. Int J Mol Sci 2024; 25:1903. [PMID: 38339181 PMCID: PMC10856692 DOI: 10.3390/ijms25031903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 01/24/2024] [Accepted: 01/26/2024] [Indexed: 02/12/2024] Open
Abstract
The concept of cis-regulatory modules located in gene promoters represents today's vision of the organization of gene transcriptional regulation. Such modules are a combination of two or more single, short DNA motifs. The bioinformatic identification of such modules belongs to the so-called NP-hard problems with extreme computational complexity, and therefore simplifications, assumptions, and heuristics are usually deployed to tackle the problem. In practice, this first requires many parameters to be set before the search and, second, leads to the identification of only locally optimal results. Here, a novel method is presented, aimed at identifying the cis-regulatory elements in gene promoters based on an exhaustive search of all feasible module configurations. All required parameters are automatically estimated using positive and negative datasets. To be computationally efficient, the search is accelerated using a multidimensional hash function, allowing the search to complete in a few hours on a regular laptop (for example, an Intel i7 CPU, 3.2 GHz, 32 GB RAM). Tests on an established benchmark and real data show better performance of BestCRM compared to the available methods according to several metrics such as specificity, sensitivity, and AUC. A great practical advantage of the method is its minimal number of input parameters: apart from positive and negative promoters, only a desired level of module presence in promoters is required.
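The idea of an exhaustive, hash-accelerated module search can be illustrated with a small sketch. The abstract does not describe the actual multidimensional hash function, so the code below is only an assumption-laden toy: it uses a plain Python dictionary keyed by the tuple (upstream motif, downstream motif, gap), enumerates every motif pair within a hypothetical `max_gap`, and scores each configuration by the difference in presence between positive and negative promoters. The function names, the input layout, and the scoring rule are all illustrative and are not taken from BestCRM.

```python
from collections import defaultdict
from itertools import combinations


def module_support(promoters, max_gap=50):
    """Map each (motif_a, motif_b, gap) key to the set of promoter indices containing it.

    Each promoter is a list of (motif_name, position) hits (hypothetical input layout).
    """
    support = defaultdict(set)
    for idx, hits in enumerate(promoters):
        ordered = sorted(hits, key=lambda h: h[1])  # upstream hit comes first
        for (m1, p1), (m2, p2) in combinations(ordered, 2):
            gap = p2 - p1
            if m1 != m2 and 0 < gap <= max_gap:
                support[(m1, m2, gap)].add(idx)  # tuple key acts as the hash lookup
    return support


def best_module(positive, negative, max_gap=50):
    """Return the configuration with the largest presence difference (toy score)."""
    pos = module_support(positive, max_gap)
    neg = module_support(negative, max_gap)

    def score(key):
        return len(pos[key]) / len(positive) - len(neg.get(key, ())) / len(negative)

    return max(pos, key=score)


# Invented toy data, not from the paper.
positive = [[("A", 10), ("B", 25)], [("A", 40), ("B", 55)], [("C", 5), ("A", 30)]]
negative = [[("C", 5), ("A", 30)], [("B", 10), ("C", 90)]]
print(best_module(positive, negative))
```

Because every configuration is reduced to a hashable key, checking whether a module occurs in a promoter set is a constant-time lookup, which is what makes enumerating all pairs affordable in this toy setting.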
Affiliation(s)
- Igor V Deyneko
- K.A. Timiryazev Institute of Plant Physiology RAS, 35 Botanicheskaya Str., Moscow 127276, Russia

2
Zhang L, Yang Y, Chai L, Li Q, Liu J, Lin H, Liu L. A deep learning model to identify gene expression level using cobinding transcription factor signals. Brief Bioinform 2021; 23:6447678. [PMID: 34864886 DOI: 10.1093/bib/bbab501] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/23/2021] [Revised: 10/13/2021] [Accepted: 11/01/2021] [Indexed: 01/02/2023] Open
Abstract
Gene expression is directly controlled by transcription factors (TFs) acting in complex combinations. It remains a challenging task to systematically infer how the cooperative binding of TFs drives gene activity. Here, we quantitatively analyzed the correlation between TFs and surveyed the TF interaction networks associated with gene expression in GM12878 and K562 cell lines. We identified six TF modules associated with gene expression in each cell line. Furthermore, according to the enrichment characteristics of TFs in these TF modules around a target gene, a convolutional neural network model, called TFCNN, was constructed to identify gene expression level. Results showed that the TFCNN model achieved good prediction performance for gene expression. The average area under the receiver operating characteristic curve (AUC) reached up to 0.975 and 0.976 in the GM12878 and K562 cell lines, respectively. By comparison, we found that the TFCNN model outperformed the prediction models based on SVM and LDA, because the TFCNN model could better extract the combinatorial interactions among TFs. Further analysis indicated that the abundant binding of regulatory TFs dominates the expression of target genes, while the cooperative interaction between TFs has a subtle regulatory effect. Gene expression could also be regulated by different TF combinations in a nonlinear way. These results are helpful for deciphering the mechanism by which TF combinations regulate gene expression.
Affiliation(s)
- Lirong Zhang
- School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yanchao Yang
- School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Lu Chai
- School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Qianzhong Li
- School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Junjie Liu
- School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Hao Lin
- School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Li Liu
- School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China

3
Chen Y, He R, Han Z, Wu Y, Wang Q, Zhu X, Huang Z, Ye J, Tang Y, Huang H, Chen J, Shan H, Xiao F. Cooperation of ATF4 and CTCF promotes adipogenesis through transcriptional regulation. Cell Biol Toxicol 2021; 38:741-763. [PMID: 33950334 DOI: 10.1007/s10565-021-09608-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/31/2020] [Accepted: 04/23/2021] [Indexed: 12/12/2022]
Abstract
Adipogenesis is a multi-step process orchestrated by the activation of numerous TFs, whose cooperation and regulatory network remain elusive. Activating transcription factor 4 (ATF4) is critical for adipogenesis, yet its regulatory network remains unclear. Here, we mapped the genome-wide ATF4 binding landscape and its regulatory network by ChIP-seq and RNA-seq and found that ATF4 directly modulated the transcription of genes enriched in fat cell differentiation. Motifs of TFs, especially CTCF, were found in ATF4 binding sites, suggesting a direct role of ATF4 in regulating adipogenesis in association with CTCF and other TFs. Deletion of CTCF attenuated adipogenesis while overexpression enhanced adipocyte differentiation, indicating that CTCF is indispensable for adipogenesis. Intriguingly, combined analysis of ChIP-seq data for these two TFs showed that ATF4 co-localized with CTCF in the promoters of key adipogenic genes, including Cebpd and PPARg, and co-regulated their transactivation. Moreover, ATF4 directly regulated CTCF expression and interacted with CTCF in differentiated 3T3-L1 cells. In vivo, downregulation of ATF4 suppressed the expression of CTCF, Cebpd, and PPARg, leading to reduced adipose tissue expansion in refeeding mice. Consistently, mRNA expression of ATF4 and CTCF was positively correlated in human subcutaneous adipose tissue and inversely associated with BMI, indicating a possible involvement of these two TFs in adipose development. Taken together, our data propose for the first time that ATF4 and CTCF work cooperatively to control adipogenesis and adipose development by orchestrating the transcription of adipogenic genes. Our findings reveal novel therapeutic targets in obesity treatment.
Affiliation(s)
- Yingchun Chen
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China; Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, 53002, People's Republic of China
- Rongquan He
- Department of Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, 530021, People's Republic of China
- Zhiqiang Han
- Department of Plastic and Aesthetic Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, 530021, People's Republic of China
- Yanyan Wu
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China
- Qiuyan Wang
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, 53002, People's Republic of China
- Xiujuan Zhu
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, 53002, People's Republic of China
- Zhiguang Huang
- Center for Genomic and Personalized Medicine, Guangxi Medical University, Nanning, Guangxi Zhuang Autonomous Region, 53002, People's Republic of China
- Juan Ye
- Department of Infectious Diseases, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China
- Yao Tang
- Department of Infectious Diseases, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China
- Hongbin Huang
- Department of Infectious Diseases, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China
- Jianxu Chen
- Department of Infectious Diseases, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China
- Hong Shan
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China
- Fei Xiao
- Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China; Department of Infectious Diseases, the Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, Guangdong Province, 519000, People's Republic of China

4
Ishikawa Y, Pieczonka TD, Bragiel-Pieczonka AM, Seta H, Ohkuri T, Sasanuma Y, Nonaka Y. Long-Term Oral Administration of LLHK, LHK, and HK Alters Gene Expression Profile and Restores Age-Dependent Atrophy and Dysfunction of Rat Salivary Glands. Biomedicines 2020; 8:biomedicines8020038. [PMID: 32093221 PMCID: PMC7168239 DOI: 10.3390/biomedicines8020038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/26/2020] [Revised: 02/16/2020] [Accepted: 02/18/2020] [Indexed: 12/13/2022] Open
Abstract
Xerostomia, also known as dry mouth, is caused by a reduction in salivary secretion and by changes in the composition of saliva associated with the malfunction of salivary glands. Xerostomia decreases quality of life. In the present study, we investigated the effects of peptides derived from β-lactoglobulin C on age-dependent atrophy, gene expression profiles, and the dysfunction of salivary glands. Long-term oral administration of Leu57-Leu58-His59-Lys60 (LLHK), Leu58-His59-Lys60 (LHK) and His59-Lys60 (HK) peptides induced salivary secretion and prevented and/or reversed the age-dependent atrophy of salivary glands in older rats. The transcripts of 78 genes were upregulated and those of 81 genes were downregulated by more than 2.0-fold (p ≤ 0.05) after LHK treatment. LHK upregulated major salivary protein genes such as proline-rich proteins (Prpmp5, Prb3, Prp2, Prb1, Prp15), cystatins (Cst5, Cyss, Vegp2), amylases (Amy1a, Amy2a3), and lysozyme (Lyzl1), suggesting that LLHK, LHK, and HK restored normal salivary function. The AP-2 transcription factor gene (Tcfap2b) was also induced significantly by LHK treatment. These results suggest that LLHK, LHK, and HK-administration may prevent and/or reverse the age-dependent atrophy and functional decline of salivary glands by affecting gene expression.
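The selection rule quoted above (a change of more than 2.0-fold with p ≤ 0.05) can be written as a short filter. This is a minimal sketch under assumptions: the abstract does not say how fold changes or p-values were computed, the gene names and numbers below are invented placeholders, and `classify_genes` is a hypothetical helper rather than the authors' pipeline.

```python
def classify_genes(results, min_fold=2.0, max_p=0.05):
    """Split genes into up- and downregulated lists.

    `results` holds (gene, fold_change, p_value) tuples, where fold_change is the
    treated/control expression ratio (assumed layout, not from the paper).
    """
    up, down = [], []
    for gene, fold, p in results:
        if p > max_p:
            continue  # not significant at the chosen level
        if fold > min_fold:
            up.append(gene)  # more than min_fold-fold increase
        elif fold < 1.0 / min_fold:
            down.append(gene)  # more than min_fold-fold decrease
    return up, down


# Invented placeholder values for illustration only.
toy_results = [
    ("gene_a", 3.1, 0.01),
    ("gene_b", 0.4, 0.03),
    ("gene_c", 2.5, 0.20),
    ("gene_d", 1.3, 0.01),
]
up, down = classify_genes(toy_results)
print(up, down)
```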
Affiliation(s)
- Yasuko Ishikawa
- Department of Medical Pharmacology, Institute of Biomedical Sciences, Tokushima University Graduate School, 3-18-15, Kuramoto-cho, Tokushima 770-8504, Japan
- Correspondence: Tel.: +80-3928-9628
- Tomasz D Pieczonka
- Department of Medical Pharmacology, Institute of Biomedical Sciences, Tokushima University Graduate School, 3-18-15, Kuramoto-cho, Tokushima 770-8504, Japan
- Aneta M Bragiel-Pieczonka
- Department of Medical Pharmacology, Institute of Biomedical Sciences, Tokushima University Graduate School, 3-18-15, Kuramoto-cho, Tokushima 770-8504, Japan
- Harumichi Seta
- Suntory Global Innovation Center Ltd., Suntory World Research Center, 8-1-1 Seika-cho, Soraku-gun, Kyoto 619-0284, Japan
- Tadahiro Ohkuri
- Suntory Global Innovation Center Ltd., Suntory World Research Center, 8-1-1 Seika-cho, Soraku-gun, Kyoto 619-0284, Japan
- Yumi Sasanuma
- Suntory Global Innovation Center Ltd., Suntory World Research Center, 8-1-1 Seika-cho, Soraku-gun, Kyoto 619-0284, Japan
- Yuji Nonaka
- Suntory Global Innovation Center Ltd., Suntory World Research Center, 8-1-1 Seika-cho, Soraku-gun, Kyoto 619-0284, Japan

5
Steuernagel L, Meckbach C, Heinrich F, Zeidler S, Schmitt AO, Gültas M. Computational identification of tissue-specific transcription factor cooperation in ten cattle tissues. PLoS One 2019; 14:e0216475. [PMID: 31095599 PMCID: PMC6522001 DOI: 10.1371/journal.pone.0216475] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/18/2018] [Accepted: 04/22/2019] [Indexed: 01/01/2023] Open
Abstract
Transcription factors (TFs) are a special class of DNA-binding proteins that orchestrate gene transcription by recruiting other TFs, co-activators or co-repressors. Their combinatorial interplay in higher organisms maintains homeostasis and governs cell identity by finely controlling and regulating tissue-specific gene expression. Despite the rich literature on the importance of cooperative TFs for deciphering the mechanisms of individual regulatory programs that control tissue specificity in several organisms such as human, mouse, or Drosophila melanogaster, to date, there is still need for a comprehensive study to detect specific TF cooperations in regulatory processes of cattle tissues. To address the needs of knowledge about specific combinatorial gene regulation in cattle tissues, we made use of three publicly available RNA-seq datasets and obtained tissue-specific gene (TSG) sets for ten tissues (heart, lung, liver, kidney, duodenum, muscle tissue, adipose tissue, colon, spleen and testis). By analyzing these TSG-sets, tissue-specific TF cooperations of each tissue have been identified. The results reveal that similar to the combinatorial regulatory events of model organisms, TFs change their partners depending on their biological functions in different tissues. Particularly with regard to preferential partner choice of the transcription factors STAT3 and NR2C2, this phenomenon has been highlighted with their five different specific cooperation partners in multiple tissues. The information about cooperative TFs could be promising: i) to understand the molecular mechanisms of regulating processes; and ii) to extend the existing knowledge on the importance of single TFs in cattle tissues.
Affiliation(s)
- Lukas Steuernagel
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Cornelia Meckbach
- Institute of Medical Bioinformatics, Goldschmidtstraße 1, University Medical Center Göttingen, Georg-August-University, 37077 Göttingen, Germany
- Felix Heinrich
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Sebastian Zeidler
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Armin O. Schmitt
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075, Göttingen, Germany
- Mehmet Gültas
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075, Göttingen, Germany

6
Li X, Yang J, Zhu S, Li Y, Chen W, Hu Z. Insight into the combinatorial transcriptional regulation on α-amylase gene in animal groups with different dietary nutrient content. Genomics 2019; 112:520-527. [PMID: 30965097 DOI: 10.1016/j.ygeno.2019.04.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/24/2018] [Revised: 03/16/2019] [Accepted: 04/05/2019] [Indexed: 11/25/2022]
Abstract
Gene expression is generally regulated by multiple transcription factors (TFs). Despite previous findings of individual TFs regulating pancreatic α-amylase gene expression, the combinatorial transcriptional regulation is not fully understood. To gain insight into multiple TF regulation for pancreatic α-amylase gene, we employed a function conservation approach to predict interacting TFs regulating pancreatic α-amylase gene for 3 dietary animal groups. To this end, we have identified 77, 25, and 118 interacting TFs for herbivore, omnivore, and carnivore, respectively. Computational modeling of TF regulatory networks demonstrated that known pancreas-specific TFs (e.g. GR, NFAT, and PR) may play important roles in recruiting non pancreas-specific TFs to the TF-TF interaction networks, offering specificity and flexibility for controlling pancreatic α-amylase gene expression in different dietary animal groups. The findings from this study indicate that combinatorial transcriptional regulation could be a critical component controlling pancreatic α-amylase gene expression.
Affiliation(s)
- Xinhui Li
- Pearl River Fisheries Research Institute, Chinese Academy of Fishery Science, Guangzhou 510380, China
- Jiping Yang
- Pearl River Fisheries Research Institute, Chinese Academy of Fishery Science, Guangzhou 510380, China
- Shuli Zhu
- Pearl River Fisheries Research Institute, Chinese Academy of Fishery Science, Guangzhou 510380, China
- Yuefei Li
- Pearl River Fisheries Research Institute, Chinese Academy of Fishery Science, Guangzhou 510380, China
- Weitao Chen
- Pearl River Fisheries Research Institute, Chinese Academy of Fishery Science, Guangzhou 510380, China
- Zihua Hu
- Center for Computational Research, New York State Center of Excellence in Bioinformatics & Life Sciences, State University of New York at Buffalo, Buffalo, NY 14260, USA; Department of Ophthalmology, State University of New York at Buffalo, Buffalo, NY 14260, USA; Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY 14260, USA; Department of Medicine, State University of New York at Buffalo, Buffalo, NY 14260, USA; SUNY Eye Institute, Buffalo, NY 14260, USA

7
Beytebiere JR, Trott AJ, Greenwell BJ, Osborne CA, Vitet H, Spence J, Yoo SH, Chen Z, Takahashi JS, Ghaffari N, Menet JS. Tissue-specific BMAL1 cistromes reveal that rhythmic transcription is associated with rhythmic enhancer-enhancer interactions. Genes Dev 2019; 33:294-309. [PMID: 30804225 PMCID: PMC6411008 DOI: 10.1101/gad.322198.118] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 11/02/2018] [Accepted: 01/02/2019] [Indexed: 12/31/2022]
Abstract
The mammalian circadian clock relies on the transcription factor CLOCK:BMAL1 to coordinate the rhythmic expression of thousands of genes. Consistent with the various biological functions under clock control, rhythmic gene expression is tissue-specific despite an identical clockwork mechanism in every cell. Here we show that BMAL1 DNA binding is largely tissue-specific, likely because of differences in chromatin accessibility between tissues and cobinding of tissue-specific transcription factors. Our results also indicate that BMAL1 ability to drive tissue-specific rhythmic transcription is associated with not only the activity of BMAL1-bound enhancers but also the activity of neighboring enhancers. Characterization of physical interactions between BMAL1 enhancers and other cis-regulatory regions by RNA polymerase II chromatin interaction analysis by paired-end tag (ChIA-PET) reveals that rhythmic BMAL1 target gene expression correlates with rhythmic chromatin interactions. These data thus support that much of BMAL1 target gene transcription depends on BMAL1 capacity to rhythmically regulate a network of enhancers.
Affiliation(s)
- Joshua R Beytebiere
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Alexandra J Trott
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Program of Genetics, Texas A&M University, College Station, Texas 77843, USA
- Ben J Greenwell
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Program of Genetics, Texas A&M University, College Station, Texas 77843, USA
- Collin A Osborne
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Program of Genetics, Texas A&M University, College Station, Texas 77843, USA
- Helene Vitet
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Jessica Spence
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Seung-Hee Yoo
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA
- Zheng Chen
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA
- Joseph S Takahashi
- Department of Neuroscience, Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
- Noushin Ghaffari
- Center for Bioinformatics and Genomic Systems Engineering (CBGSE), Texas A&M AgriLife Research, College Station, Texas 77845, USA
- AgriLife Genomics and Bioinformatics, Texas A&M AgriLife Research, College Station, Texas 77845, USA
- Jerome S Menet
- Department of Biology, Center for Biological Clocks Research, Texas A&M University, College Station, Texas 77843, USA
- Program of Genetics, Texas A&M University, College Station, Texas 77843, USA

8
Zhang L, Xue G, Liu J, Li Q, Wang Y. Revealing transcription factor and histone modification co-localization and dynamics across cell lines by integrating ChIP-seq and RNA-seq data. BMC Genomics 2018; 19:914. [PMID: 30598100 PMCID: PMC6311957 DOI: 10.1186/s12864-018-5278-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 02/06/2023] Open
Abstract
Background: Interactions among transcription factors (TFs) and histone modifications (HMs) play an important role in the precise regulation of gene expression. The context specificity of those interactions, and their dynamics in normal and disease states, remain largely unknown. Recent developments in genomics technology enable transcription profiling by RNA-seq and protein binding profiling by ChIP-seq. Integrative analysis of the two types of data allows us to investigate TF and HM interactions from both genome co-localization and downstream target gene expression. Results: We propose an integrative pipeline to explore the co-localization of 55 TFs and 11 HMs and its dynamics in human GM12878 and K562 cells using matched ChIP-seq and RNA-seq data from ENCODE. We classify TFs and HMs into three types based on their binding enrichment around the transcription start site (TSS). Then a set of statistical indexes is proposed to characterize the TF-TF and TF-HM co-localizations. We found that RAD21, SMC3, and CTCF co-localized across five cell lines. High-resolution Hi-C data in GM12878 show that they associate most of the Hi-C peak loci with a specific CTCF-motif "anchor" and support the view that CTCF, SMC3, and RAD21 co-localization plays an important role in 3D chromatin structure. Meanwhile, 17 TF-TF pairs are highly dynamic between GM12878 and K562. We then build SVM models to correlate high and low expression levels of target genes with TF binding and HM strength. We found that H3K9ac, H3K27ac, and three TFs (ELF1, TAF1, and POL2) are predictive, with an accuracy of about 85 to 92%. Conclusion: We propose a pipeline to analyze the co-localization of TFs and HMs and their dynamics across cell lines from ChIP-seq data, and to investigate their regulatory potency using RNA-seq. The integrative analysis of the two levels of data reveals new insight into the cooperation of TFs and HMs and is helpful in understanding the cell line specificity of TF/HM interactions.
Electronic supplementary material The online version of this article (10.1186/s12864-018-5278-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lirong Zhang
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Gaogao Xue
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Junjie Liu
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Qianzhong Li
- School of Physical Science and Technology, Inner Mongolia University, Hohhot, Inner Mongolia, 010021, China
- Yong Wang
- CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China

9
van Bömmel A, Love MI, Chung HR, Vingron M. coTRaCTE predicts co-occurring transcription factors within cell-type specific enhancers. PLoS Comput Biol 2018; 14:e1006372. [PMID: 30142147 PMCID: PMC6126874 DOI: 10.1371/journal.pcbi.1006372] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 01/12/2018] [Revised: 09/06/2018] [Accepted: 07/17/2018] [Indexed: 02/06/2023] Open
Abstract
Cell-type specific gene expression is regulated by the combinatorial action of transcription factors (TFs). In this study, we predict transcription factor (TF) combinations that cooperatively bind in a cell-type specific manner. We first divide DNase hypersensitive sites into cell-type specifically open vs. ubiquitously open sites in 64 cell types to describe possible cell-type specific enhancers. Based on the pattern contrast between these two groups of sequences we develop "co-occurring TF predictor on Cell-Type specific Enhancers" (coTRaCTE) - a novel statistical method to determine regulatory TF co-occurrences. Contrasting the co-binding of TF pairs between cell-type specific and ubiquitously open chromatin guarantees the high cell-type specificity of the predictions. coTRaCTE predicts more than 2000 co-occurring TF pairs in 64 cell types. The large majority (70%) of these TF pairs is highly cell-type specific and overlaps in TF pair co-occurrence are highly consistent among related cell types. Furthermore, independently validated co-occurring and directly interacting TFs are significantly enriched in our predictions. Focusing on the regulatory network derived from the predicted co-occurring TF pairs in embryonic stem cells (ESCs) we find that it consists of three subnetworks with distinct functions: maintenance of pluripotency governed by OCT4, SOX2 and NANOG, regulation of early development governed by KLF4, STAT3, ZIC3 and ZNF148 and general functions governed by MYC, TCF3 and YY1. In summary, coTRaCTE predicts highly cell-type specific co-occurring TFs which reveal new insights into transcriptional regulatory mechanisms.
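The contrast described above, comparing how often a TF pair co-occurs in cell-type specific sites versus ubiquitously open sites, can be sketched as follows. This is not the coTRaCTE statistic, which the abstract does not specify; a simple odds ratio with a pseudocount is swapped in purely for illustration, and the function name, input layout (one set of TF names per site), and toy data are all assumptions.

```python
from itertools import combinations


def cooccurrence_contrast(specific_sites, ubiquitous_sites, pseudocount=0.5):
    """Return {(tf_a, tf_b): odds ratio} of co-occurrence in specific vs ubiquitous sites.

    Each site is a set of TF names with a predicted binding site there (assumed layout).
    An odds ratio above 1 means the pair co-occurs more often in specific sites.
    """
    tfs = sorted(set().union(*specific_sites, *ubiquitous_sites))

    def both(sites, a, b):
        return sum(1 for site in sites if a in site and b in site)

    ratios = {}
    for a, b in combinations(tfs, 2):
        s_yes = both(specific_sites, a, b)
        u_yes = both(ubiquitous_sites, a, b)
        s_no = len(specific_sites) - s_yes
        u_no = len(ubiquitous_sites) - u_yes
        # The pseudocount keeps the ratio finite when a count is zero.
        ratios[(a, b)] = ((s_yes + pseudocount) * (u_no + pseudocount)) / (
            (s_no + pseudocount) * (u_yes + pseudocount)
        )
    return ratios


# Invented toy sites with placeholder TF names.
specific = [{"TF_A", "TF_B"}, {"TF_A", "TF_B"}, {"TF_C"}]
ubiquitous = [{"TF_A", "TF_C"}, {"TF_B"}, {"TF_C"}]
for pair, ratio in cooccurrence_contrast(specific, ubiquitous).items():
    print(pair, round(ratio, 2))
```

A real analysis would replace the raw odds ratio with a proper significance test and correct for the number of TF pairs examined.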
Affiliation(s)
- Alena van Bömmel
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Michael I. Love
- Department of Biostatistics, Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Ho-Ryun Chung
- Otto Warburg Laboratory, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Philipps-Universität Marburg, Fachbereich Medizin, Institut für Medizinische Bioinformatik und Biostatistik, Marburg, Germany
- Martin Vingron
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany

10
Deng Y, Zheng H, Yan Z, Liao D, Li C, Zhou J, Liao H. Full-Length Transcriptome Survey and Expression Analysis of Cassia obtusifolia to Discover Putative Genes Related to Aurantio-Obtusin Biosynthesis, Seed Formation and Development, and Stress Response. Int J Mol Sci 2018; 19:ijms19092476. [PMID: 30134624 PMCID: PMC6163539 DOI: 10.3390/ijms19092476] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 06/28/2018] [Revised: 08/09/2018] [Accepted: 08/13/2018] [Indexed: 12/23/2022] Open
Abstract
The seed is the pharmaceutical and breeding organ of Cassia obtusifolia, a well-known medicinal herb containing aurantio-obtusin (a kind of anthraquinone) that is also used as food and in landscaping. In order to understand the molecular mechanisms of aurantio-obtusin biosynthesis, seed formation and development, and stress response in C. obtusifolia, it is necessary to understand its genomic information. Although the seed transcriptome of C. obtusifolia has previously been characterized by short-read next-generation sequencing (NGS) technology, the vast majority of the resulting unigenes did not represent full-length cDNA sequences, nor did they supply enough gene expression profile information for the various organs or tissues. In this study, fifteen cDNA libraries, constructed from the seed, root, stem, leaf, and flower (three replicates for each organ) of C. obtusifolia, were sequenced using a hybrid approach combining single-molecule real-time (SMRT) and NGS platforms. More than 4,315,774 long reads with 9.66 Gb of sequencing data and 361,427,021 short reads with 108.13 Gb of sequencing data were generated by the SMRT and NGS platforms, respectively. A total of 67,222 consensus isoforms were clustered from the reads, 81.73% (61,016) of which were longer than 1000 bp. Furthermore, the 67,222 consensus isoforms represented 58,106 nonredundant transcripts, 98.25% (57,092) of which were annotated and 25,573 of which were assigned to specific metabolic pathways by KEGG. The CoDXS and CoDXR genes were directly used for functional characterization to validate the accuracy of the sequences obtained from the transcriptome. A total of 658 seed-specific transcripts indicated special roles in physiological processes in the seed. Analysis of transcripts involved in the early stage of anthraquinone biosynthesis suggested that the aurantio-obtusin in C. obtusifolia was mainly generated from the isochorismate and mevalonate/methylerythritol phosphate (MVA/MEP) pathways, and that three reactions catalyzed by menaquinone-specific isochorismate synthase (ICS), 1-deoxy-d-xylulose-5-phosphate synthase (DXS), and isopentenyl diphosphate (IPPS) might be the limiting steps. Several seed-specific CYPs, SAM-dependent methyltransferases, and UDP-glycosyltransferases (UDPG) supplied promising candidate genes for the late stage of anthraquinone biosynthesis. In addition, four seed-specific transcription factors (three MYB transcription factors and one MADS-box transcription factor) and alternative splicing might be involved in seed formation and development. Meanwhile, most members of the Hsp20 gene family showed high expression levels in seed and flower, seven of which might have chaperone activities under various abiotic stresses. Finally, the expression patterns of genes of particular interest showed similar trends in both the transcriptome assay and qRT-PCR. In conclusion, this is the first full-length transcriptome sequencing reported in the Caesalpiniaceae family, and it thus provides a more complete insight into aurantio-obtusin biosynthesis, seed formation and development, and stress response in C. obtusifolia.
Affiliation(s)
- Yin Deng: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
- Hui Zheng: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
- Zicheng Yan: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
- Dongying Liao: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
- Chaolin Li: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
- Jiayu Zhou: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
- Hai Liao: School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China.
|
11
|
Meckbach C, Wingender E, Gültas M. Removing Background Co-occurrences of Transcription Factor Binding Sites Greatly Improves the Prediction of Specific Transcription Factor Cooperations. Front Genet 2018; 9:189. [PMID: 29896218 PMCID: PMC5986914 DOI: 10.3389/fgene.2018.00189] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 02/16/2016] [Accepted: 05/08/2018] [Indexed: 12/17/2022] Open
Abstract
Today, it is well-known that in eukaryotic cells the complex interplay of transcription factors (TFs) bound to the DNA of promoters and enhancers is the basis for precise and specific control of transcription. Computational methods have been developed for the identification of potentially cooperating TFs through the co-occurrence of their binding sites (TFBSs). One challenge of these methods is the differentiation of TFBS pairs that are specific for a given sequence set from those that are ubiquitously appearing, rendering the results highly dependent on the choice of a proper background set. Here, we present an extension of our previous PC-TraFF approach that estimates the background co-occurrence of any TF pair by preserving the (oligo-) nucleotide composition and, thus, the core of TFBSs in the sequences of interest. Applying our approach to a simulated data set with implanted TFBS pairs, we could successfully identify them as sequence-set specific under a variety of conditions. When we analyzed the gene expression data sets of five breast cancer associated subtypes, the number of overlapping pairs could be dramatically reduced in comparison to our previous approach. As a result, we could identify potentially cooperating transcriptional regulators that are characteristic for each of the five breast cancer subtypes. This indicates that our approach is able to discriminate specific potential TF cooperations against ubiquitously occurring combinations. The results obtained with our method may help to understand the genetic programs governing specific biological processes such as the development of different tumor types.
Affiliation(s)
- Cornelia Meckbach: Institute of Bioinformatics, University Medical Center Göttingen, Georg-August-University Göttingen, Göttingen, Germany
- Edgar Wingender: Institute of Bioinformatics, University Medical Center Göttingen, Georg-August-University Göttingen, Göttingen, Germany
- Mehmet Gültas: Institute of Bioinformatics, University Medical Center Göttingen, Georg-August-University Göttingen, Göttingen, Germany; Department of Breeding Informatics, Georg-August University Göttingen, Göttingen, Germany; Center for Integrated Breeding Research (CiBreed), Georg-August University Göttingen, Göttingen, Germany
|
12
|
Uygun S, Seddon AE, Azodi CB, Shiu SH. Predictive Models of Spatial Transcriptional Response to High Salinity. Plant Physiology 2017; 174:450-464. [PMID: 28373393 PMCID: PMC5411138 DOI: 10.1104/pp.16.01828] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/02/2016] [Accepted: 03/27/2017] [Indexed: 05/12/2023]
Abstract
Plants are exposed to a variety of environmental conditions, and their ability to respond to environmental variation depends on the proper regulation of gene expression in an organ-, tissue-, and cell type-specific manner. Although our knowledge of how stress responses are regulated is accumulating, a genome-wide model of how plant transcription factors (TFs) and cis-regulatory elements control spatially specific stress response has yet to emerge. Using Arabidopsis (Arabidopsis thaliana) as a model, we identified a set of 1,894 putative cis-regulatory elements (pCREs) that are associated with high-salinity (salt) up-regulated genes in the root or the shoot. We used these pCREs to develop computational models that can better predict salt up-regulated genes in the root and shoot compared with models based on known TF binding motifs. In addition, we incorporated TF binding sites identified via large-scale in vitro assays, chromatin accessibility, evolutionary conservation, and pCRE combinatorial relationships in machine learning models and found that only consideration of pCRE combinations led to better performance in salt up-regulation prediction in the root and shoot. Our results suggest that the plant organ transcriptional response to high salinity is regulated by a core set of pCREs and provide a genome-wide view of the cis-regulatory code of plant spatial transcriptional responses to environmental stress.
Affiliation(s)
- Sahra Uygun: Genetics Program (S.U., S.-H.S.), Department of Plant Biology (A.E.S., C.B.A., S.-H.S.), and Ecology, Evolutionary Biology, and Behavior Program (S.-H.S.), Michigan State University, East Lansing, Michigan 48824
- Alexander E Seddon: Genetics Program (S.U., S.-H.S.), Department of Plant Biology (A.E.S., C.B.A., S.-H.S.), and Ecology, Evolutionary Biology, and Behavior Program (S.-H.S.), Michigan State University, East Lansing, Michigan 48824
- Christina B Azodi: Genetics Program (S.U., S.-H.S.), Department of Plant Biology (A.E.S., C.B.A., S.-H.S.), and Ecology, Evolutionary Biology, and Behavior Program (S.-H.S.), Michigan State University, East Lansing, Michigan 48824
- Shin-Han Shiu: Genetics Program (S.U., S.-H.S.), Department of Plant Biology (A.E.S., C.B.A., S.-H.S.), and Ecology, Evolutionary Biology, and Behavior Program (S.-H.S.), Michigan State University, East Lansing, Michigan 48824
|
13
|
Prabahar A, Natarajan J. Prediction of microRNAs involved in immune system diseases through network based features. J Biomed Inform 2016; 65:34-45. [PMID: 27871823 DOI: 10.1016/j.jbi.2016.11.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 07/12/2016] [Revised: 09/28/2016] [Accepted: 11/13/2016] [Indexed: 12/13/2022]
Abstract
MicroRNAs (miRNAs) are a class of small non-coding regulatory RNA molecules that modulate the expression of several genes at the post-transcriptional level and play a vital role in disease pathogenesis. Recent research shows that a range of miRNAs are involved in the regulation of immunity and that their deregulation results in immune-mediated diseases such as cancer, inflammation, and autoimmune diseases. Computational discovery of these immune miRNAs using a set of specific features is highly desirable. In the current investigation, we present an SVM-based classification system that uses a set of novel network-based topological and motif features, in addition to the baseline sequential and structural features, to distinguish immune-specific miRNAs from non-immune miRNAs. The classifier was trained and tested on a balanced set with equal numbers of positive and negative examples to show the discriminative power of our network features. Experimental results show that our approach achieves an accuracy of 90.2% and outperforms the classification accuracy of 63.2% reported using the traditional miRNA sequential and structural features. The proposed classifier was further validated on two immune disease sub-class datasets, multiple sclerosis microarray data and psoriasis RNA-seq data, with higher accuracy. These results indicate that our classifier, which uses network and motif features along with sequential and structural features, will lead to significant improvement in classifying immune miRNAs and hence can be applied to identify other specific classes of miRNAs as an extensible miRNA classification system.
Affiliation(s)
- Archana Prabahar: Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore 641 046, India.
- Jeyakumar Natarajan: Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore 641 046, India.
|
14
|
Wu WS, Lai FJ. Detecting Cooperativity between Transcription Factors Based on Functional Coherence and Similarity of Their Target Gene Sets. PLoS One 2016; 11:e0162931. [PMID: 27623007 PMCID: PMC5021274 DOI: 10.1371/journal.pone.0162931] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/08/2016] [Accepted: 08/30/2016] [Indexed: 11/22/2022] Open
Abstract
In eukaryotic cells, transcriptional regulation of gene expression is usually achieved by cooperative transcription factors (TFs). Therefore, knowing cooperative TFs is the first step toward uncovering the molecular mechanisms of gene expression regulation. Many algorithms based on different rationales have been proposed to predict cooperative TF pairs in yeast. Although various types of rationales have been used in the existing algorithms, functional coherence is not yet used. This prompts us to develop a new algorithm based on functional coherence and similarity of the target gene sets to identify cooperative TF pairs in yeast. The proposed algorithm predicted 40 cooperative TF pairs. Among them, three (Pdc2-Thi2, Hot1-Msn1 and Leu3-Met28) are novel predictions, which have not been predicted by any existing algorithms. Strikingly, two (Pdc2-Thi2 and Hot1-Msn1) of the three novel predictions have been experimentally validated, demonstrating the power of the proposed algorithm. Moreover, we show that the predictions of the proposed algorithm are more biologically meaningful than the predictions of 17 existing algorithms under four evaluation indices. In summary, our study suggests that new algorithms based on novel rationales are worthy of developing for detecting previously unidentifiable cooperative TF pairs.
Affiliation(s)
- Wei-Sheng Wu: Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Fu-Jou Lai: Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
|
15
|
A Derived Allosteric Switch Underlies the Evolution of Conditional Cooperativity between HOXA11 and FOXO1. Cell Rep 2016; 15:2097-2108. [DOI: 10.1016/j.celrep.2016.04.088] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 07/24/2015] [Revised: 02/23/2016] [Accepted: 04/26/2016] [Indexed: 12/11/2022] Open
|
16
|
Zeidler S, Meckbach C, Tacke R, Raad FS, Roa A, Uchida S, Zimmermann WH, Wingender E, Gültas M. Computational Detection of Stage-Specific Transcription Factor Clusters during Heart Development. Front Genet 2016; 7:33. [PMID: 27047536 PMCID: PMC4804722 DOI: 10.3389/fgene.2016.00033] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/05/2015] [Accepted: 02/23/2016] [Indexed: 12/28/2022] Open
Abstract
Transcription factors (TFs) regulate gene expression in living organisms. In higher organisms, TFs often interact in non-random combinations with each other to control gene transcription. Understanding these interactions is key to deciphering the mechanisms underlying tissue development. The aim of this study was to analyze co-occurring transcription factor binding sites (TFBSs) in a time series dataset from a new cell-culture model of human heart muscle development in order to identify common as well as specific co-occurring TFBS pairs in the promoter regions of regulated genes, which can be essential to enhance cardiac tissue developmental processes. To this end, we separated the available RNAseq dataset into five temporally defined groups: (i) mesoderm induction stage; (ii) early cardiac specification stage; (iii) late cardiac specification stage; (iv) early cardiac maturation stage; (v) late cardiac maturation stage, where each of these stages is characterized by unique differentially expressed genes (DEGs). To identify TFBS pairs for each stage, we applied the MatrixCatch algorithm, which is a successful method to deduce experimentally described TFBS pairs in the promoters of the DEGs. Although the DEGs in each stage are distinct, our results show that the TFBS pair networks predicted by MatrixCatch for all stages are quite similar. Thus, we extend the results of MatrixCatch utilizing a Markov clustering algorithm (MCL) to perform network analysis. Using our extended approach, we are able to separate the TFBS pair networks into several clusters to highlight stage-specific co-occurrences between TFBSs. Our approach has revealed clusters that are either common (NFAT or HMGIY clusters) or specific (SMAD or AP-1 clusters) for the individual stages. Several of these clusters are likely to play an important role during cardiomyogenesis. Further, we have shown that the related TFs of TFBSs in the clusters indicate potential synergistic or antagonistic interactions to switch between different stages. Additionally, our results suggest that cardiomyogenesis follows the hourglass model, which was already proven for Arabidopsis and some vertebrates. This investigation helps us to get a better understanding of how each stage of cardiomyogenesis is affected by different combinations of TFs. Such knowledge may help to understand basic principles of stem cell differentiation into cardiomyocytes.
Affiliation(s)
- Sebastian Zeidler: University Medical Center Göttingen, Institute of Bioinformatics, Georg-August-University Göttingen, Göttingen, Germany; Heart Research Center Göttingen, University Medical Center Göttingen, Institute of Pharmacology and Toxicology, Georg-August-University Göttingen, Göttingen, Germany; DZHK (German Centre for Cardiovascular Research), Göttingen, Germany
- Cornelia Meckbach: University Medical Center Göttingen, Institute of Bioinformatics, Georg-August-University Göttingen, Göttingen, Germany
- Rebecca Tacke: University Medical Center Göttingen, Institute of Bioinformatics, Georg-August-University Göttingen, Göttingen, Germany
- Farah S Raad: Heart Research Center Göttingen, University Medical Center Göttingen, Institute of Pharmacology and Toxicology, Georg-August-University Göttingen, Göttingen, Germany; DZHK (German Centre for Cardiovascular Research), Göttingen, Germany
- Angelica Roa: Heart Research Center Göttingen, University Medical Center Göttingen, Institute of Pharmacology and Toxicology, Georg-August-University Göttingen, Göttingen, Germany; DZHK (German Centre for Cardiovascular Research), Göttingen, Germany
- Shizuka Uchida: Institute of Cardiovascular Regeneration, Goethe University Frankfurt, Frankfurt, Germany; DZHK (German Centre for Cardiovascular Research), Frankfurt, Germany
- Wolfram-Hubertus Zimmermann: Heart Research Center Göttingen, University Medical Center Göttingen, Institute of Pharmacology and Toxicology, Georg-August-University Göttingen, Göttingen, Germany; DZHK (German Centre for Cardiovascular Research), Göttingen, Germany
- Edgar Wingender: University Medical Center Göttingen, Institute of Bioinformatics, Georg-August-University Göttingen, Göttingen, Germany; DZHK (German Centre for Cardiovascular Research), Göttingen, Germany
- Mehmet Gültas: University Medical Center Göttingen, Institute of Bioinformatics, Georg-August-University Göttingen, Göttingen, Germany
|
17
|
Pieczonka TD, Bragiel AM, Horikawa H, Fukuta K, Yoshioka M, Ishikawa Y. Long-term administration of whey alters atrophy, gene expression profiles and dysfunction of salivary glands in elderly rats. J Funct Foods 2016. [DOI: 10.1016/j.jff.2015.12.027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022] Open
|
18
|
Meckbach C, Tacke R, Hua X, Waack S, Wingender E, Gültas M. PC-TraFF: identification of potentially collaborating transcription factors using pointwise mutual information. BMC Bioinformatics 2015; 16:400. [PMID: 26627005 PMCID: PMC4667426 DOI: 10.1186/s12859-015-0827-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 03/24/2015] [Accepted: 11/17/2015] [Indexed: 01/06/2023] Open
Abstract
Background: Transcription factors (TFs) are important regulatory proteins that govern transcriptional regulation. Today, it is known that in higher organisms different TFs have to cooperate rather than act individually in order to control complex genetic programs. The identification of these interactions is an important challenge for understanding the molecular mechanisms that regulate biological processes. In this study, we present a new method based on pointwise mutual information, PC-TraFF, which considers the genome as a document, the sequences as sentences, and TF binding sites (TFBSs) as words to identify interacting TFs in a set of sequences. Results: To demonstrate the effectiveness of PC-TraFF, we performed a genome-wide analysis and a breast cancer-associated sequence set analysis for protein coding and miRNA genes. Our results show that in any of these sequence sets, PC-TraFF is able to identify important interacting TF pairs, for most of which we found support in previously published experimental results. Further, we made a pairwise comparison between PC-TraFF and three conventional methods. The outcome of this comparison study strongly suggests that all these methods focus on different important aspects of interaction between TFs and thus the pairwise overlap between any of them is only marginal. Conclusions: In this study, adopting an idea from the field of linguistics in the field of bioinformatics, we develop a new information theoretic method, PC-TraFF, for the identification of potentially collaborating transcription factors based on the idiosyncrasy of their binding site distributions on the genome. The results of our study show that PC-TraFF can successfully identify known interacting TF pairs, and thus its currently biologically unconfirmed predictions could provide new hypotheses for further experimental validation. Additionally, the comparison of the results of PC-TraFF with the results of previous methods demonstrates that different methods with their specific scopes can supplement each other well. Overall, our analyses indicate that PC-TraFF is a time-efficient method whose algorithm has a tractable computational time and memory consumption. The PC-TraFF server is freely accessible at http://pctraff.bioinf.med.uni-goettingen.de/ Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0827-2) contains supplementary material, which is available to authorized users.
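The pointwise mutual information idea described in this abstract can be illustrated with a small sketch. It scores each TF pair by how much more often the two TFs occur in the same sequence than their individual frequencies would predict. This is a simplified, sequence-level illustration and not the PC-TraFF implementation; the function name `pmi_scores` and the input format (one set of TF names per sequence) are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(sequences):
    """Return {(tf_a, tf_b): PMI} for TF pairs that co-occur at least once.

    `sequences` is a list of sets; each set holds the names of the TFs
    with a predicted binding site in one sequence (hypothetical format).
    """
    n = len(sequences)
    single = Counter()  # number of sequences containing each TF
    pair = Counter()    # number of sequences containing both TFs of a pair
    for tfs in sequences:
        names = sorted(set(tfs))
        single.update(names)
        pair.update(combinations(names, 2))

    scores = {}
    for (a, b), count in pair.items():
        p_ab = count / n
        p_a = single[a] / n
        p_b = single[b] / n
        # PMI = log2( p(a, b) / (p(a) * p(b)) )
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores
```

A positive score means the pair co-occurs more often than expected under independence, zero means exactly as often, and a negative score means less often.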
Affiliation(s)
- Cornelia Meckbach: Institute of Bioinformatics, University of Göttingen, Goldschmidtstr. 1, Göttingen, 37077, Germany.
- Rebecca Tacke: Institute of Bioinformatics, University of Göttingen, Goldschmidtstr. 1, Göttingen, 37077, Germany.
- Xu Hua: Institute of Bioinformatics, University of Göttingen, Goldschmidtstr. 1, Göttingen, 37077, Germany.
- Stephan Waack: Institute of Computer Science, University of Göttingen, Goldschmidtstr. 7, Göttingen, 37077, Germany.
- Edgar Wingender: Institute of Bioinformatics, University of Göttingen, Goldschmidtstr. 1, Göttingen, 37077, Germany.
- Mehmet Gültas: Institute of Bioinformatics, University of Göttingen, Goldschmidtstr. 1, Göttingen, 37077, Germany.
|
19
|
Malhotra S, Sowdhamini R. Interactions Among Plant Transcription Factors Regulating Expression of Stress-responsive Genes. Bioinform Biol Insights 2014; 8:193-8. [PMID: 25249757 PMCID: PMC4167486 DOI: 10.4137/bbi.s16313] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 04/21/2014] [Revised: 06/05/2014] [Accepted: 06/06/2014] [Indexed: 12/23/2022] Open
Abstract
Plants are simultaneously subjected to a variety of stress conditions in the field and are known to combat these hostile conditions by up- or down-regulating a number of genes. There is a significant level of cross-talk between different stress responses in plants. In this study, we predict the interacting pairs of transcription factors (TFs) that regulate the multiple abiotic stress-responsive genes in the plant Arabidopsis thaliana. We identified the interacting pair(s) of TFs based on the spatial proximity of their binding sites. We also examined the interactions between the predicted pairs of TFs using molecular docking. After docking, the best interaction pose was selected using our scoring scheme DockScore, which ranks the docked solutions based on several interface parameters and aims to find optimal interactions between proteins. We analyzed the selected docked pose for the interface residues and their conservation.
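The proximity criterion mentioned in this abstract (pairing TFs whose binding sites lie close together) can be sketched as follows. This is only an illustration under stated assumptions and not the authors' procedure: the 50 bp default threshold, the function name `proximal_pairs`, and the input format (a list of TF name and position tuples for one promoter) are all hypothetical.

```python
from itertools import combinations


def proximal_pairs(sites, max_gap=50):
    """Return the set of TF pairs with binding sites within `max_gap` bp.

    `sites` is a list of (tf_name, position) tuples for one promoter.
    `max_gap` is an illustrative threshold, not a value from the study.
    """
    pairs = set()
    for (tf1, pos1), (tf2, pos2) in combinations(sites, 2):
        # Skip two sites of the same TF; keep pairs of distinct TFs only.
        if tf1 != tf2 and abs(pos1 - pos2) <= max_gap:
            pairs.add(tuple(sorted((tf1, tf2))))
    return pairs
```

Pairs found this way would only be candidates; the abstract describes molecular docking as the subsequent step for examining them.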
Affiliation(s)
- Sony Malhotra: National Centre for Biological Sciences (TIFR), Bangalore, India
- R Sowdhamini: National Centre for Biological Sciences (TIFR), Bangalore, India
|
20
|
Srivastava P, Mangal M, Agarwal SM. Understanding the transcriptional regulation of cervix cancer using microarray gene expression data and promoter sequence analysis of a curated gene set. Gene 2013; 535:233-8. [PMID: 24291025 DOI: 10.1016/j.gene.2013.11.028] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/27/2013] [Revised: 11/12/2013] [Accepted: 11/15/2013] [Indexed: 02/08/2023]
Abstract
Cervical cancer, the malignant neoplasm of the cervix uteri, is the second most common cancer among women worldwide and the leading cancer in India. Several factors are responsible for causing cervical cancer; they alter the expression of oncogenic genes, resulting in up- or down-regulation of gene expression and inactivation of tumor-suppressor genes or gene products. Gene expression is regulated by interactions between transcription factors (TFs) and specific regulatory elements in the promoter regions of target genes. Thus, it is important to decipher and analyze the TFs that bind to regulatory regions of disease genes and regulate their expression. In the present study, computational methods combining gene expression data from microarray experiments with promoter sequence analysis of a curated set of genes involved in cervical cancer causation were used to identify potential regulatory elements. Consensus predictions of the two approaches led to the identification of twelve TFs that might be crucial to the regulation of cervical cancer progression. Subsequently, TF enrichment and Oncomine expression analysis suggested that the transcription factor family E2F plays an important role in the regulation of genes involved in cervical carcinogenesis. Our results suggest that E2F possesses diagnostic/prognostic value and can act as a potential drug target in cervical cancer.
Affiliation(s)
- Prashant Srivastava: Integrative Genomics and Medicine, MRC Clinical Sciences, Imperial College, London, UK
- Manu Mangal: Bioinformatics Division, Institute of Cytology and Preventive Oncology, Noida-201301, India
- Subhash Mohan Agarwal: Bioinformatics Division, Institute of Cytology and Preventive Oncology, Noida-201301, India.
|
21
|
Li H, Chen D, Zhang J. Statistical analysis of combinatorial transcriptional regulatory motifs in human intron-containing promoter sequences. Comput Biol Chem 2013; 43:35-45. [DOI: 10.1016/j.compbiolchem.2012.12.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/23/2012] [Revised: 12/19/2012] [Accepted: 12/23/2012] [Indexed: 11/16/2022]
|
22
|
[Analysis of transcriptional regulatory sites in introns of human and mouse ribosomal protein genes]. YI CHUAN = HEREDITAS 2012; 34:1577-82. [PMID: 23262105 DOI: 10.3724/sp.j.1005.2012.01577] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
Abstract
Previous studies from oligonucleotides in the ribosomal protein (RP) genes of the yeast and fruitfly indicated that the potential transcriptional regulatory sites are located in the introns of the genes. The transcriptional regulatory sites in introns are still poorly understood. To explore the functional significance of transcriptional regulation of introns, we extracted over-represented oligonucleotides (also known as motifs) in the first introns of the human and mouse ribosomal protein genes by statistical comparative analysis, and found that over 85% of these oligonucleotides were consistent with the known transcriptional factor binding sites, which might be potential transcriptional regulatory elements. By analyzing the base compositions of these elements, we found that a majority (>95%) of the detected motifs were rich in C and G and only a few of them were rich in A and T. Moreover, the oligonucleotides were close to the 5'-ends of the first introns (the distances between the motifs and the transcriptional start sites or upstream regions of genes are short). We speculated that the properties of over-represented motifs in the first introns might be associated with the transcriptional control.
|
23
|
Girgis HZ, Ovcharenko I. Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs. BMC Bioinformatics 2012; 13:25. [PMID: 22313678 PMCID: PMC3359238 DOI: 10.1186/1471-2105-13-25] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 09/16/2011] [Accepted: 02/07/2012] [Indexed: 12/26/2022] Open
Abstract
Background: Researchers seeking to unlock the genetic basis of human physiology and diseases have been studying gene transcription regulation. The temporal and spatial patterns of gene expression are controlled by mainly non-coding elements known as cis-regulatory modules (CRMs) and epigenetic factors. CRMs modulating related genes share the regulatory signature which consists of transcription factor (TF) binding sites (TFBSs). Identifying such CRMs is a challenging problem due to the prohibitive number of sequence sets that need to be analyzed. Results: We formulated the challenge as a supervised classification problem even though experimentally validated CRMs were not required. Our efforts resulted in a software system named CrmMiner. The system mines for CRMs in the vicinity of related genes. CrmMiner requires two sets of sequences: a mixed set and a control set. Sequences in the vicinity of the related genes comprise the mixed set, whereas the control set includes random genomic sequences. CrmMiner assumes that a large percentage of the mixed set is made of background sequences that do not include CRMs. The system identifies pairs of closely located motifs representing vertebrate TFBSs that are enriched in the training mixed set consisting of 50% of the gene loci. In addition, CrmMiner selects a group of the enriched pairs to represent the tissue-specific regulatory signature. The mixed and the control sets are searched for candidate sequences that include any of the selected pairs. Next, an optimal Bayesian classifier is used to distinguish candidates found in the mixed set from their control counterparts. Our study proposes 62 tissue-specific regulatory signatures and putative CRMs for different human tissues and cell types. These signatures consist of assortments of ubiquitously expressed TFs and tissue-specific TFs. Under controlled settings, CrmMiner identified known CRMs in noisy sets up to 1:25 signal-to-noise ratio. CrmMiner was 21-75% more precise than a related CRM predictor. The sensitivity of the system to locate known human heart enhancers reached up to 83%. CrmMiner precision reached 82% while mining for CRMs specific to the human CD4+ T cells. On several data sets, the system achieved 99% specificity. Conclusion: These results suggest that CrmMiner predictions are accurate and likely to be tissue-specific CRMs. We expect that the predicted tissue-specific CRMs and the regulatory signatures broaden our knowledge of gene transcription regulation.
Affiliation(s)
- Hani Z Girgis: Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 9600 Rockville Pike, Bethesda, MD 20896, USA
|
24
|
Myšičková A, Vingron M. Detection of interacting transcription factors in human tissues using predicted DNA binding affinity. BMC Genomics 2012; 13 Suppl 1:S2. [PMID: 22369666 PMCID: PMC3583127 DOI: 10.1186/1471-2164-13-s1-s2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023] Open
Abstract
Background: Tissue-specific gene expression is generally regulated by combinatorial interactions among transcription factors (TFs) which bind to the DNA. Despite this known fact, previous discoveries of the mechanism that controls gene expression usually consider only a single TF. Results: We provide a prediction of interacting TFs in 22 human tissues based on their DNA-binding affinity in promoter regions. We analyze all possible pairs of 130 vertebrate TFs from the JASPAR database. First, all human promoter regions are scanned for single TF-DNA binding affinities with TRAP and for each TF a ranked list of all promoters ordered by the binding affinity is created. We then study the similarity of the ranked lists and detect candidates for TF-TF interaction by applying a partial independence test for multiway contingency tables. Our candidates are validated by both known protein-protein interactions (PPIs) and known gene regulation mechanisms in the selected tissue. We find that the known PPIs are significantly enriched in the groups of our predicted TF-TF interactions (2 and 7 times more common than expected by chance). In addition, the predicted interacting TFs for studied tissues (liver, muscle, hematopoietic stem cell) are supported in literature to be active regulators or to be expressed in the corresponding tissue. Conclusions: The findings from this study indicate that tissue-specific gene expression is regulated by one or two central regulators and a large number of TFs interacting with these central hubs. Our results are in agreement with recent experimental studies.
Affiliation(s)
- Alena Myšičková
- Max Planck Institute for Molecular Genetics, Ihnestr 73, 14195 Berlin, Germany.
|
25
|
Xu J, Li CX, Lv JY, Li YS, Xiao Y, Shao TT, Huo X, Li X, Zou Y, Han QL, Li X, Wang LH, Ren H. Prioritizing candidate disease miRNAs by topological features in the miRNA target-dysregulated network: case study of prostate cancer. Mol Cancer Ther 2011; 10:1857-66. [PMID: 21768329 DOI: 10.1158/1535-7163.mct-11-0055] [Citation(s) in RCA: 172] [Impact Index Per Article: 13.2] [Indexed: 11/16/2022]
Abstract
Recently, microRNAs (miRNA), small noncoding RNAs, have taken center stage in the field of human molecular oncology. However, their roles in tumor biology remain largely unknown. According to the assumption that miRNAs implicated in a specific tumor phenotype will show aberrant regulation of their target genes, we introduce an approach based on the miRNA target-dysregulated network (MTDN) to prioritize novel disease miRNAs. Target genes have predicted binding sites for any miRNA. The MTDN is constructed by combining computational target prediction with miRNA and mRNA expression profiles in tumor and nontumor tissues. Application of the proposed method to prostate cancer reveals that known prostate cancer miRNAs are characterized by a greater number of dysregulations and coregulators, by a tendency to coregulate with each other, and by a higher proportion of targets shared with other prostate cancer miRNAs. A support vector machine classifier, based on these features and changes in miRNA expression, is constructed and gives an average overall prediction accuracy of 0.8872 in cross-validation tests. The classifier is then applied to miRNAs in the MTDN. Functions enriched by dysregulated targets of novel predicted miRNAs are closely associated with oncogenesis. In addition, predicted cancer miRNAs within families or from different families show combinatorial dysregulation of target genes, as revealed by analysis of the MTDN modular organization. Finally, 3 miRNA target regulations are verified to hold in prostate cancer cells by transfection assays. These results show that the network-centric method could prioritize novel disease miRNAs and model how oncogenic lesions are mediated by miRNAs, providing important insights into tumorigenesis.
Affiliation(s)
- Juan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
|
26
|
Gu Q, Nagaraj SH, Hudson NJ, Dalrymple BP, Reverter A. Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle. BMC Genomics 2011; 12:23. [PMID: 21226902 PMCID: PMC3025955 DOI: 10.1186/1471-2164-12-23] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 09/14/2010] [Accepted: 01/12/2011] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND: Gene regulation by transcription factors (TF) is species, tissue and time specific. To better understand how the genetic code controls gene expression in bovine muscle, we associated gene expression data from developing Longissimus thoracis et lumborum skeletal muscle with bovine promoter sequence information. RESULTS: We created a highly conserved genome-wide promoter landscape comprising 87,408 interactions relating 333 TFs with their 9,242 predicted target genes (TGs). We discovered that the complete set of predicted TGs share an average of 2.75 predicted TF binding sites (TFBSs) and that the average co-expression between a TF and its predicted TGs is higher than the average co-expression between the same TF and all genes. Conversely, pairs of TFs sharing predicted TGs showed a higher co-expression correlation than pairs of TFs not sharing TGs. Finally, we exploited the co-occurrence of predicted TFBS in the context of muscle-derived functionally-coherent modules including cell cycle, mitochondria, immune system, fat metabolism, muscle/glycolysis, and ribosome. Our findings enabled us to reverse engineer a regulatory network of core processes, and correctly identified the involvement of E2F1, GATA2 and NFKB1 in the regulation of cell cycle, fat, and muscle/glycolysis, respectively. CONCLUSION: The pivotal implication of our research is two-fold: (1) there exists a robust genome-wide expression signal between TFs and their predicted TGs in cattle muscle consistent with the extent of promoter sharing; and (2) this signal can be exploited to recover the cellular mechanisms underpinning transcription regulation of muscle structure and development in bovine. Our study represents the first genome-wide report linking tissue specific co-expression to co-regulation in a non-model vertebrate.
Affiliation(s)
- Quan Gu
- Computational and Systems Biology, CSIRO Food Futures Flagship and CSIRO Livestock Industries, 306 Carmody Rd, St. Lucia, Brisbane, Queensland 4067, Australia
|