1
Xin R, Cheng Q, Chi X, Feng X, Zhang H, Wang Y, Duan M, Xie T, Song X, Yu Q, Fan Y, Huang L, Zhou F. Computational Characterization of Undifferentially Expressed Genes with Altered Transcription Regulation in Lung Cancer. Genes (Basel) 2023; 14:2169. [PMID: 38136991 PMCID: PMC10742656 DOI: 10.3390/genes14122169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 11/19/2023] [Accepted: 11/27/2023] [Indexed: 12/24/2023] Open
Abstract
A transcriptome profiles the expression levels of genes in cells, and a huge amount of public transcriptome data has accumulated. Most existing biomarker-related studies investigated the differential expression of individual transcriptomic features under the assumption of inter-feature independence. Many transcriptomic features without differential expression were excluded from the biomarker lists. This study proposed a computational analysis protocol (mqTrans) to analyze transcriptomes from the view of high-dimensional inter-feature correlations. The mqTrans protocol trained a regression model to predict the expression of an mRNA feature from those of the transcription factors (TFs). The difference between the predicted and real expression of an mRNA feature in a query sample was defined as the mqTrans feature. The new mqTrans view facilitated the detection of thirteen transcriptomic features with differentially expressed mqTrans features, but without differential expression in the original transcriptomic values, in three independent datasets of lung cancer. These features were called dark biomarkers because they would have been ignored in a conventional differential analysis. The detailed discussion of one dark biomarker, GBP5, and additional validation experiments suggested that the overlapping long non-coding RNAs might have contributed to this interesting phenomenon. In summary, this study aimed to find undifferentially expressed genes with significantly changed mqTrans values in lung cancer. These genes were usually ignored in most biomarker detection studies because of their undifferential expression. However, their differentially expressed mqTrans values in three independent datasets suggested their strong associations with lung cancer.
Affiliation(s)
- Ruihao Xin
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
- Qian Cheng
  - Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
- Xiaohang Chi
  - Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
- Xin Feng
  - School of Science, Jilin Institute of Chemical Technology, Jilin 132000, China
  - Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun 130012, China
- Hang Zhang
  - Jilin Institute of Chemical Technology, College of Information and Control Engineering, Jilin 132000, China
- Yueying Wang
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Meiyu Duan
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Tunyang Xie
  - Centre for Mathematical Sciences, University of Cambridge, Wilberforce Road, Cambridge CB3 0WA, UK
- Xiaonan Song
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Software, Jilin University, Changchun 130012, China
- Qiong Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Jilin University, Changchun 130012, China
- Yusi Fan
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Software, Jilin University, Changchun 130012, China
- Lan Huang
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Fengfeng Zhou
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
  - School of Biology and Engineering, Guizhou Medical University, Guiyang 550025, China
2
Pan-cancer identification of the relationship of metabolism-related differentially expressed transcription regulation with non-differentially expressed target genes via a gated recurrent unit network. Comput Biol Med 2022; 148:105883. [PMID: 35878490 DOI: 10.1016/j.compbiomed.2022.105883] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/06/2022] [Revised: 07/10/2022] [Accepted: 07/16/2022] [Indexed: 11/20/2022]
Abstract
The transcriptome describes the expression of all genes in a sample. Most studies have investigated the differential patterns or discrimination powers of transcript expression levels. In this study, we hypothesized that the quantitative correlations between the expression levels of transcription factors (TFs) and their regulated target genes (mRNAs) serve as a novel view of healthy status, and that a disease sample exhibits a differential landscape (mqTrans) of transcription regulations compared with healthy status. We formulated quantitative transcription regulation relationships of metabolism-related genes as a multi-input multi-output regression model via a gated recurrent unit (GRU) network. The GRU model was trained using healthy blood transcriptomes, and the expression levels of mRNAs were predicted from those of the TFs. The mqTrans feature of a gene was defined as the difference between its predicted and actual expression levels. A pan-cancer investigation of the differentially expressed mqTrans features was conducted between the early- and late-stage cancers in 26 cancer types of The Cancer Genome Atlas database. This study focused on the differentially expressed mqTrans features that did not show differential expression in the actual expression levels. These genes could not be detected by conventional differential analysis. Such dark biomarkers are worthy of further wet-lab investigation. The experimental data also showed that the proposed mqTrans investigation improved the classification between early- and late-stage samples for some cancer types. Thus, the mqTrans features serve as a complementary view to transcriptomes, an OMIC type with mature high-throughput production technologies and abundant public resources.
3
Zhang L, Yang Y, Chai L, Li Q, Liu J, Lin H, Liu L. A deep learning model to identify gene expression level using cobinding transcription factor signals. Brief Bioinform 2021; 23:6447678. [PMID: 34864886 DOI: 10.1093/bib/bbab501] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/23/2021] [Revised: 10/13/2021] [Accepted: 11/01/2021] [Indexed: 01/02/2023] Open
Abstract
Gene expression is directly controlled by transcription factors (TFs) in a complex combinatorial manner. It remains a challenging task to systematically infer how the cooperative binding of TFs drives gene activity. Here, we quantitatively analyzed the correlation between TFs and surveyed the TF interaction networks associated with gene expression in GM12878 and K562 cell lines. We identified six TF modules associated with gene expression in each cell line. Furthermore, according to the enrichment characteristics of TFs in these TF modules around a target gene, a convolutional neural network model, called TFCNN, was constructed to identify gene expression level. Results showed that the TFCNN model achieved a good prediction performance for gene expression. The average area under the receiver operating characteristic curve (AUC) reached 0.975 and 0.976 in the GM12878 and K562 cell lines, respectively. By comparison, we found that the TFCNN model outperformed the prediction models based on SVM and LDA. This is because the TFCNN model could better extract the combinatorial interactions among TFs. Further analysis indicated that the abundant binding of regulatory TFs dominates expression of target genes, while the cooperative interaction between TFs has subtle regulatory effects. In addition, gene expression could be regulated by different TF combinations in a nonlinear way. These results are helpful for deciphering the mechanism by which TF combinations regulate gene expression.
Affiliation(s)
- Lirong Zhang
  - School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yanchao Yang
  - School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Lu Chai
  - School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Qianzhong Li
  - School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Junjie Liu
  - School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Hao Lin
  - School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Li Liu
  - School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
4
Zhang LQ, Liu JJ, Liu L, Fan GL, Li YN, Li QZ. The impact of gene-body H3K36me3 patterns on gene expression level changes in chronic myelogenous leukemia. Gene 2021; 802:145862. [PMID: 34352296 DOI: 10.1016/j.gene.2021.145862] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/01/2021] [Revised: 07/07/2021] [Accepted: 07/30/2021] [Indexed: 11/29/2022]
Abstract
Chronic myelogenous leukemia (CML) is a malignant clonal disease of hematopoietic stem cells. Studies have shown that the progression of CML is related to histone modifications. Here, we perform a systematic analysis of H3K36me3 patterns and gene expression level changes. We observe that the genes with higher gene-body H3K36me3 levels in normal cells show fewer expression changes during leukemogenesis, while the genes with lower gene-body H3K36me3 levels in normal cells yield obvious expression changes during leukemogenesis (ρ = -0.98, P = 9.30 × 10⁻⁸). These findings are conserved in human lung/breast cancers and mouse CML, regardless of gene expression levels and gene lengths. Regulatory element analysis and Random Forest regression show that Hoxd13, Rara, Scl, Smad3, Smad4 and Tgif1 induce the up-regulation of genes with lower H3K36me3 levels (ρ = 0.97, P = 2.35 × 10⁻⁵⁶). Enrichment analysis shows that the differentially expressed genes with lower H3K36me3 levels are involved in leukemia-related pathways, such as leukocyte migration and regulation of leukocyte activation. Finally, six driver genes (Tp53, Wt1, Dnmt3a, Cacna1b, Phactr1 and Gbp4) with lower H3K36me3 levels are identified. Our analyses indicate that lower gene-body H3K36me3 levels may serve as a biomarker for the progression of CML.
Affiliation(s)
- Lu-Qiang Zhang
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Jun-Jie Liu
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Li Liu
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Guo-Liang Fan
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Yan-Nan Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
- Qian-Zhong Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
  - The Research Center for Laboratory Animal Science, College of Life Sciences, Inner Mongolia University, Hohhot 010021, China
5
Zhang Y, Cai Y, Roca X, Kwoh CK, Fullwood MJ. Chromatin loop anchors predict transcript and exon usage. Brief Bioinform 2021; 22:6319936. [PMID: 34263910 PMCID: PMC8575016 DOI: 10.1093/bib/bbab254] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/01/2021] [Revised: 06/16/2021] [Accepted: 05/25/2021] [Indexed: 11/24/2022] Open
Abstract
Epigenomics and transcriptomics data from high-throughput sequencing techniques such as RNA-seq and ChIP-seq have been successfully applied in predicting gene transcript expression. However, the locations of chromatin loops in the genome identified by techniques such as Chromatin Interaction Analysis with Paired End Tag sequencing (ChIA-PET) have never been used for prediction tasks. Here, we developed machine learning models to investigate whether ChIA-PET could contribute to transcript and exon usage prediction. In doing so, we used a large set of transcription factors as well as ChIA-PET data. We developed different Gradient Boosting Trees models according to the different tasks with the integrated datasets from three cell lines: GM12878, HeLaS3 and K562. We validated the models via 10-fold cross validation, chromosome-split validation and cross-cell validation. Our results show that both transcript and splicing-derived exon usage can be effectively predicted with accuracies of at least 0.7512 and 0.7459, respectively, on all cell lines across all validation schemes. Examining the predictive features, we found that RNA Polymerase II ChIA-PET was one of the most important features in both transcript and exon usage prediction, suggesting that chromatin loop anchors are predictive of both transcript and exon usage.
Affiliation(s)
- Yu Zhang
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
- Yichao Cai
  - Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Dr, Singapore 117599, Singapore
- Xavier Roca
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Dr, Singapore 637551, Singapore
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
- Melissa Jane Fullwood
  - Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Dr, Singapore 117599, Singapore
  - School of Biological Sciences, Nanyang Technological University, 637551, Singapore
  - Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), 61 Biopolis Dr, Singapore 138673, Singapore
6
MACMIC Reveals A Dual Role of CTCF in Epigenetic Regulation of Cell Identity Genes. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:140-153. [PMID: 33677108 PMCID: PMC8498966 DOI: 10.1016/j.gpb.2020.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2020] [Revised: 08/28/2020] [Accepted: 11/17/2020] [Indexed: 11/23/2022]
Abstract
Numerous studies of the relationships between epigenomic features have focused on their strong correlation across the genome, likely because such relationships can be easily identified by many established methods for correlation analysis. However, two features with little correlation may still colocalize at many genomic sites to implement important functions. There is no bioinformatic tool for researchers to specifically identify such feature pairs. Here, we develop a method to identify feature pairs in which two features have maximal colocalization minimal correlation (MACMIC) across the genome. By MACMIC analysis of 3306 feature pairs in 16 human cell types, we reveal a dual role of CCCTC-binding factor (CTCF) in epigenetic regulation of cell identity genes. Although super-enhancers are associated with activation of target genes, only a subset of super-enhancers colocalized with CTCF regulate cell identity genes. At super-enhancers colocalized with CTCF, CTCF is required for the active marker H3K27ac in cell types requiring the activation, and also required for the repressive marker H3K27me3 in other cell types requiring repression. Our work demonstrates the biological utility of the MACMIC analysis and reveals a key role for CTCF in epigenetic regulation of cell identity. The code for MACMIC is available at https://github.com/bxia888/MACMIC.
7
Wang H, Liu Y, Guan H, Fan GL. The Regulation of Target Genes by Co-occupancy of Transcription Factors, c-Myc and Mxi1 with Max in the Mouse Cell Line. Curr Bioinform 2020. [DOI: 10.2174/1574893614666191106103633] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/09/2023]
Abstract
Background: The regulatory function of transcription factors on genes is not only related to the location of binding genes and its related functions, but is also related to the methods of binding.
Objective: It is necessary to study the regulation effects in different binding methods on target genes.
Methods: In this study, we provided a reliable theoretical basis for studying gene expression regulation of co-binding transcription factors and further revealed the specific regulation of transcription factor co-binding in cancer cells.
Results: Transcription factors tend to combine with other transcription factors in the regulatory region to form a competitive or synergistic relationship to regulate target genes accurately.
Conclusion: We found that up-regulated genes in cancer cells were involved in the regulation of their own immune system related to the normal cells.
Affiliation(s)
- Hui Wang
  - Department of Physics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Yuan Liu
  - Department of Physics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Hua Guan
  - ENT Department, Huhhot First Hospital, Hohhot, China
- Guo-Liang Fan
  - Department of Physics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
8
Jin W, Li QZ, Zuo YC, Cao YN, Zhang LQ, Hou R, Su WX. Relationship Between DNA Methylation in Key Region and the Differential Expressions of Genes in Human Breast Tumor Tissue. DNA Cell Biol 2018; 38:49-62. [PMID: 30346835 DOI: 10.1089/dna.2018.4276] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 01/27/2023] Open
Abstract
Breast cancer has a high mortality rate for females. Aberrant DNA methylation plays a crucial role in the occurrence and progression of breast carcinoma. By comparing DNA methylation differences between breast tumor tissue and normal breast tissue, we calculate and analyze the distributions of the hyper- and hypomethylation sites in different functional regions. Results indicate that enhancer regions are often hypomethylated in breast cancer. CpG islands (CGIs) are mainly hypermethylated, while the regions flanking CGIs (shores and shelves) are more easily hypomethylated. The hypomethylation in gene body region is related to the upregulation of gene expression, and the hypomethylation of enhancer regions is closely associated with gene expression upregulation in breast cancer. Some key hypomethylation sites in enhancer regions and key hypermethylation sites in CGIs for regulating key genes are identified, such as oncogenes ESR1 and ERBB2 and tumor suppressor genes FBLN2, CEBPA, and FAT4. This suggests that recognizing the methylation status of these genes will be useful for the diagnosis of breast cancer.
Affiliation(s)
- Wen Jin
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Qian-Zhong Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Inner Mongolia University, Hohhot, China
- Yong-Chun Zuo
  - The State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Inner Mongolia University, Hohhot, China
- Yan-Ni Cao
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Lu-Qiang Zhang
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Rui Hou
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Wen-Xia Su
  - College of Science, Inner Mongolia Agricultural University, Hohhot, China
9
Genome-wide analysis of H3K36me3 and its regulations to cancer-related genes expression in human cell lines. Biosystems 2018; 171:59-65. [DOI: 10.1016/j.biosystems.2018.07.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 03/06/2018] [Revised: 07/01/2018] [Accepted: 07/09/2018] [Indexed: 01/11/2023]
10
Chen D, Fu LY, Hu D, Klukas C, Chen M, Kaufmann K. The HTPmod Shiny application enables modeling and visualization of large-scale biological data. Commun Biol 2018; 1:89. [PMID: 30271970 PMCID: PMC6123733 DOI: 10.1038/s42003-018-0091-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/26/2018] [Accepted: 06/03/2018] [Indexed: 01/20/2023] Open
Abstract
The wave of high-throughput technologies in genomics and phenomics is enabling data to be generated on an unprecedented scale and at a reasonable cost. Exploring the large-scale data sets generated by these technologies to derive biological insights requires efficient bioinformatic tools. Here we introduce an interactive, open-source web application (HTPmod) for high-throughput biological data modeling and visualization. HTPmod is implemented with the Shiny framework by integrating the computational power and professional visualization of R and including various machine-learning approaches. We demonstrate that HTPmod can be used for modeling and visualizing large-scale, high-dimensional data sets (such as multiple omics data) under a broad context. By reinvestigating example data sets from recent studies, we find not only that HTPmod can reproduce results from the original studies in a straightforward fashion and within a reasonable time, but also that novel insights may be gained from fast reinvestigation of existing data by HTPmod. Dijun Chen et al. present HTPmod, a Shiny web application for modeling and visualization of large-scale genomic and phenomic datasets. The authors show that HTPmod can quickly reproduce analyses of high-throughput biological datasets and produce publication-quality figures.
Affiliation(s)
- Dijun Chen
  - Department for Plant Cell and Molecular Biology, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, 10115, Germany
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, Gatersleben, 06466, Germany
- Liang-Yu Fu
  - Department for Plant Cell and Molecular Biology, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, 10115, Germany
- Dahui Hu
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Christian Klukas
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, Gatersleben, 06466, Germany
  - Digitalization in Research & Development (ROM), BASF SE, Ludwigshafen am Rhein, 67056, Germany
- Ming Chen
  - Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Kerstin Kaufmann
  - Department for Plant Cell and Molecular Biology, Institute for Biology, Humboldt-Universität zu Berlin, Berlin, 10115, Germany
11
Zhang LQ, Li QZ. Estimating the effects of transcription factors binding and histone modifications on gene expression levels in human cells. Oncotarget 2018; 8:40090-40103. [PMID: 28454114 PMCID: PMC5522221 DOI: 10.18632/oncotarget.16988] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 12/15/2016] [Accepted: 03/11/2017] [Indexed: 12/22/2022] Open
Abstract
Transcription factors and histone modifications are vital for the regulation of gene expression. Hence, to estimate the effects of transcription factor binding and histone modifications on gene expression, we construct a statistical model from genome-wide binding data for 15 transcription factors, profiles of 10 histone modifications and DNase-I hypersensitivity data in three mammalian cell lines. Remarkably, our results show that POLR2A and H3K36me3 can highly and consistently predict gene expression in the three cell lines. In addition, H3K4me3, H3K27me3 and H3K9ac are more reliable predictors than other histone modifications in human embryonic stem cells. Moreover, genome-wide statistical redundancies exist within and between transcription factors and histone modifications, and these phenomena may be caused by the regulation mechanism. In further study, we find that even though transcription factors and histone modifications offer similar effects on expression levels of genome-wide genes, the effects of transcription factors and histone modifications on predictive abilities are different for genes in independent biological processes.
Affiliation(s)
- Lu-Qiang Zhang
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China
- Qian-Zhong Li
  - Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot, China