1
ChIP-GSM: Inferring active transcription factor modules to predict functional regulatory elements. PLoS Comput Biol 2021; 17:e1009203. [PMID: 34292930 PMCID: PMC8330942 DOI: 10.1371/journal.pcbi.1009203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2020] [Revised: 08/03/2021] [Accepted: 06/20/2021] [Indexed: 11/19/2022] Open
Abstract
Transcription factors (TFs) often function as a module including both master factors and mediators binding at cis-regulatory regions to modulate nearby gene transcription. ChIP-seq profiling of multiple TFs makes it feasible to infer functional TF modules. However, when inferring TF modules based on co-localization of ChIP-seq peaks, often many weak binding events are missed, especially for mediators, resulting in incomplete identification of modules. To address this problem, we develop a ChIP-seq data-driven Gibbs Sampler to infer Modules (ChIP-GSM) using a Bayesian framework that integrates ChIP-seq profiles of multiple TFs. ChIP-GSM samples read counts of module TFs iteratively to estimate the binding potential of a module to each region and, across all regions, estimates the module abundance. Using inferred module-region probabilistic bindings as feature units, ChIP-GSM then employs logistic regression to predict active regulatory elements. Validation of ChIP-GSM predicted regulatory regions on multiple independent datasets sharing the same context confirms the advantage of using TF modules for predicting regulatory activity. In a case study of K562 cells, we demonstrate that the ChIP-GSM inferred modules form as groups, activate gene expression at different time points, and mediate diverse functional cellular processes. Hence, ChIP-GSM infers biologically meaningful TF modules and improves the prediction accuracy of regulatory region activities.
2
Wang YXR, Li L, Li JJ, Huang H. Network Modeling in Biology: Statistical Methods for Gene and Brain Networks. Stat Sci 2021; 36:89-108. [PMID: 34305304 PMCID: PMC8296984 DOI: 10.1214/20-sts792] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
The rise of network data in many different domains has offered researchers new insight into the problem of modeling complex systems and propelled the development of numerous innovative statistical methodologies and computational tools. In this paper, we primarily focus on two types of biological networks, gene networks and brain networks, where statistical network modeling has found both fruitful and challenging applications. Unlike other network examples such as social networks where network edges can be directly observed, both gene and brain networks require careful estimation of edges using covariates as a first step. We provide a discussion on existing statistical and computational methods for edge estimation and subsequent statistical inference problems in these two types of biological networks.
Affiliation(s)
- Y X Rachel Wang
  - School of Mathematics and Statistics, University of Sydney, Australia
- Lexin Li
  - Department of Biostatistics and Epidemiology, School of Public Health, University of California, Berkeley
- Haiyan Huang
  - Department of Statistics, University of California, Berkeley
3
3Scover: Identifying Safeguard TF from Cell Type-TF Specificity Network by an Extended Minimum Set Cover Model. iScience 2020; 23:101227. [PMID: 32554189 PMCID: PMC7303665 DOI: 10.1016/j.isci.2020.101227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2020] [Revised: 04/27/2020] [Accepted: 05/28/2020] [Indexed: 11/22/2022] Open
Abstract
Transcription factors (TFs) define cellular identity either by activating target cell program or by silencing donor program as demonstrated by intensive cell reprogramming studies. Here, we propose an extended minimum set cover model with stable selection (3Scover) to systematically identify silencing TFs, named safeguard TFs, from omics data. First, a cell type-TF specificity network is constructed to systematically link cell types with their specifically expressed TFs. Then we search the minimum TF set to cover this network with “many but one specificity” characteristic and integrate many subsampling models for a stable solution. 3Scover identified 30 safeguard TFs in human and mouse. These safeguard TFs are significantly enriched in the experimentally discovered reprogramming panel with their protein-protein interactors. In addition, they tend to interact closely with chromatin regulators, negatively regulate transcription, and function earlier in development. Collectively, 3Scover allows us to probe master TFs and combinatorial regulation in controlling cell identity.
Highlights
- Cell type-TF specificity networks reveal the relationships among TF and cell identity
- 3Scover extracts safeguard TFs by “many but one specificity” and parsimony principle
- Safeguard TFs are enriched in reprogramming panel and interact closely with CR
- Safeguard TFs are conserved in mouse and human
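As a rough illustration of the minimum set cover idea that the abstract builds on, the sketch below applies the classic greedy approximation to a toy cell type-TF network. It is not the 3Scover model: the "many but one specificity" constraint and the subsampling-based stable selection are not implemented, and the TF names, cell types, and function name are hypothetical.

```python
def greedy_set_cover(universe, subsets):
    """Classic greedy approximation to minimum set cover.

    universe: set of items to cover (here, cell types).
    subsets:  dict mapping a name (here, a TF) to the set of items it covers.
    Returns the list of chosen names in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Pick the TF covering the most still-uncovered cell types;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(subsets), key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("the given subsets cannot cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical toy network: which cell types each TF is specific to.
tf_to_cell_types = {
    "TF_A": {"neuron", "astrocyte", "hepatocyte"},
    "TF_B": {"hepatocyte", "fibroblast"},
    "TF_C": {"fibroblast"},
    "TF_D": {"neuron"},
}
cell_types = {"neuron", "astrocyte", "hepatocyte", "fibroblast"}

cover = greedy_set_cover(cell_types, tf_to_cell_types)
print(cover)  # ['TF_A', 'TF_B']
```

The greedy heuristic does not guarantee a minimum cover in general; an exact formulation would typically use integer programming instead.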
4
Wong KC, Yan S, Lin Q, Li X, Peng C. Deleterious Non-Synonymous Single Nucleotide Polymorphism Predictions on Human Transcription Factors. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:327-333. [PMID: 30475727 DOI: 10.1109/tcbb.2018.2882548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Transcription factors (TFs) are the major components of human gene regulation. In particular, they bind onto specific DNA sequences and regulate neighborhood genes in different tissues at different developmental stages. Non-synonymous single nucleotide polymorphisms on their protein-coding sequences could result in undesired consequences in humans. Therefore, it is necessary to develop methods for predicting any abnormality among those non-synonymous single nucleotide polymorphisms. To address it, we have developed and compared different strategies to predict deleterious non-synonymous single nucleotide polymorphisms (also known as missense mutations) on the protein-coding sequences of human TFs. Taking advantage of evolutionary conservation signals, we have developed and compared different classifiers with different feature sets as computed from different evolutionarily related sequence collections. The results indicate that the classic ensemble algorithm, Adaboost with decision stumps, with orthologous sequence collection, has performed the best (namely, TFmedic). We have further compared TFmedic with other state-of-the-art methods (i.e., PolyPhen-2 and SIFT) on PolyPhen-2's own datasets, demonstrating that TFmedic can outperform the others. As applications, we have further applied TFmedic to all possible missense mutations on all human transcription factors; the proteome-wide results reveal interesting insights, consistent with the existing physicochemical knowledge. A case study with the actual 3D structure is conducted, revealing how TFmedic can contribute to protein-DNA binding complex studies.
5
Ho SY, Chang BH, Chung CH, Lin YL, Chuang CH, Hsieh PJ, Huang WC, Tsai NM, Huang SC, Liu YK, Lo YC, Liao KW. Development of a computational promoter with highly efficient expression in tumors. BMC Cancer 2018; 18:480. [PMID: 29703163 PMCID: PMC5924487 DOI: 10.1186/s12885-018-4421-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/20/2017] [Accepted: 04/22/2018] [Indexed: 11/25/2022] Open
Abstract
Background Gene therapy is a potent method to increase the therapeutic efficacy against cancer. However, a gene that is specifically expressed in the tumor area has not been identified. In addition, nonspecific expression of therapeutic genes in normal tissues may cause side effects that can harm the patients’ health. Certain promoters have been reported to drive therapeutic gene expression specifically in cancer cells; however, low expression levels of the target gene are a problem for providing good therapeutic efficacy. Therefore, a specific and highly expressive promoter is needed for cancer gene therapy. Methods Bioinformatics approaches were utilized to analyze transcription factors (TFs) from high-throughput data. Reverse transcription polymerase chain reaction, western blotting and cell transfection were applied for the measurement of mRNA, protein expression and activity. C57BL/6JNarl mice were injected with pD5-hrGFP to evaluate the expression of TFs. Results We analyzed bioinformatics data and identified three TFs, nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB), cyclic AMP response element binding protein (CREB), and hypoxia-inducible factor-1α (HIF-1α), that are highly active in tumor cells. Here, we constructed a novel mini-promoter, D5, that is composed of the binding sites of the three TFs. The results show that the D5 promoter specifically drives therapeutic gene expression in tumor tissues and that the strength of the D5 promoter is directly proportional to tumor size. Conclusions Our results show that bioinformatics may be a good tool for the selection of appropriate TFs and for the design of specific mini-promoters to improve cancer gene therapy. Electronic supplementary material The online version of this article (10.1186/s12885-018-4421-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Shu-Yi Ho
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Bo-Hau Chang
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Chen-Han Chung
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 30050, Taiwan, Republic of China
- Yu-Ling Lin
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
  - Center for Bioinformatics Research, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Cheng-Hsun Chuang
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 30050, Taiwan, Republic of China
- Pei-Jung Hsieh
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
- Wei-Chih Huang
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan, Republic of China
- Nu-Man Tsai
  - School of Medical and Laboratory Biotechnology, Chung Shan Medical University, Taichung, Taiwan, Republic of China
  - Clinical Laboratory, Chung Shan Medical University Hospital, Taichung, Taiwan
- Sheng-Chieh Huang
  - Department of Surgery, National Yang Ming University, Taipei, Taiwan, Republic of China
  - Division of Colon and Rectal surgery, Department of surgery, Taipei Veteran General Hospital, Taipei, Taiwan, Republic of China
- Yen-Ku Liu
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 30050, Taiwan, Republic of China
- Yu-Chih Lo
  - Department of Biotechnology and Bioindustry Sciences, College of Bioscience and Biotechnology, National Cheng Kung University, Tainan, Taiwan, Republic of China
- Kuang-Wen Liao
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 30050, Taiwan, Republic of China
  - College of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan, Republic of China
  - Graduate Institute of Medicine, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan, Republic of China
6
Natkańska U, Skoneczna A, Sieńko M, Skoneczny M. The budding yeast orthologue of Parkinson's disease-associated DJ-1 is a multi-stress response protein protecting cells against toxic glycolytic products. BIOCHIMICA ET BIOPHYSICA ACTA-MOLECULAR CELL RESEARCH 2017; 1864:39-50. [DOI: 10.1016/j.bbamcr.2016.10.016] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/29/2016] [Revised: 10/20/2016] [Accepted: 10/25/2016] [Indexed: 12/13/2022]
7
Wu WS, Lai FJ. Detecting Cooperativity between Transcription Factors Based on Functional Coherence and Similarity of Their Target Gene Sets. PLoS One 2016; 11:e0162931. [PMID: 27623007 PMCID: PMC5021274 DOI: 10.1371/journal.pone.0162931] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/08/2016] [Accepted: 08/30/2016] [Indexed: 11/22/2022] Open
Abstract
In eukaryotic cells, transcriptional regulation of gene expression is usually achieved by cooperative transcription factors (TFs). Therefore, knowing cooperative TFs is the first step toward uncovering the molecular mechanisms of gene expression regulation. Many algorithms based on different rationales have been proposed to predict cooperative TF pairs in yeast. Although various types of rationales have been used in the existing algorithms, functional coherence is not yet used. This prompts us to develop a new algorithm based on functional coherence and similarity of the target gene sets to identify cooperative TF pairs in yeast. The proposed algorithm predicted 40 cooperative TF pairs. Among them, three (Pdc2-Thi2, Hot1-Msn1 and Leu3-Met28) are novel predictions, which have not been predicted by any existing algorithms. Strikingly, two (Pdc2-Thi2 and Hot1-Msn1) of the three novel predictions have been experimentally validated, demonstrating the power of the proposed algorithm. Moreover, we show that the predictions of the proposed algorithm are more biologically meaningful than the predictions of 17 existing algorithms under four evaluation indices. In summary, our study suggests that new algorithms based on novel rationales are worthy of developing for detecting previously unidentifiable cooperative TF pairs.
Affiliation(s)
- Wei-Sheng Wu
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Fu-Jou Lai
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
8
Wang J, Liu Q, Sun J, Shyr Y. Disrupted cooperation between transcription factors across diverse cancer types. BMC Genomics 2016; 17:560. [PMID: 27496222 PMCID: PMC4975902 DOI: 10.1186/s12864-016-2842-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/20/2015] [Accepted: 06/15/2016] [Indexed: 12/21/2022] Open
Abstract
Background Transcription Factors (TFs), essential for many cellular processes, generally work coordinately to induce transcriptional change in response to internal and external signals. Disrupted cooperation between TFs, leading to dysregulation of target genes, contributes to the pathogenesis of many diseases, including cancer. Although the aberrant activation of individual TFs and the functional effects have been widely studied, the perturbation of TF cooperativity in cancer has rarely been explored. Results We used TF co-expression as a proxy for cooperativity and performed a large-scale study on disrupted TF cooperation across seven cancer types. While the connectivity of downstream effectors, like metabolic genes and TF targets, was disrupted more than or similarly to that of non-TFs, the cooperativity of TFs (upstream regulators) was consistently less disturbed in all studied cancer types. Highly coordinated TFs in normal, however, generally lost that cooperation in cancer. Although different types of cancer shared very few TF pairs with highly disrupted cooperation, the cooperativity of interferon regulatory factors (IRF) was highly disrupted in six cancer types. Specifically, the cooperativity of IRF8 was highly perturbed in lung cancer, which was further validated by two independent lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD) datasets. More interestingly, the cooperativity of IRF8 was markedly associated with tumor progression and even contributed to the patient survival independent of tumor stage. Conclusions Our findings underscore the far more important role of TF cooperativity in tumorigenesis than previously appreciated. Disrupted cooperation of TFs provides potential clinical utility as prognostic markers for predicting the patient survival. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2842-8) contains supplementary material, which is available to authorized users.
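To make the co-expression proxy concrete, the sketch below scores a TF pair by the change in Pearson correlation between normal and tumor expression profiles. The abstract does not state which correlation measure or disruption score the authors used, so the choice of Pearson correlation, the simple difference, the function names, and the toy expression values are all assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cooperation_change(normal, tumor, tf1, tf2):
    """Tumor co-expression minus normal co-expression for one TF pair.

    normal, tumor: dicts mapping a TF name to its expression values
    across samples (hypothetical input layout).
    A strongly negative value means co-expression seen in normal
    samples is lost or reversed in tumor samples.
    """
    return pearson(tumor[tf1], tumor[tf2]) - pearson(normal[tf1], normal[tf2])


# Hypothetical toy data: the pair is perfectly co-expressed in normal
# samples and perfectly anti-correlated in tumor samples.
normal_expr = {"TF1": [1, 2, 3, 4], "TF2": [2, 4, 6, 8]}
tumor_expr = {"TF1": [1, 2, 3, 4], "TF2": [8, 6, 4, 2]}

change = cooperation_change(normal_expr, tumor_expr, "TF1", "TF2")
print(change)
```

In practice, a rank-based correlation or a significance test on the difference may be preferable for noisy expression data.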
Affiliation(s)
- Jing Wang
  - Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, TN, USA
- Qi Liu
  - Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Jingchun Sun
  - School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Yu Shyr
  - Center for Quantitative Sciences, Vanderbilt University School of Medicine, Nashville, TN, USA
  - Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
  - Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, TN, USA
9
Wu WS, Hsieh YC, Lai FJ. YCRD: Yeast Combinatorial Regulation Database. PLoS One 2016; 11:e0159213. [PMID: 27392072 PMCID: PMC4938206 DOI: 10.1371/journal.pone.0159213] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/12/2016] [Accepted: 06/28/2016] [Indexed: 12/21/2022] Open
Abstract
In eukaryotes, the precise transcriptional control of gene expression is typically achieved through combinatorial regulation using cooperative transcription factors (TFs). Therefore, a database which provides regulatory associations between cooperative TFs and their target genes is helpful for biologists to study the molecular mechanisms of transcriptional regulation of gene expression. Because no such database exists in the public domain, we constructed a database, called Yeast Combinatorial Regulation Database (YCRD), which deposits 434,197 regulatory associations between 2535 cooperative TF pairs and 6243 genes. The comprehensive collection of more than 2500 cooperative TF pairs was retrieved from 17 existing algorithms in the literature. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where a TF’s experimentally validated target genes were downloaded from the YEASTRACT database. In YCRD, users can (i) search the target genes of a cooperative TF pair of interest, (ii) search the cooperative TF pairs which regulate a gene of interest and (iii) identify important cooperative TF pairs which regulate a given set of genes. We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression. YCRD is available at http://cosbi.ee.ncku.edu.tw/YCRD/ or http://cosbi2.ee.ncku.edu.tw/YCRD/.
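The definition of a TF pair's targets as the common targets of both TFs, and the first two query types listed above, can be sketched with plain set operations. This is an illustrative sketch only, not YCRD's implementation or interface; the function names, TF names, and gene names are hypothetical.

```python
def common_targets(tf_targets, tf1, tf2):
    """Targets of a cooperative TF pair: genes targeted by both TFs."""
    return tf_targets[tf1] & tf_targets[tf2]


def pairs_regulating(tf_targets, pairs, gene):
    """Cooperative TF pairs whose common target set contains the gene."""
    return [pair for pair in pairs if gene in common_targets(tf_targets, *pair)]


# Hypothetical toy data: per-TF target gene sets and a list of TF pairs.
tf_targets = {
    "TF1": {"geneA", "geneB", "geneC"},
    "TF2": {"geneB", "geneC", "geneD"},
    "TF3": {"geneC", "geneE"},
}
tf_pairs = [("TF1", "TF2"), ("TF1", "TF3"), ("TF2", "TF3")]

pair_targets = common_targets(tf_targets, "TF1", "TF2")
gene_b_pairs = pairs_regulating(tf_targets, tf_pairs, "geneB")
print(sorted(pair_targets))  # ['geneB', 'geneC']
print(gene_b_pairs)          # [('TF1', 'TF2')]
```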
Affiliation(s)
- Wei-Sheng Wu
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Yen-Chen Hsieh
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Fu-Jou Lai
  - Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
10
Wu WS, Lai FJ, Tu BW, Chang DTH. CoopTFD: a repository for predicted yeast cooperative transcription factor pairs. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2016; 2016:baw092. [PMID: 27242036 PMCID: PMC4885606 DOI: 10.1093/database/baw092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/30/2016] [Accepted: 05/09/2016] [Indexed: 01/22/2023]
Abstract
In eukaryotic cells, transcriptional regulation of gene expression is usually accomplished by cooperative Transcription Factors (TFs). Therefore, knowing cooperative TFs is helpful for uncovering the mechanisms of transcriptional regulation. In yeast, many cooperative TF pairs have been predicted by various algorithms in the literature. However, until now, there is still no database which collects the predicted yeast cooperative TFs from existing algorithms. This prompts us to construct Cooperative Transcription Factors Database (CoopTFD), which has a comprehensive collection of 2622 predicted cooperative TF pairs (PCTFPs) in yeast from 17 existing algorithms. For each PCTFP, our database also provides five types of validation information: (i) the algorithms which predict this PCTFP, (ii) the publications which experimentally show that this PCTFP has physical or genetic interactions, (iii) the publications which experimentally study the biological roles of both TFs of this PCTFP, (iv) the common Gene Ontology (GO) terms of this PCTFP and (v) the common target genes of this PCTFP. Based on the provided validation information, users can judge the biological plausibility of a PCTFP of interest. We believe that CoopTFD will be a valuable resource for yeast biologists to study the combinatorial regulation of gene expression controlled by cooperative TFs. Database URL: http://cosbi.ee.ncku.edu.tw/CoopTFD/ or http://cosbi2.ee.ncku.edu.tw/CoopTFD/
Affiliation(s)
- Wei-Sheng Wu
  - Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Fu-Jou Lai
  - Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Bor-Wen Tu
  - Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Darby Tien-Hao Chang
  - Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
11
Abstract
The laboratory mouse is the primary mammalian species used for studying alternative splicing events. Recent studies have generated computational models to predict functions for splice isoforms in the mouse. However, the functional relationship network, describing the probability of splice isoforms participating in the same biological process or pathway, has not yet been studied in the mouse. Here we describe a rich genome-wide resource of mouse networks at the isoform level, which was generated using a unique framework that was originally developed to infer isoform functions. This network was built through integrating heterogeneous genomic and protein data, including RNA-seq, exon array, protein docking and pseudo-amino acid composition. Through simulation and cross-validation studies, we demonstrated the accuracy of the algorithm in predicting isoform-level functional relationships. We showed that this network enables the users to reveal functional differences of the isoforms of the same gene, as illustrated by literature evidence with Anxa6 (annexin a6) as an example. We expect this work will become a useful resource for the mouse genetics community to understand gene functions. The network is publicly available at: http://guanlab.ccmb.med.umich.edu/isoformnetwork.
12
Lai FJ, Chang HT, Wu WS. PCTFPeval: a web tool for benchmarking newly developed algorithms for predicting cooperative transcription factor pairs in yeast. BMC Bioinformatics 2015; 16 Suppl 18:S2. [PMID: 26677932 PMCID: PMC4682397 DOI: 10.1186/1471-2105-16-s18-s2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023] Open
Abstract
Background Computational identification of cooperative transcription factor (TF) pairs helps understand the combinatorial regulation of gene expression in eukaryotic cells. Many advanced algorithms have been proposed to predict cooperative TF pairs in yeast. However, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of lacking sufficient performance indices and adequate overall performance scores. To solve this problem, in our previous study (published in BMC Systems Biology 2014), we adopted/proposed eight performance indices and designed two overall performance scores to compare the performance of 14 existing algorithms for predicting cooperative TF pairs in yeast. Most importantly, our performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. However, to use our framework, researchers have to put a lot of effort to construct it first. To save researchers time and effort, here we develop a web tool to implement our performance comparison framework, featuring fast data processing, a comprehensive performance comparison and an easy-to-use web interface. Results The developed tool is called PCTFPeval (Predicted Cooperative TF Pair evaluator), written in PHP and Python programming languages. The friendly web interface allows users to input a list of predicted cooperative TF pairs from their algorithm and select (i) the compared algorithms among the 15 existing algorithms, (ii) the performance indices among the eight existing indices, and (iii) the overall performance scores from two possible choices. The comprehensive performance comparison results are then generated in tens of seconds and shown as both bar charts and tables. The original comparison results of each compared algorithm and each selected performance index can be downloaded as text files for further analyses. 
Conclusions Allowing users to select eight existing performance indices and 15 existing algorithms for comparison, our web tool benefits researchers who are eager to comprehensively and objectively evaluate the performance of their newly developed algorithm. Thus, our tool greatly expedites the progress in the research of computational identification of cooperative TF pairs.
13
Wu WS, Lai FJ. Properly defining the targets of a transcription factor significantly improves the computational identification of cooperative transcription factor pairs in yeast. BMC Genomics 2015; 16 Suppl 12:S10. [PMID: 26679776 PMCID: PMC4682405 DOI: 10.1186/1471-2164-16-s12-s10] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022] Open
Abstract
Background Transcriptional regulation of gene expression in eukaryotes is usually accomplished by cooperative transcription factors (TFs). Computational identification of cooperative TF pairs has become a hot research topic and many algorithms have been proposed in the literature. A typical algorithm for predicting cooperative TF pairs has two steps. (Step 1) Define the targets of each TF under study. (Step 2) Design a measure for calculating the cooperativity of a TF pair based on the targets of these two TFs. While different algorithms have distinct sophisticated cooperativity measures, the targets of a TF are usually defined using ChIP-chip data. However, there is an inherent weakness in using ChIP-chip data to define the targets of a TF. ChIP-chip analysis can only identify the binding targets of a TF but it cannot distinguish the true regulatory from the binding but non-regulatory targets of a TF. Results This work is the first study which aims to investigate whether the performance of computational identification of cooperative TF pairs could be improved by using a more biologically relevant way to define the targets of a TF. For this purpose, we propose four simple algorithms, all of which consist of two steps. (Step 1) Define the targets of a TF using (i) ChIP-chip data in the first algorithm, (ii) TF binding data in the second algorithm, (iii) TF perturbation data in the third algorithm, and (iv) the intersection of TF binding and TF perturbation data in the fourth algorithm. Compared with the first three algorithms, the fourth algorithm uses a more biologically relevant way to define the targets of a TF. (Step 2) Measure the cooperativity of a TF pair by the statistical significance of the overlap of the targets of these two TFs using the hypergeometric test. By adopting four existing performance indices, we show that the fourth proposed algorithm (PA4) significantly outperforms the other three proposed algorithms. This suggests that the computational identification of cooperative TF pairs is indeed improved when using a more biologically relevant way to define the targets of a TF. Strikingly, the prediction results of our simple PA4 are more biologically meaningful than those of the 12 existing sophisticated algorithms in the literature, all of which used ChIP-chip data to define the targets of a TF. This suggests that properly defining the targets of a TF may be more important than designing sophisticated cooperativity measures. In addition, our PA4 has the power to predict several experimentally validated cooperative TF pairs, which have not been successfully predicted by any existing algorithms in the literature. Conclusions This study shows that the performance of computational identification of cooperative TF pairs could be improved by using a more biologically relevant way to define the targets of a TF. The main contribution of this study is not to propose another new algorithm but to provide a new perspective on the research of computational identification of cooperative TF pairs. Researchers should put more effort into properly defining the targets of a TF (i.e. Step 1) rather than focusing solely on designing sophisticated cooperativity measures (i.e. Step 2). The lists of TF target genes, the Matlab codes and the prediction results of the four proposed algorithms could be downloaded from our companion website http://cosbi3.ee.ncku.edu.tw/TFI/
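Step 2 above scores a TF pair by the significance of the overlap between the two target sets under the hypergeometric test. A minimal sketch of that upper-tail p-value is given below, using only the Python standard library. It is not the authors' Matlab code; the function name, argument names, and the toy numbers are assumptions, and the choice of gene universe (here passed in as n_genes) is left to the user.

```python
from math import comb


def hypergeom_overlap_pvalue(n_genes, n_targets_a, n_targets_b, n_overlap):
    """Upper-tail hypergeometric p-value for a target-set overlap.

    Probability of seeing at least n_overlap shared targets when
    n_targets_b genes are drawn without replacement from n_genes genes,
    of which n_targets_a are targets of the first TF.
    """
    max_overlap = min(n_targets_a, n_targets_b)
    total = comb(n_genes, n_targets_b)
    tail = sum(
        comb(n_targets_a, k) * comb(n_genes - n_targets_a, n_targets_b - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy example: 10 genes, each TF has 3 targets, all 3 shared.
p_value = hypergeom_overlap_pvalue(n_genes=10, n_targets_a=3, n_targets_b=3, n_overlap=3)
print(p_value)  # 1/120, about 0.0083
```

A smaller p-value indicates an overlap less likely to arise by chance. When many TF pairs are tested, a multiple-testing correction would normally be applied to these p-values.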
14
Predicting the functions of long noncoding RNAs using RNA-seq based on Bayesian network. BIOMED RESEARCH INTERNATIONAL 2015; 2015:839590. [PMID: 25815337 PMCID: PMC4359839 DOI: 10.1155/2015/839590] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/15/2014] [Revised: 02/05/2015] [Accepted: 02/06/2015] [Indexed: 02/01/2023]
Abstract
Long noncoding RNAs (lncRNAs) have been shown to play key roles in various biological processes. However, functions of most lncRNAs are poorly characterized. Here, we present a framework to predict functions of lncRNAs through construction of a regulatory network between lncRNAs and protein-coding genes. Using RNA-seq data, the transcript profiles of lncRNAs and protein-coding genes are constructed. Using the Bayesian network method, a regulatory network, which implies dependency relations between lncRNAs and protein-coding genes, was built. In combination with a protein interaction network, highly connected coding genes linked by a given lncRNA were subsequently used to predict functions of the lncRNA through functional enrichment. Application of our method to prostate RNA-seq data showed that 762 lncRNAs in the constructed regulatory network were assigned functions. We found that lncRNAs are involved in diverse biological processes, such as tissue development or embryo development (e.g., nervous system development and mesoderm development). By comparison with functions inferred using the neighboring gene-based method and functions determined using lncRNA knockdown experiments, our method can provide comparable predicted functions of lncRNAs. Overall, our method can be applied to emerging RNA-seq data, which will help researchers identify complex relations between lncRNAs and coding genes and reveal important functions of lncRNAs.
Collapse
|
15
|
Evolutionary Developmental Biology and the Limits of Philosophical Accounts of Mechanistic Explanation. HISTORY, PHILOSOPHY AND THEORY OF THE LIFE SCIENCES 2015. [DOI: 10.1007/978-94-017-9822-8_7] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 12/22/2022]
|
16
|
Lai FJ, Jhu MH, Chiu CC, Huang YM, Wu WS. Identifying cooperative transcription factors in yeast using multiple data sources. BMC SYSTEMS BIOLOGY 2014; 8 Suppl 5:S2. [PMID: 25559499 PMCID: PMC4305981 DOI: 10.1186/1752-0509-8-s5-s2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
BACKGROUND Transcriptional regulation of gene expression is usually accomplished by multiple interactive transcription factors (TFs). Therefore, it is crucial to understand the precise cooperative interactions among TFs. Various kinds of experimental data including ChIP-chip, TF binding site (TFBS), gene expression, TF knockout and protein-protein interaction data have been used to identify cooperative TF pairs in existing methods. Nucleosome occupancy data have not yet been used for this purpose, although several studies have revealed the association between nucleosomes and TFBSs. RESULTS In this study, we developed a novel method to infer the cooperativity between two TFs by integrating the TF-gene documented regulation, TFBS and nucleosome occupancy data. TF-gene documented regulation and TFBS data were used to determine the target genes of a TF, and the genome-wide nucleosome occupancy data were used to assess the nucleosome occupancy on TFBSs. Our method identifies cooperative TF pairs based on two biologically plausible assumptions. If two TFs cooperate, then (i) they should have a significantly higher number of common target genes than random expectation and (ii) their binding sites (in the promoters of their common target genes) should tend to be co-depleted of nucleosomes in order to make these binding sites simultaneously accessible to TF binding. Each TF pair is given a cooperativity score by our method. The higher the score, the more likely the TF pair is cooperative. Finally, a list of 27 cooperative TF pairs has been predicted by our method. Among these 27 TF pairs, 19 pairs are also predicted by existing methods. The other 8 pairs are novel cooperative TF pairs predicted by our method. The biological relevance of these 8 novel cooperative TF pairs is justified by the existence of protein-protein interactions and co-annotation in the same MIPS functional categories. Moreover, we adopted three performance indices to compare our predictions with 11 existing methods' predictions. We show that our method performs better than these 11 existing methods in identifying cooperative TF pairs in yeast. Finally, the cooperative TF network constructed from the 27 predicted cooperative TF pairs shows that our method has the power to find cooperative TF pairs of different biological processes. CONCLUSION Our method is effective in identifying cooperative TF pairs in yeast. Many of our predictions are validated by the literature, and our method outperforms 11 existing methods. We believe that our study will help biologists to understand the mechanisms of transcriptional regulation in eukaryotic cells.
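Assumption (i) in the abstract above, that cooperating TFs share more target genes than expected by chance, can be illustrated with a hypergeometric tail probability. This is only a minimal sketch: the abstract does not say which test the authors used, and the function name, gene identifiers and genome size below are hypothetical.

```python
from math import comb

def shared_target_pvalue(targets_a, targets_b, n_genes):
    """P(overlap >= observed) if TF B's targets were drawn at random from n_genes genes."""
    a, b = set(targets_a), set(targets_b)
    observed = len(a & b)
    total = comb(n_genes, len(b))
    # Upper tail: random draws of len(b) genes that hit at least `observed` targets of A.
    tail = sum(
        comb(len(a), i) * comb(n_genes - len(a), len(b) - i)
        for i in range(observed, min(len(a), len(b)) + 1)
    )
    return tail / total

# Hypothetical toy data: 10 genes in total, 3 targets shared by both TFs.
targets_a = {"g1", "g2", "g3", "g4", "g5"}
targets_b = {"g3", "g4", "g5", "g6", "g7"}
print(shared_target_pvalue(targets_a, targets_b, n_genes=10))  # 0.5
```

A small p-value would support criterion (i) only; the nucleosome co-depletion criterion (ii) needs occupancy data and is not covered by this sketch.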
|
17
|
Lai FJ, Chang HT, Huang YM, Wu WS. A comprehensive performance evaluation on the prediction results of existing cooperative transcription factors identification algorithms. BMC SYSTEMS BIOLOGY 2014; 8 Suppl 4:S9. [PMID: 25521604 PMCID: PMC4290732 DOI: 10.1186/1752-0509-8-s4-s9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022]
Abstract
Background Eukaryotic transcriptional regulation is known to be highly connected through the networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. Recent advances in computational techniques have led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on different rationales, each had its own merits and claimed to outperform the others. However, such claims were prone to subjectivity because each algorithm was compared with only a few other algorithms, using only a small set of performance indices. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms. Based on the proposed performance indices, we conducted a comprehensive performance evaluation. Results We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted/proposed, the cooperativity of each PCTFP was measured and, for each performance index, each set of PCTFPs under evaluation was given a ranking score according to the mean cooperativity of the set. The ranking scores of a set of PCTFPs varied with the performance index, implying that an algorithm for predicting cooperative TF pairs may be strong by one measure but weak by another. We finally made a comprehensive ranking for these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. Conclusions In this study, we adopted/proposed eight performance indices to make a comprehensive performance evaluation on the prediction results of 14 existing cooperative TFs identification algorithms. Most importantly, these proposed indices can be easily applied to measure the performance of new algorithms developed in the future, thus expediting progress in this research field.
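The per-index ranking scores and the final comprehensive ranking described above can be sketched as a simple rank aggregation. The abstract does not give the exact scoring rule, so this sketch assumes that, for each index, the worst set gets score 1 and the best gets score N, and that the comprehensive score is the sum over indices. The index and algorithm names are hypothetical, and ties are not handled.

```python
def comprehensive_ranking(mean_scores):
    """mean_scores: {index_name: {algorithm: mean cooperativity}}.

    Returns (algorithm, summed ranking score) pairs, best first.
    """
    totals = {}
    for per_algorithm in mean_scores.values():
        # Sort from worst to best so the best algorithm receives the largest score.
        ordered = sorted(per_algorithm, key=per_algorithm.get)
        for rank_score, algorithm in enumerate(ordered, start=1):
            totals[algorithm] = totals.get(algorithm, 0) + rank_score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy input: three algorithms scored on two indices.
toy = {
    "index_1": {"A": 0.9, "B": 0.5, "C": 0.1},
    "index_2": {"A": 0.2, "B": 0.8, "C": 0.4},
}
print(comprehensive_ranking(toy))  # [('B', 5), ('A', 4), ('C', 3)]
```

In the toy input, algorithm A wins on one index yet loses overall, which mirrors the abstract's point that rankings vary with the performance index.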
|
18
|
Wang YXR, Huang H. Review on statistical methods for gene network reconstruction using expression data. J Theor Biol 2014; 362:53-61. [PMID: 24726980 DOI: 10.1016/j.jtbi.2014.03.040] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 01/26/2014] [Revised: 03/29/2014] [Accepted: 03/31/2014] [Indexed: 12/16/2022]
Abstract
Network modeling has proven to be a fundamental tool in analyzing the inner workings of a cell. It has revolutionized our understanding of biological processes and made significant contributions to the discovery of disease biomarkers. Much effort has been devoted to reconstructing various types of biochemical networks using functional genomic datasets generated by high-throughput technologies. This paper discusses statistical methods used to reconstruct gene regulatory networks using gene expression data. In particular, we highlight progress made and challenges yet to be met in the problems involved in estimating gene interactions, inferring causality and modeling temporal changes of regulation behaviors. As rapid advances in technologies have made available diverse, large-scale genomic data, we also survey methods of incorporating all these additional data to achieve better, more accurate inference of gene networks.
Affiliation(s)
- Y X Rachel Wang
- Department of Statistics, University of California, Berkeley, CA 94720, USA.
- Haiyan Huang
- Department of Statistics, University of California, Berkeley, CA 94720, USA.
|
19
|
Cieślik M, Bekiranov S. Combinatorial epigenetic patterns as quantitative predictors of chromatin biology. BMC Genomics 2014; 15:76. [PMID: 24472558 PMCID: PMC3922690 DOI: 10.1186/1471-2164-15-76] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 07/02/2013] [Accepted: 01/15/2014] [Indexed: 01/01/2023] Open
Abstract
Background Chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) is the most widely used method for characterizing the epigenetic states of chromatin on a genomic scale. With the recent availability of large genome-wide data sets, often comprising several epigenetic marks, novel approaches are required to explore functionally relevant interactions between histone modifications. Computational discovery of "chromatin states" defined by such combinatorial interactions enabled descriptive annotations of genomes, but more quantitative approaches are needed to progress towards predictive models. Results We propose non-negative matrix factorization (NMF) as a new unsupervised method to discover combinatorial patterns of epigenetic marks that frequently co-occur in subsets of genomic regions. We show that this small set of combinatorial "codes" can be effectively displayed and interpreted. NMF codes enable dimensionality reduction and have desirable statistical properties for regression and classification tasks. We demonstrate the utility of codes in the quantitative prediction of Pol2-binding and the discrimination between Pol2-bound promoters and enhancers. Finally, we show that specific codes can be linked to molecular pathways and targets of pluripotency genes during differentiation. Conclusions We have introduced and evaluated a new computational approach to represent combinatorial patterns of epigenetic marks as quantitative variables suitable for predictive modeling and supervised machine learning. To foster widespread adoption of this method we make it available as an open-source software package, epicode, at https://github.com/mcieslik-mctp/epicode.
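To make the NMF idea concrete, the sketch below factors a small regions-by-marks matrix into non-negative "codes" using the standard multiplicative update rules for the squared-error objective. It is a generic pure-Python illustration, not the epicode implementation; the function names, the toy matrix and the choice of two codes are all assumptions.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix v (regions x marks) as w @ h."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)] for i in range(n)]
    return w, h

def reconstruction_error(v, w, h):
    approx = matmul(w, h)
    return sum((x - y) ** 2 for rv, ra in zip(v, approx) for x, y in zip(rv, ra))

# Hypothetical signal matrix: 4 regions x 4 marks, with two co-occurring groups of marks.
signal = [[5.0, 5.0, 0.0, 0.0],
          [4.0, 4.0, 0.0, 0.0],
          [0.0, 0.0, 3.0, 3.0],
          [0.0, 0.0, 2.0, 2.0]]
w, h = nmf(signal, rank=2)
print(reconstruction_error(signal, w, h))
```

Each row of `h` plays the role of a code (a weighting over marks), and each row of `w` gives a region's loading on those codes, which is the kind of low-dimensional quantitative variable the abstract proposes for regression and classification.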
Affiliation(s)
- Marcin Cieślik
- Department of Biochemistry and Molecular Genetics, University of Virginia Health System, Charlottesville, Virginia, USA.
|
20
|
Brigandt I. Systems biology and the integration of mechanistic explanation and mathematical explanation. STUDIES IN HISTORY AND PHILOSOPHY OF BIOLOGICAL AND BIOMEDICAL SCIENCES 2013; 44:477-492. [PMID: 23863399 DOI: 10.1016/j.shpsc.2013.06.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 06/21/2012] [Revised: 06/12/2013] [Accepted: 06/14/2013] [Indexed: 06/02/2023]
Abstract
The paper discusses how systems biology is working toward complex accounts that integrate explanation in terms of mechanisms and explanation by mathematical models, which some philosophers have viewed as rival models of explanation. Systems biology is an integrative approach, and it strongly relies on mathematical modeling. Philosophical accounts of mechanisms capture integration in the sense of multilevel and multifield explanations, yet accounts of mechanistic explanation (as the analysis of a whole in terms of its structural parts and their qualitative interactions) have failed to address how a mathematical model could contribute to such explanations. I discuss how mathematical equations can be explanatorily relevant. Several cases from systems biology are discussed to illustrate the interplay between mechanistic research and mathematical modeling, and I point to questions about qualitative phenomena (rather than the explanation of quantitative details), where quantitative models are still indispensable to the explanation. Systems biology shows that a broader philosophical conception of mechanisms is needed, which takes into account functional-dynamical aspects, interaction in complex networks with feedback loops, system-wide functional properties such as distributed functionality and robustness, and a mechanism's ability to respond to perturbations (beyond its actual operation). I offer general conclusions for philosophical accounts of explanation.
Affiliation(s)
- Ingo Brigandt
- Department of Philosophy, University of Alberta, 2-40 Assiniboia Hall, Edmonton, AB T6G2E7, Canada.
|
21
|
Hu P, Shen Z, Tu H, Zhang L, Shi T. Integrating multiple resources to identify specific transcriptional cooperativity with a Bayesian approach. Bioinformatics 2013; 30:823-30. [PMID: 24192543 DOI: 10.1093/bioinformatics/btt596] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/07/2023]
Abstract
MOTIVATION A limited cohort of transcription factors is capable of structuring various gene-expression patterns. Transcriptional cooperativity (TC) is deemed to be the main mechanism of complexity and precision in regulatory programs. Although many data types generated by numerous experimental technologies are used in attempts to understand combinatorial transcriptional regulation, a complementary computational approach that can integrate diverse data resources and assimilate them into a biological model is still under development. RESULTS We developed a novel Bayesian approach for integrative analysis of proteomic, transcriptomic and genomic data to identify specific TC. The model evaluation demonstrated the distinguishing power of features derived from distinct data sources and their essentiality to model performance. Our model outperformed other classifiers and alternative methods. The application that contextualized TC within hepatocarcinogenesis revealed carcinoma-associated alterations. Derived TC networks were highly significant in capturing validated cooperativity as well as revealing novel ones. Our methodology is the first multiple data integration approach to predict the dynamic nature of TC. It is promising in identifying tissue- or disease-specific TC and can further facilitate the interpretation of underlying mechanisms for various physiological conditions. CONTACT tieliushi01@gmail.com SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pengzhan Hu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai 200241, China
|
22
|
Cieślik M, Hoang SA, Baranova N, Chodaparambil S, Kumar M, Allison DF, Xu X, Wamsley JJ, Gray L, Jones DR, Mayo MW, Bekiranov S. Epigenetic coordination of signaling pathways during the epithelial-mesenchymal transition. Epigenetics Chromatin 2013; 6:28. [PMID: 24004852 PMCID: PMC3847279 DOI: 10.1186/1756-8935-6-28] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 05/28/2013] [Accepted: 07/11/2013] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND The epithelial-mesenchymal transition (EMT) is a de-differentiation process required for wound healing and development. In tumors of epithelial origin aberrant induction of EMT contributes to cancer progression and metastasis. Studies have begun to implicate epigenetic reprogramming in EMT; however, the relationship between reprogramming and the coordination of cellular processes is largely unexplored. We have previously developed a system to study EMT in a canonical non-small cell lung cancer (NSCLC) model. In this system we have shown that the induction of EMT results in constitutive NF-κB activity. We hypothesized a role for chromatin remodeling in the sustained deregulation of cellular signaling pathways. RESULTS We mapped sixteen histone modifications and two variants for epithelial and mesenchymal states. Combinatorial patterns of epigenetic changes were quantified at gene and enhancer loci. We found a distinct chromatin signature among genes in well-established EMT pathways. Strikingly, these genes are only a small minority of those that are differentially expressed. At putative enhancers of genes with the 'EMT-signature' we observed highly coordinated epigenetic activation or repression. Furthermore, enhancers that are activated are bound by a set of transcription factors that is distinct from those that bind repressed enhancers. Upregulated genes with the 'EMT-signature' are upstream regulators of NF-κB, but are also bound by NF-κB at their promoters and enhancers. These results suggest a chromatin-mediated positive feedback as a likely mechanism for sustained NF-κB activation. CONCLUSIONS There is highly specific epigenetic regulation at genes and enhancers across several pathways critical to EMT. The sites of these changes in chromatin state implicate several inducible transcription factors with critical roles in EMT (NF-κB, AP-1 and MYC) as targets of this reprogramming. 
Furthermore, we find evidence that suggests that these transcription factors are in chromatin-mediated transcriptional feedback loops that regulate critical EMT genes. In sum, we establish an important link between chromatin remodeling and shifts in cellular reprogramming.
Affiliation(s)
- Marcin Cieślik
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Stephen A Hoang
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Natalya Baranova
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Sanjay Chodaparambil
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Manish Kumar
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- David F Allison
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Xiaojiang Xu
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- J Jacob Wamsley
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Lisa Gray
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- David R Jones
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA; Department of Surgery, University of Virginia, Charlottesville, VA 22908, USA
- Marty W Mayo
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
- Stefan Bekiranov
- Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Ave, P.O. Box 800733, Charlottesville, VA 22908, USA
|
23
|
Le NT, Ho TB, Ho BH. Computational reconstruction of transcriptional relationships from ChIP-chip data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:300-307. [PMID: 22848139 DOI: 10.1109/tcbb.2012.102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Eukaryotic gene transcription is a complex process, which requires the orchestrated recruitment of a large number of proteins, such as sequence-specific DNA binding factors, chromatin remodelers and modifiers, and general transcription machinery, to regulatory regions. Previous works have shown that these regulatory proteins favor specific organizational themes along promoters. Details about how they cooperatively regulate the transcriptional process, however, remain unclear. We developed a method to reconstruct a Bayesian network (BN) model representing functional relationships among various transcriptional components. The positive/negative influence between these components was measured from protein binding and nucleosome occupancy data and embedded into the model. Application to S. cerevisiae ChIP-chip data showed that the proposed method can recover confirmed relationships, such as Isw1-Pol II, TFIIH-Pol II, TFIIB-TBP, Pol II-H3K36Me3, H3K4Me3-H3K14Ac, etc. Moreover, it can distinguish colocating components from functionally related ones. Novel relationships, e.g., ones between Mediator and chromatin remodeling complexes (CRCs), and the combinatorial regulation of Pol II recruitment and activity by CRCs and general transcription factors (GTFs), were also suggested. CONCLUSION Protein binding events during transcription positively influence each other. Among contributing components, GTFs and CRCs play pivotal roles in transcriptional regulation. These findings provide insights into the regulatory mechanism.
Affiliation(s)
- Ngoc Tu Le
- School of Knowledge Science, Japan Advanced Institute of Science and Technology, Asahidai 1-1, Nomi, Ishikawa 923-1292, Japan.
|
24
|
Wang Y, Chen X, Liu ZP, Huang Q, Wang Y, Xu D, Zhang XS, Chen R, Chen L. De novo prediction of RNA–protein interactions from sequence information. Mol Biosyst 2013; 9:133-42. [DOI: 10.1039/c2mb25292a] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Indexed: 12/12/2022]
|
25
|
Vandenbon A, Kumagai Y, Akira S, Standley DM. A novel unbiased measure for motif co-occurrence predicts combinatorial regulation of transcription. BMC Genomics 2012; 13 Suppl 7:S11. [PMID: 23282148 PMCID: PMC3521209 DOI: 10.1186/1471-2164-13-s7-s11] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/15/2023] Open
Abstract
Background Multiple transcription factors (TFs) are involved in the generation of gene expression patterns, such as tissue-specific gene expression and pleiotropic immune responses. However, how combinations of TFs orchestrate diverse gene expression patterns is poorly understood. Here we propose a new measure for regulatory motif co-occurrence and a new methodology to systematically identify TF pairs significantly co-occurring in a set of promoter sequences. Results Initial analyses suggest that non-CpG promoters have a higher potential for combinatorial regulation than CpG island-associated promoters, and that co-occurrences are strongly influenced by motif similarity. We applied our method to large-scale gene expression data from various tissues, and showed how our measure for motif co-occurrence is not biased by motif over-representation. Our method identified, amongst others, the binding motifs of HNF1 and FOXP1 to be significantly co-occurring in promoters of liver/kidney specific genes. Binding sites tend to be positioned proximally to each other, suggesting interactions exist between this pair of transcription factors. Moreover, the binding sites of several TFs were found to co-occur with NF-κB and IRF sites in sets of genes with similar expression patterns in dendritic cells after Toll-like receptor stimulation. Of these, we experimentally verified that CCAAT enhancer binding protein alpha positively regulates its target promoters synergistically with NF-κB. Conclusions Both computational and experimental results indicate that the proposed method can clarify TF interactions that could not be observed by currently available prediction methods.
Affiliation(s)
- Alexis Vandenbon
- Laboratory of Systems Immunology, Immunology Frontier Research Center, Osaka University, 3-1 Yamada-oka, Suita, Osaka 565-0871, Japan.
|
26
|
Dümcke S, Seizl M, Etzold S, Pirkl N, Martin DE, Cramer P, Tresch A. One Hand Clapping: detection of condition-specific transcription factor interactions from genome-wide gene activity data. Nucleic Acids Res 2012; 40:8883-92. [PMID: 22844089 PMCID: PMC3467085 DOI: 10.1093/nar/gks695] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022] Open
Abstract
We present One Hand Clapping (OHC), a method for the detection of condition-specific interactions between transcription factors (TFs) from genome-wide gene activity measurements. OHC is based on a mapping between transcription factors and their target genes. Given a single case–control experiment, it uses a linear regression model to assess whether the common targets of two arbitrary TFs behave differently than expected from the genes targeted by only one of the TFs. When applied to osmotic stress data in S. cerevisiae, OHC produces consistent results across three types of expression measurements: gene expression microarray data, RNA Polymerase II ChIP-chip binding data and messenger RNA synthesis rates. Among the eight novel, condition-specific TF pairs, we validate the interaction between Gcn4p and Arr1p experimentally. We apply OHC to a large gene activity dataset in S. cerevisiae and provide a compendium of condition-specific TF interactions.
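The regression idea in this abstract can be sketched with a two-factor linear model that includes an interaction term. The sketch assumes the simplest saturated form, y = b0 + b1·A + b2·B + b3·A·B with binary target indicators A and B, for which the least-squares estimate of b3 reduces to a contrast of the four group means. The published OHC model may be parameterized differently, and the function name, gene names and values below are hypothetical.

```python
from statistics import mean

def interaction_effect(log_fold_change, targets_a, targets_b):
    """Least-squares interaction coefficient b3 in y = b0 + b1*A + b2*B + b3*A*B."""
    groups = {(False, False): [], (True, False): [], (False, True): [], (True, True): []}
    for gene, value in log_fold_change.items():
        groups[(gene in targets_a, gene in targets_b)].append(value)
    if any(not values for values in groups.values()):
        raise ValueError("each of the four target groups needs at least one gene")
    m = {key: mean(values) for key, values in groups.items()}
    # How far the common targets deviate from the additive expectation of the single-TF targets.
    return m[(True, True)] - m[(True, False)] - m[(False, True)] + m[(False, False)]

# Hypothetical case-control log fold changes for five genes.
lfc = {"g1": 0.0, "g2": 0.2, "g3": 1.0, "g4": 0.5, "g5": 3.0}
print(interaction_effect(lfc, targets_a={"g3", "g5"}, targets_b={"g4", "g5"}))
```

A value near zero means the common targets behave as the two single-TF groups would predict additively; assessing significance would additionally require the coefficient's standard error, which this sketch omits.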
Affiliation(s)
- Sebastian Dümcke
- Gene Center Munich and Department of Biochemistry, Ludwig-Maximilians-Universität München, Feodor-Lynen-Straße 25, 81377 Munich, Germany
|
27
|
Rodin AS, Gogoshin G, Boerwinkle E. Systems biology data analysis methodology in pharmacogenomics. Pharmacogenomics 2012; 12:1349-60. [PMID: 21919609 DOI: 10.2217/pgs.11.76] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/21/2023] Open
Abstract
Pharmacogenetics aims to elucidate the genetic factors underlying the individual's response to pharmacotherapy. Coupled with the recent (and ongoing) progress in high-throughput genotyping, sequencing and other genomic technologies, pharmacogenetics is rapidly transforming into pharmacogenomics, while pursuing the primary goals of identifying and studying the genetic contribution to drug therapy response and adverse effects, and existing drug characterization and new drug discovery. Accomplishment of both of these goals hinges on gaining a better understanding of the underlying biological systems; however, reverse-engineering biological system models from the massive datasets generated by the large-scale genetic epidemiology studies presents a formidable data analysis challenge. In this article, we review the recent progress made in developing such data analysis methodology within the paradigm of systems biology research that broadly aims to gain a 'holistic', or 'mechanistic' understanding of biological systems by attempting to capture the entirety of interactions between the components (genetic and otherwise) of the system.
Affiliation(s)
- Andrei S Rodin
- Human Genetics Center, School of Public Health, University of Texas Health Science Center, Houston, TX 77030, USA.
|
28
|
Chen MJM, Chou LC, Hsieh TT, Lee DD, Liu KW, Yu CY, Oyang YJ, Tsai HK, Chen CY. De novo motif discovery facilitates identification of interactions between transcription factors in Saccharomyces cerevisiae. Bioinformatics 2012; 28:701-8. [PMID: 22238267 DOI: 10.1093/bioinformatics/bts002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/29/2022]
Abstract
MOTIVATION Gene regulation involves complicated mechanisms such as cooperativity between a set of transcription factors (TFs). Previous studies have used target genes shared by two TFs as a clue to infer TF-TF interactions. However, this task remains challenging because the target genes with low binding affinity are frequently omitted by experimental data, especially when a single strict threshold is employed. This article aims at improving the accuracy of inferring TF-TF interactions by incorporating motif discovery as a fundamental step when detecting overlapping targets of TFs based on ChIP-chip data. RESULTS The proposed method, simTFBS, outperforms three naïve methods that adopt fixed thresholds when inferring TF-TF interactions based on ChIP-chip data. In addition, simTFBS is compared with two advanced methods and demonstrates its advantages in predicting TF-TF interactions. By comparing simTFBS with predictions based on the set of available annotated yeast TF binding motifs, we demonstrate that the good performance of simTFBS is indeed coming from the additional motifs found by the proposed procedures. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mei-Ju May Chen
- Department of Computer Science and Information Engineering, National Taiwan University and Institute of Information Science, Academia Sinica, Taipei, Taiwan
|
29
|
Building Synthetic Systems to Learn Nature’s Design Principles. EVOLUTIONARY SYSTEMS BIOLOGY 2012; 751:411-29. [DOI: 10.1007/978-1-4614-3567-9_19] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 01/22/2023]
|
30
|
Moyle-Heyrman G, Tims HS, Widom J. Structural constraints in collaborative competition of transcription factors against the nucleosome. J Mol Biol 2011; 412:634-46. [PMID: 21821044 DOI: 10.1016/j.jmb.2011.07.032] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Received: 03/25/2011] [Revised: 07/03/2011] [Accepted: 07/16/2011] [Indexed: 01/13/2023]
Abstract
Cooperativity in transcription factor (TF) binding is essential in eukaryotic gene regulation and arises through diverse mechanisms. Here, we focus on one mechanism, collaborative competition, which is of interest because it arises both automatically (with no requirement for TF coevolution) and spontaneously (with no requirement for ATP-dependent nucleosome remodeling factors). Previous experimental studies of collaborative competition analyzed cases in which target sites for pairs of cooperating TFs were contained within the same side of the nucleosome. Here, we utilize new assays to measure cooperativity in protein binding to pairs of nucleosomal DNA target sites. We focus on the cases that are of greatest in vivo relevance, in which one binding site is located close to the end of a nucleosome and the other binding site is located at diverse positions throughout the nucleosome. Our results reveal energetically significant positive (favorable) cooperativity for pairs of sites on the same side of the nucleosome but, for the cases examined, energetically insignificant cooperativity between sites on opposite sides of the nucleosome. These findings imply a special significance for TF binding sites that are spaced within one-half nucleosome length (74 bp) or less along the genome and may prove useful for prediction of cooperatively acting TFs genome wide.
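The closing suggestion, that sites spaced within about half a nucleosome length (74 bp) could help predict cooperating TFs genome wide, can be sketched as a simple proximity scan. The input format, the use of single site coordinates, and the TF names and positions are assumptions made for illustration; the abstract does not describe such a tool.

```python
def candidate_pairs(sites, max_spacing=74):
    """sites: list of (tf_name, chromosome, position) tuples.

    Returns (tf_a, tf_b, spacing) for sites of different TFs on the same
    chromosome that lie at most max_spacing bp apart.
    """
    ordered = sorted(sites, key=lambda s: (s[1], s[2]))
    pairs = []
    for i, (tf_a, chrom_a, pos_a) in enumerate(ordered):
        for tf_b, chrom_b, pos_b in ordered[i + 1:]:
            # Sorted input: once a site is too far away, all later sites are too.
            if chrom_b != chrom_a or pos_b - pos_a > max_spacing:
                break
            if tf_a != tf_b:
                pairs.append((tf_a, tf_b, pos_b - pos_a))
    return pairs

# Hypothetical binding-site coordinates.
sites = [("Gal4", "chrII", 100), ("Mig1", "chrII", 150),
         ("Gcn4", "chrII", 300), ("Rap1", "chrIV", 120)]
print(candidate_pairs(sites))  # [('Gal4', 'Mig1', 50)]
```

Proximity alone only nominates candidates. The abstract's result concerns sites on the same side of a nucleosome, so nucleosome positioning would still need to be checked for each pair.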
|
31
|
Wang J, Huang Q, Liu ZP, Wang Y, Wu LY, Chen L, Zhang XS. NOA: a novel Network Ontology Analysis method. Nucleic Acids Res 2011; 39:e87. [PMID: 21543451 PMCID: PMC3141273 DOI: 10.1093/nar/gkr251] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.4] [Indexed: 12/21/2022] Open
Abstract
Gene ontology analysis has become a popular and important tool in bioinformatics studies, and current ontology analyses are mainly conducted on individual genes or gene lists. However, recent molecular network analysis reveals that the same list of genes with different interactions may perform different functions. Therefore, it is necessary to consider molecular interactions to correctly and specifically annotate biological networks. Here, we propose a novel Network Ontology Analysis (NOA) method to perform gene ontology enrichment analysis on biological networks. Specifically, NOA first defines link ontology that assigns functions to interactions based on the known annotations of joint genes via optimizing two novel indexes ‘Coverage’ and ‘Diversity’. Then, NOA generates two alternative reference sets to statistically rank the enriched functional terms for a given biological network. We compare NOA with traditional enrichment analysis methods in several biological networks, and find that: (i) NOA can capture the change of functions not only in dynamic transcription regulatory networks but also in rewiring protein interaction networks, while the traditional methods cannot, and (ii) NOA can find more relevant and specific functions than traditional methods in different types of static networks. Furthermore, a freely accessible web server for NOA has been developed at http://www.aporc.org/noa/.
Affiliation(s)
- Jiguang Wang
- Key Laboratory of Management, Decision and Information Systems, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
|
32
|
Paik HJ, Ryu TW, Heo HS, Seo SW, Lee DH, Hur CG. Predicting tissue-specific expressions based on sequence characteristics. BMB Rep 2011; 44:250-5. [DOI: 10.5483/bmbrep.2011.44.4.250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
|
33
|
Wang Y, Franzosa EA, Zhang XS, Xia Y. Protein evolution in yeast transcription factor subnetworks. Nucleic Acids Res 2010; 38:5959-69. [PMID: 20466810 PMCID: PMC2952844 DOI: 10.1093/nar/gkq353] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 12/07/2009] [Revised: 04/21/2010] [Accepted: 04/22/2010] [Indexed: 01/08/2023] Open
Abstract
When averaged over the full yeast protein-protein interaction and transcriptional regulatory networks, protein hubs with many interaction partners or regulators tend to evolve significantly more slowly due to increased negative selection. However, genome-wide analysis of protein evolution in the subnetworks of associations involving yeast transcription factors (TFs) reveals that TF hubs do not tend to evolve significantly more slowly than TF non-hubs. This result holds for all four major types of TF hubs: interaction hubs, regulatory in-degree and out-degree hubs, as well as co-regulatory hubs that jointly regulate target genes with many TFs. Furthermore, TF regulatory in-degree hubs tend to evolve significantly more quickly than TF non-hubs. Most importantly, the correlations between evolutionary rate (K(A)/K(S)) and degrees for TFs are significantly more positive than those for generic proteins within the same global protein-protein interaction and transcriptional regulatory networks. Compared to generic protein hubs, TF hubs operate at a higher level in the hierarchical structure of cellular networks, and hence experience additional evolutionary forces (relaxed negative selection or positive selection through network rewiring). The striking difference between the evolution of TF hubs and generic protein hubs demonstrates that components within the same global network can be governed by distinct organizational and evolutionary principles.
Affiliation(s)
- Yong Wang
- Bioinformatics Program, Boston University, Boston, MA 02215, USA, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China, Department of Chemistry and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Eric A. Franzosa
- Bioinformatics Program, Boston University, Boston, MA 02215, USA, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China, Department of Chemistry and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Xiang-Sun Zhang
- Bioinformatics Program, Boston University, Boston, MA 02215, USA, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China, Department of Chemistry and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
- Yu Xia
- Bioinformatics Program, Boston University, Boston, MA 02215, USA, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China, Department of Chemistry and Department of Biomedical Engineering, Boston University, Boston, MA 02215, USA
|
34
|
Darnell RB. HITS-CLIP: panoramic views of protein-RNA regulation in living cells. Wiley Interdiscip Rev RNA 2010; 1:266-86. [PMID: 21935890 PMCID: PMC3222227 DOI: 10.1002/wrna.31] [Citation(s) in RCA: 298] [Impact Index Per Article: 19.9] [Indexed: 12/13/2022]
Abstract
The study of gene regulation in cells has recently begun to shift from a period dominated by the study of transcription factor-DNA interactions to a new focus on RNA regulation. This was sparked by the still-emerging recognition of the central role for RNA in cellular complexity emanating from the RNA World hypothesis, and has been facilitated by technologic advances, in particular high throughput RNA sequencing and crosslinking methods (RNA-Seq, CLIP, and HITS-CLIP). This study will place these advances in context, and, focusing on CLIP, will explain the method, what it can be used for, and how to approach using it. Examples of the successes, limitations, and future of the technique will be discussed.
Affiliation(s)
- Robert B Darnell
- Laboratory of Neuro-Oncology, The Rockefeller University, Howard Hughes Medical Institute, New York, NY 10065, USA.
|
35
|
Zhang C, Frias MA, Mele A, Ruggiu M, Eom T, Marney CB, Wang H, Licatalosi DD, Fak JJ, Darnell RB. Integrative modeling defines the Nova splicing-regulatory network and its combinatorial controls. Science 2010; 329:439-43. [PMID: 20558669 DOI: 10.1126/science.1191150] [Citation(s) in RCA: 231] [Impact Index Per Article: 15.4] [Indexed: 12/12/2022]
Abstract
The control of RNA alternative splicing is critical for generating biological diversity. Despite emerging genome-wide technologies to study RNA complexity, reliable and comprehensive RNA-regulatory networks have not been defined. Here, we used Bayesian networks to probabilistically model diverse data sets and predict the target networks of specific regulators. We applied this strategy to identify approximately 700 alternative splicing events directly regulated by the neuron-specific factor Nova in the mouse brain, integrating RNA-binding data, splicing microarray data, Nova-binding motifs, and evolutionary signatures. The resulting integrative network revealed combinatorial regulation by Nova and the neuronal splicing factor Fox, interplay between phosphorylation and splicing, and potential links to neurologic disease. Thus, we have developed a general approach to understanding mammalian RNA regulation at the systems level.
Affiliation(s)
- Chaolin Zhang
- Laboratory of Molecular Neuro-Oncology, Howard Hughes Medical Institute, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA.
|