1
Unraveling unique and common cell type-specific mechanisms in glioblastoma multiforme. Comput Struct Biotechnol J 2022; 20:90-106. [PMID: 34976314 PMCID: PMC8688884 DOI: 10.1016/j.csbj.2021.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2021] [Revised: 11/22/2021] [Accepted: 12/06/2021] [Indexed: 11/20/2022]
Abstract
Glioblastoma multiforme (GBM) remains an enigmatic challenge in neuro-oncology. Its unrestrained capacity to thrive in a confined microenvironment, metastasize intracranially, and resist systemic treatments renders this tumour incurable. The glial cell type specificity of GBM remains largely unexplored. In our study, we aimed to address this problem by studying GBM at the cell type level in the brain. The tumour is composed of genetically altered glial cells, which include astrocytes, microglia, oligodendrocyte precursor cells, newly formed oligodendrocytes and myelinating oligodendrocytes. We extracted cell type-specific solid tumour and recurrent solid tumour glioma genes, and studied their functional networks and contribution to gliomagenesis. We identified the principal transcription factors that regulate vital tumorigenic processes. We also assessed the protein-protein interaction networks at the domain level to obtain a finer-grained view of the structural and functional operations that take place in these cells. This yielded the prominent protein regulators that act in signaling pathways. Overall, our study unveiled regulatory mechanisms in glioma cell types that can be targeted for more efficient glioma therapy.
Key Words
- CAMs, Cell adhesion molecules
- CNS, Central nervous system
- DEG, Differentially expressed genes
- EMT, Epithelial-mesenchymal transition
- GBM, Glioblastoma multiforme
- GSC, Glioblastoma Stem Cell
- Glial cell types
- Glioblastoma multiforme
- INstruct, a database of structurally resolved protein interactome
- MO, Myelinating oligodendrocyte
- NCBI, National Centre for Biotechnology Information
- NFO, Newly formed oligodendrocyte
- NPC, Neural progenitor cell
- OPC, Oligodendrocyte precursor cell
- PDI, Protein domain interactions
- PDIN, Protein domain interaction network
- PPI, Protein-protein interactions
- Primary solid tumour
- Protein domains
- Protein interaction networks
- RSEM, RNA-seq by Expectation-Maximization
- Recurrent solid tumour
- Transcription factors
- SIGNOR, Signaling Network Open Resource
- TCGA, The Cancer Genome Atlas
- TF, Transcription factor
- TP, Primary solid tumour
- TR, Recurrent solid tumour
- WHO, World Health Organization
- iDEP, Integrated Differential Expression and Pathway analysis
2
Champion M, Chiquet J, Neuvial P, Elati M, Radvanyi F, Birmelé E. Identification of deregulation mechanisms specific to cancer subtypes. J Bioinform Comput Biol 2021; 19:2140003. [PMID: 33653235 DOI: 10.1142/s0219720021400035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
In many cancers, mechanisms of gene regulation can be severely altered. Identification of deregulated genes, which do not follow the regulation processes that exist between transcription factors and their target genes, is of importance to better understand the development of the disease. We propose a methodology to detect deregulation mechanisms with a particular focus on cancer subtypes. This strategy is based on the comparison between tumoral and healthy cells. First, we use gene expression data from healthy cells to infer a reference gene regulatory network. Then, we compare it with gene expression levels in tumor samples to detect deregulated target genes. We finally measure the ability of each transcription factor to explain these deregulations. We apply our method on a public bladder cancer data set derived from The Cancer Genome Atlas project and confirm that it captures hallmarks of cancer subtypes. We also show that it enables the discovery of new potential biomarkers.
Affiliation(s)
- Julien Chiquet
- Université Paris Saclay, AgroParisTech, INRAE, UMR MIA-Paris, Paris, France
- Pierre Neuvial
- Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, France
- Mohamed Elati
- CANTHER, University of Lille, CNRS UMR 1277, Inserm U9020, 59045 Lille cedex, France
- François Radvanyi
- Institut Curie, PSL Research University, CNRS, UMR144, Paris, France
- Etienne Birmelé
- Université de Paris, CNRS, MAP5 UMR8145, Paris, France; Institut de Recherche Mathématique Avancée, UMR 7501 Université de Strasbourg, CNRS, Strasbourg, France
3
Closed-loop cycles of experiment design, execution, and learning accelerate systems biology model development in yeast. Proc Natl Acad Sci U S A 2019; 116:18142-18147. [PMID: 31420515 PMCID: PMC6731661 DOI: 10.1073/pnas.1900548116] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/29/2022]
Abstract
Systems biology involves the development of large computational models of biological systems. The radical improvement of systems biology models will necessarily involve the automation of model improvement cycles. We present here a general approach to automating systems biology model improvement. Humans are eukaryotic organisms, and the yeast Saccharomyces cerevisiae is widely used in biology as a “model” for eukaryotic cells. The yeast diauxic shift is the most studied cellular transformation. We combined multiple software tools with integrated laboratory robotics to execute three semiautomated cycles of diauxic shift model improvement. All the experiments were formalized and communicated to a cloud laboratory automation system (Eve) for execution. The resulting improved model is relevant to understanding cancer, the immune system, and aging.
One of the most challenging tasks in modern science is the development of systems biology models: Existing models are often very complex but generally have low predictive performance. The construction of high-fidelity models will require hundreds/thousands of cycles of model improvement, yet few current systems biology research studies complete even a single cycle. We combined multiple software tools with integrated laboratory robotics to execute three cycles of model improvement of the prototypical eukaryotic cellular transformation, the yeast (Saccharomyces cerevisiae) diauxic shift. In the first cycle, a model outperforming the best previous diauxic shift model was developed using bioinformatic and systems biology tools. In the second cycle, the model was further improved using automatically planned experiments. In the third cycle, hypothesis-led experiments improved the model to a greater extent than achieved using high-throughput experiments. All of the experiments were formalized and communicated to a cloud laboratory automation system (Eve) for automatic execution, and the results stored on the semantic web for reuse. The final model adds a substantial amount of knowledge about the yeast diauxic shift: 92 genes (+45%), and 1,048 interactions (+147%). This knowledge is also relevant to understanding cancer, the immune system, and aging. We conclude that systems biology software tools can be combined and integrated with laboratory robots in closed-loop cycles.
4
Dhifli W, Puig J, Dispot A, Elati M. Latent network-based representations for large-scale gene expression data analysis. BMC Bioinformatics 2019; 19:466. [PMID: 30717663 PMCID: PMC7394327 DOI: 10.1186/s12859-018-2481-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/26/2018] [Accepted: 11/09/2018] [Indexed: 12/12/2022]
Abstract
Background: With the recent advancements in high-throughput experimental procedures, biologists are gathering huge quantities of data. A main priority in bioinformatics and computational biology is to provide system level analytical tools capable of meeting an ever-growing production of high-throughput biological data while taking into account its biological context. In gene expression data analysis, genes have widely been considered as independent components. However, a systemic view shows that they act synergistically in living cells, forming functional complexes and more generally a biological system. Results: In this paper, we propose LatNet, a signal transformation framework that, starting from initial large-scale gene expression data, generates new representations based on latent network-based relationships between the genes. LatNet aims to leverage system level relations between the genes as an underlying hidden structure to derive the new transformed latent signals. We present a concrete implementation of our framework, based on a gene regulatory network structure and two signal transformation approaches, to quantify latent network-based activity of regulators, as well as gene perturbation signals. The new gene/regulator signals are at the level of each sample of the input data and, thus, could directly be used instead of the initial expression signals for major bioinformatics analysis, including diagnosis and personalized medicine. Conclusion: Multiple patterns could be hidden or weakly observed in expression data. LatNet helps in uncovering latent signals that could emphasize hidden patterns based on the relations between the genes, thus enhancing the performance of gene expression-based analysis algorithms. We use LatNet for the analysis of real-world gene expression data of bladder cancer and we show the efficiency of our transformation framework as compared to using the initial expression data.
Affiliation(s)
- Wajdi Dhifli
- University of Lille, 42, rue Paul Duez, Lille, 59000, France
- Julia Puig
- University of Lille, 42, rue Paul Duez, Lille, 59000, France
- Aurélien Dispot
- University of Lille, 42, rue Paul Duez, Lille, 59000, France
- Mohamed Elati
- University of Lille, 42, rue Paul Duez, Lille, 59000, France; UMR 8030, Génomique Métabolique / Laboratoire iSSB, CEA-CNRS-UEVE, Genopole campus 1, 5 rue Henri Desbruères, Évry, 91030 Cedex, France
5
Wu WS, Chen PH, Chen TT, Tseng YY. YGMD: a repository for yeast cooperative transcription factor sets and their target gene modules. Database (Oxford) 2018; 2017:4596568. [PMID: 29220473 PMCID: PMC5691354 DOI: 10.1093/database/bax085] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/16/2017] [Accepted: 10/19/2017] [Indexed: 01/02/2023]
Abstract
By organizing the genome into gene modules (GMs), a living cell coordinates the activities of a set of genes to properly respond to environmental changes. The transcriptional regulation of the expression of a GM is usually carried out by a cooperative transcription factor set (CoopTFS) consisting of several cooperative transcription factors (TFs). Therefore, a database which provides CoopTFSs and their target GMs is useful for studying the cellular responses to internal or external stimuli. To address this need, we constructed YGMD (Yeast Gene Module Database) to provide 34120 CoopTFSs, each of which consists of two to five cooperative TFs, and their target GMs. The cooperativity between TFs in a CoopTFS is suggested by physical/genetic interaction evidence and/or predicted by existing algorithms. The target GM regulated by a CoopTFS is defined as the common target genes of all the TFs in that CoopTFS. The regulatory association between any TF in a CoopTFS and any gene in the target GM is supported by experimental evidence in the literature. In YGMD, users can (i) search the GM regulated by a specific CoopTFS of interest or (ii) search all possible CoopTFSs whose target GMs contain a specific gene of interest. The biological relevance of YGMD is shown by a case study which demonstrates that YGMD can provide a GM enriched with genes known to be regulated by the query CoopTFS (Cbf1-Met4-Met32). We believe that YGMD provides a valuable resource for yeast biologists to study the transcriptional regulation of GMs. Database URL: http://cosbi4.ee.ncku.edu.tw/YGMD/, http://cosbi5.ee.ncku.edu.tw/YGMD/ or http://cosbi.ee.ncku.edu.tw/YGMD/
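The definition above, in which a target GM is the set of genes targeted by every TF in a CoopTFS, can be sketched as a set intersection. This is only a minimal illustration under stated assumptions, not YGMD's implementation: the TF-to-target map uses placeholder gene names (`g1` to `g6`) rather than real regulatory data, and the function names `target_gene_module` and `coop_sets_containing` are hypothetical.

```python
# Hypothetical toy data: TF -> placeholder target genes (not real YGMD content).
tf_targets = {
    "Cbf1": {"g1", "g2", "g3", "g4"},
    "Met4": {"g1", "g2", "g3", "g5"},
    "Met32": {"g1", "g2", "g6"},
}

def target_gene_module(coop_tfs, tf_targets):
    """Return the genes targeted by every TF in the cooperative set."""
    return set.intersection(*(tf_targets[tf] for tf in coop_tfs))

def coop_sets_containing(gene, candidate_sets, tf_targets):
    """Return the candidate TF sets whose target module contains `gene`."""
    return [s for s in candidate_sets
            if gene in target_gene_module(s, tf_targets)]

# Query (i): the module regulated by one cooperative TF set.
module = target_gene_module(("Cbf1", "Met4", "Met32"), tf_targets)

# Query (ii): which candidate TF sets have a module containing a given gene.
candidates = [("Cbf1", "Met4"), ("Cbf1", "Met32"), ("Cbf1", "Met4", "Met32")]
hits = coop_sets_containing("g3", candidates, tf_targets)

print(sorted(module))
print(hits)
```

The two calls mirror the two search modes described in the abstract; adding a TF to a set can only shrink or preserve its module, which is why larger sets tend to have smaller modules.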
Affiliation(s)
- Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Pin-Han Chen
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Tsung-Te Chen
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Yan-Yuan Tseng
- Center for Molecular Medicine and Genetics, Wayne State University, School of Medicine, Detroit, MI 48201, USA
6
Pilarczyk K, Wlaźlak E, Przyczyna D, Blachecki A, Podborska A, Anathasiou V, Konkoli Z, Szaciłowski K. Molecules, semiconductors, light and information: Towards future sensing and computing paradigms. Coord Chem Rev 2018. [DOI: 10.1016/j.ccr.2018.03.018] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Indexed: 12/20/2022]
7
Tabe-Bordbar S, Emad A, Zhao SD, Sinha S. A closer look at cross-validation for assessing the accuracy of gene regulatory networks and models. Sci Rep 2018; 8:6620. [PMID: 29700343 PMCID: PMC5920056 DOI: 10.1038/s41598-018-24937-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 09/22/2017] [Accepted: 04/09/2018] [Indexed: 11/26/2022]
Abstract
Cross-validation (CV) is a technique to assess the generalizability of a model to unseen data. This technique relies on assumptions that may not be satisfied when studying genomics datasets. For example, random CV (RCV) assumes that a randomly selected set of samples, the test set, well represents unseen data. This assumption doesn’t hold true where samples are obtained from different experimental conditions, and the goal is to learn regulatory relationships among the genes that generalize beyond the observed conditions. In this study, we investigated how the CV procedure affects the assessment of supervised learning methods used to learn gene regulatory networks (or in other applications). We compared the performance of a regression-based method for gene expression prediction estimated using RCV with that estimated using a clustering-based CV (CCV) procedure. Our analysis illustrates that RCV can produce over-optimistic estimates of the model’s generalizability compared to CCV. Next, we defined the ‘distinctness’ of test set from training set and showed that this measure is predictive of performance of the regression method. Finally, we introduced a simulated annealing method to construct partitions with gradually increasing distinctness and showed that performance of different gene expression prediction methods can be better evaluated using this method.
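The contrast between random CV and clustering-based CV described above can be sketched as two ways of building folds. This is a simplified illustration, not the authors' procedure: it assumes cluster labels are already available (here, made-up condition names stand in for the output of a clustering step), and the function names are hypothetical.

```python
import random
from collections import defaultdict

def random_folds(n_samples, k, seed=0):
    """Random CV: shuffle sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::k]) for i in range(k)]

def cluster_folds(cluster_labels, k):
    """Clustered CV: assign whole clusters to folds, so a test fold
    never shares a cluster with its training set."""
    members = defaultdict(list)
    for i, label in enumerate(cluster_labels):
        members[label].append(i)
    folds = [[] for _ in range(k)]
    # Place the largest clusters first, each into the currently smallest fold.
    for label in sorted(members, key=lambda c: -len(members[c])):
        min(folds, key=len).extend(members[label])
    return [sorted(f) for f in folds]

def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold at a time."""
    for i, test in enumerate(folds):
        train = [j for m, f in enumerate(folds) if m != i for j in f]
        yield sorted(train), test

# Assumed cluster labels for 10 samples (illustrative only).
labels = ["heat", "heat", "heat", "cold", "cold",
          "salt", "salt", "salt", "starve", "starve"]
rcv = random_folds(len(labels), k=2)
ccv = cluster_folds(labels, k=2)

for train, test in train_test_splits(ccv):
    print(sorted({labels[i] for i in train}), sorted({labels[i] for i in test}))
```

With random folds, samples from the same condition usually land on both sides of a split, which is the situation the abstract associates with over-optimistic estimates; with clustered folds, each test set contains only conditions absent from training.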
Affiliation(s)
- Shayan Tabe-Bordbar
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
- Amin Emad
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
- Sihai Dave Zhao
- Department of Statistics, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
- Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America
8
Banos DT, Trébulle P, Elati M. Integrating transcriptional activity in genome-scale models of metabolism. BMC Syst Biol 2017; 11:134. [PMID: 29322933 PMCID: PMC5763306 DOI: 10.1186/s12918-017-0507-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 12/25/2022]
Abstract
Background: Genome-scale metabolic models provide an opportunity for rational approaches to studies of the different reactions taking place inside the cell. The integration of these models with gene regulatory networks is a hot topic in systems biology. The methods developed to date focus mostly on resolving the metabolic elements and use fairly straightforward approaches to assess the impact of genome expression on the metabolic phenotype. Results: We present here a method for integrating the reverse engineering of gene regulatory networks into these metabolic models. We applied our method to a high-dimensional gene expression data set to infer a background gene regulatory network. We then compared the resulting phenotype simulations with those obtained by other relevant methods. Conclusions: Our method outperformed the other approaches tested and was more robust to noise. We also illustrate the utility of this method for studies of a complex biological phenomenon, the diauxic shift in yeast.
Affiliation(s)
- Daniel Trejo Banos
- UMR 8030 Génomique Métabolique / Laboratoire iSSB CEA-CNRS-UEVE, Genopole campus 1, 5 rue Henri Desbruères, Cedex Évry, 91030, France
- Pauline Trébulle
- UMR 8030 Génomique Métabolique / Laboratoire iSSB CEA-CNRS-UEVE, Genopole campus 1, 5 rue Henri Desbruères, Cedex Évry, 91030, France; Micalis Institute, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, 78350, France
- Mohamed Elati
- UMR 8030 Génomique Métabolique / Laboratoire iSSB CEA-CNRS-UEVE, Genopole campus 1, 5 rue Henri Desbruères, Cedex Évry, 91030, France; Université Lille, CNRS, Centrale Lille, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Lille, F-59000, France
9
Inference and interrogation of a coregulatory network in the context of lipid accumulation in Yarrowia lipolytica. NPJ Syst Biol Appl 2017; 3:21. [PMID: 28955503 PMCID: PMC5554221 DOI: 10.1038/s41540-017-0024-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 03/17/2017] [Revised: 07/07/2017] [Accepted: 07/13/2017] [Indexed: 12/14/2022]
Abstract
Complex phenotypes, such as lipid accumulation, result from cooperativity between regulators and the integration of multiscale information. However, the elucidation of such regulatory programs by experimental approaches may be challenging, particularly in context-specific conditions. In particular, we know very little about the regulators of lipid accumulation in the oleaginous yeast of industrial interest Yarrowia lipolytica. This lack of knowledge limits the development of this yeast as an industrial platform, due to the time-consuming and costly laboratory efforts required to design strains with the desired phenotypes. In this study, we aimed to identify context-specific regulators and mechanisms, to guide explorations of the regulation of lipid accumulation in Y. lipolytica. Using gene regulatory network inference, and considering the expression of 6539 genes over 26 time points from GSE35447 for biolipid production and a list of 151 transcription factors, we reconstructed a gene regulatory network comprising 111 transcription factors, 4451 target genes and 17048 regulatory interactions (YL-GRN-1) supported by evidence of protein-protein interactions. This study, based on network interrogation and wet laboratory validation (a) highlights the relevance of our proposed measure, the transcription factors influence, for identifying phases corresponding to changes in physiological state without prior knowledge (b) suggests new potential regulators and drivers of lipid accumulation and
10
Pilarczyk K, Daly B, Podborska A, Kwolek P, Silverson VA, de Silva AP, Szaciłowski K. Coordination chemistry for information acquisition and processing. Coord Chem Rev 2016. [DOI: 10.1016/j.ccr.2016.04.012] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 01/07/2023]
11
Wu WS, Lai FJ. Detecting Cooperativity between Transcription Factors Based on Functional Coherence and Similarity of Their Target Gene Sets. PLoS One 2016; 11:e0162931. [PMID: 27623007 PMCID: PMC5021274 DOI: 10.1371/journal.pone.0162931] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/08/2016] [Accepted: 08/30/2016] [Indexed: 11/22/2022]
Abstract
In eukaryotic cells, transcriptional regulation of gene expression is usually achieved by cooperative transcription factors (TFs). Therefore, knowing cooperative TFs is the first step toward uncovering the molecular mechanisms of gene expression regulation. Many algorithms based on different rationales have been proposed to predict cooperative TF pairs in yeast. Although various types of rationales have been used in the existing algorithms, functional coherence is not yet used. This prompts us to develop a new algorithm based on functional coherence and similarity of the target gene sets to identify cooperative TF pairs in yeast. The proposed algorithm predicted 40 cooperative TF pairs. Among them, three (Pdc2-Thi2, Hot1-Msn1 and Leu3-Met28) are novel predictions, which have not been predicted by any existing algorithms. Strikingly, two (Pdc2-Thi2 and Hot1-Msn1) of the three novel predictions have been experimentally validated, demonstrating the power of the proposed algorithm. Moreover, we show that the predictions of the proposed algorithm are more biologically meaningful than the predictions of 17 existing algorithms under four evaluation indices. In summary, our study suggests that new algorithms based on novel rationales are worthy of developing for detecting previously unidentifiable cooperative TF pairs.
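One ingredient named above, the similarity of two TFs' target gene sets, can be illustrated with a simple overlap score. The abstract does not say which similarity measure the authors use, so the Jaccard index below is an assumption chosen for illustration, and the target sets are placeholder gene names rather than real data.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two gene sets: |A & B| / |A | B| (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical target sets with placeholder gene names.
targets = {
    "Pdc2": {"g1", "g2", "g3", "g4"},
    "Thi2": {"g2", "g3", "g4", "g5"},
    "Hot1": {"g6", "g7"},
}

# Score every TF pair and rank by decreasing target-set similarity.
scores = {
    (tf1, tf2): jaccard(targets[tf1], targets[tf2])
    for tf1, tf2 in combinations(sorted(targets), 2)
}
ranked = sorted(scores.items(), key=lambda kv: -kv[1])
print(ranked[0])
```

A score like this captures only target overlap; the functional-coherence component described in the abstract would need annotation data and is not modelled here.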
Affiliation(s)
- Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Fu-Jou Lai
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
12
Wu WS, Hsieh YC, Lai FJ. YCRD: Yeast Combinatorial Regulation Database. PLoS One 2016; 11:e0159213. [PMID: 27392072 PMCID: PMC4938206 DOI: 10.1371/journal.pone.0159213] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/12/2016] [Accepted: 06/28/2016] [Indexed: 12/21/2022]
Abstract
In eukaryotes, the precise transcriptional control of gene expression is typically achieved through combinatorial regulation using cooperative transcription factors (TFs). Therefore, a database which provides regulatory associations between cooperative TFs and their target genes is helpful for biologists to study the molecular mechanisms of transcriptional regulation of gene expression. Because no such database exists in the public domain, we constructed the Yeast Combinatorial Regulation Database (YCRD), which holds 434,197 regulatory associations between 2535 cooperative TF pairs and 6243 genes. The comprehensive collection of more than 2500 cooperative TF pairs was retrieved from 17 existing algorithms in the literature. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where a TF’s experimentally validated target genes were downloaded from the YEASTRACT database. In YCRD, users can (i) search the target genes of a cooperative TF pair of interest, (ii) search the cooperative TF pairs which regulate a gene of interest and (iii) identify important cooperative TF pairs which regulate a given set of genes. We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression. YCRD is available at http://cosbi.ee.ncku.edu.tw/YCRD/ or http://cosbi2.ee.ncku.edu.tw/YCRD/.
Affiliation(s)
- Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Yen-Chen Hsieh
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
- Fu-Jou Lai
- Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
13
Wu WS, Lai FJ, Tu BW, Chang DTH. CoopTFD: a repository for predicted yeast cooperative transcription factor pairs. Database (Oxford) 2016; 2016:baw092. [PMID: 27242036 PMCID: PMC4885606 DOI: 10.1093/database/baw092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/30/2016] [Accepted: 05/09/2016] [Indexed: 01/22/2023]
Abstract
In eukaryotic cells, transcriptional regulation of gene expression is usually accomplished by cooperative transcription factors (TFs). Therefore, knowing cooperative TFs is helpful for uncovering the mechanisms of transcriptional regulation. In yeast, many cooperative TF pairs have been predicted by various algorithms in the literature. However, until now, no database has collected the predicted yeast cooperative TFs from existing algorithms. This prompted us to construct the Cooperative Transcription Factors Database (CoopTFD), which has a comprehensive collection of 2622 predicted cooperative TF pairs (PCTFPs) in yeast from 17 existing algorithms. For each PCTFP, our database also provides five types of validation information: (i) the algorithms which predict this PCTFP, (ii) the publications which experimentally show that this PCTFP has physical or genetic interactions, (iii) the publications which experimentally study the biological roles of both TFs of this PCTFP, (iv) the common Gene Ontology (GO) terms of this PCTFP and (v) the common target genes of this PCTFP. Based on the provided validation information, users can judge the biological plausibility of a PCTFP of interest. We believe that CoopTFD will be a valuable resource for yeast biologists to study the combinatorial regulation of gene expression controlled by cooperative TFs. Database URL: http://cosbi.ee.ncku.edu.tw/CoopTFD/ or http://cosbi2.ee.ncku.edu.tw/CoopTFD/
Affiliation(s)
- Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Fu-Jou Lai
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Bor-Wen Tu
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
- Darby Tien-Hao Chang
- Department of Electrical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
14
Schönbach C, Horton P, Yiu SM, Tan TW, Ranganathan S. GIW and InCoB are advancing bioinformatics in the Asia-Pacific. BMC Bioinformatics 2015; 16:I1. [PMID: 28102114 PMCID: PMC6389036 DOI: 10.1186/1471-2105-16-s18-i1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
GIW/InCoB2015, the joint 26th International Conference on Genome Informatics (GIW) and 14th International Conference on Bioinformatics (InCoB), held in Tokyo on September 9-11, 2015, was attended by over 200 delegates. Fifty-one out of 89 oral presentations were based on research articles accepted for publication in four BMC journal supplements and three other journals. Sixteen articles in this supplement and six articles in the BMC Systems Biology GIW/InCoB2015 Supplement are covered by this introduction. The topics range from genome informatics, protein structure informatics and image analysis to biological networks and biomarker discovery.
Affiliation(s)
- Christian Schönbach
- Department of Biology, School of Science and Technology, Nazarbayev University, Astana, 010000 Republic of Kazakhstan
- Center for AIDS Research and International Research Center for Medical Sciences, Kumamoto University, Kumamoto, 860-0811 Japan
- Paul Horton
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, 135-0064 Japan
- Department of Computational Biology, Graduate School of Frontier Sciences, University of Tokyo, Japan
- Siu-Ming Yiu
- Department of Computer Science, Faculty of Engineering, The University of Hong Kong, Hong Kong, HKSAR
- Tin Wee Tan
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117599
- Shoba Ranganathan
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117599
- Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, NSW 2109 Australia
15
Lai FJ, Chang HT, Wu WS. PCTFPeval: a web tool for benchmarking newly developed algorithms for predicting cooperative transcription factor pairs in yeast. BMC Bioinformatics 2015; 16 Suppl 18:S2. [PMID: 26677932 PMCID: PMC4682397 DOI: 10.1186/1471-2105-16-s18-s2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]
Abstract
Background: Computational identification of cooperative transcription factor (TF) pairs helps understand the combinatorial regulation of gene expression in eukaryotic cells. Many advanced algorithms have been proposed to predict cooperative TF pairs in yeast. However, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of the lack of sufficient performance indices and adequate overall performance scores. To solve this problem, in our previous study (published in BMC Systems Biology 2014), we adopted/proposed eight performance indices and designed two overall performance scores to compare the performance of 14 existing algorithms for predicting cooperative TF pairs in yeast. Most importantly, our performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. However, to use our framework, researchers have to put considerable effort into constructing it first. To save researchers time and effort, here we develop a web tool to implement our performance comparison framework, featuring fast data processing, a comprehensive performance comparison and an easy-to-use web interface. Results: The developed tool is called PCTFPeval (Predicted Cooperative TF Pair evaluator), written in the PHP and Python programming languages. The friendly web interface allows users to input a list of predicted cooperative TF pairs from their algorithm and select (i) the compared algorithms among the 15 existing algorithms, (ii) the performance indices among the eight existing indices, and (iii) the overall performance scores from two possible choices. The comprehensive performance comparison results are then generated in tens of seconds and shown as both bar charts and tables. The original comparison results of each compared algorithm and each selected performance index can be downloaded as text files for further analyses. Conclusions: Allowing users to select eight existing performance indices and 15 existing algorithms for comparison, our web tool benefits researchers who are eager to comprehensively and objectively evaluate the performance of their newly developed algorithm. Thus, our tool greatly expedites the progress in the research of computational identification of cooperative TF pairs.
|
16
|
Picchetti T, Chiquet J, Elati M, Neuvial P, Nicolle R, Birmelé E. A model for gene deregulation detection using expression data. BMC Syst Biol 2015; 9 Suppl 6:S6. [PMID: 26679516 PMCID: PMC4674863 DOI: 10.1186/1752-0509-9-s6-s6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/23/2023]
Abstract
In tumoral cells, gene regulation mechanisms are severely altered. Genes that do not react normally to their regulators' activity can provide explanations for the tumoral behavior, and be characteristic of cancer subtypes. We thus propose a statistical methodology to identify the misregulated genes given a reference network and gene expression data. Our model is based on a regulatory process in which all genes are allowed to be deregulated. We derive an EM algorithm where the hidden variables correspond to the status (under/over/normally expressed) of the genes and where the E-step is solved thanks to a message passing algorithm. Our procedure provides posterior probabilities of deregulation in a given sample for each gene. We assess the performance of our method by numerical experiments on simulations and on a bladder cancer data set.
|
17
|
Wu WS, Lai FJ. Properly defining the targets of a transcription factor significantly improves the computational identification of cooperative transcription factor pairs in yeast. BMC Genomics 2015; 16 Suppl 12:S10. [PMID: 26679776 PMCID: PMC4682405 DOI: 10.1186/1471-2164-16-s12-s10] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022] Open
Abstract
Background Transcriptional regulation of gene expression in eukaryotes is usually accomplished by cooperative transcription factors (TFs). Computational identification of cooperative TF pairs has become a hot research topic and many algorithms have been proposed in the literature. A typical algorithm for predicting cooperative TF pairs has two steps. (Step 1) Define the targets of each TF under study. (Step 2) Design a measure for calculating the cooperativity of a TF pair based on the targets of these two TFs. While different algorithms have distinct sophisticated cooperativity measures, the targets of a TF are usually defined using ChIP-chip data. However, there is an inherent weakness in using ChIP-chip data to define the targets of a TF. ChIP-chip analysis can only identify the binding targets of a TF; it cannot distinguish the true regulatory targets from the binding but non-regulatory targets of a TF. Results This work is the first study to investigate whether the performance of computational identification of cooperative TF pairs could be improved by using a more biologically relevant way to define the targets of a TF. For this purpose, we propose four simple algorithms, all of which consist of two steps. (Step 1) Define the targets of a TF using (i) ChIP-chip data in the first algorithm, (ii) TF binding data in the second algorithm, (iii) TF perturbation data in the third algorithm, and (iv) the intersection of TF binding and TF perturbation data in the fourth algorithm. Compared with the first three algorithms, the fourth algorithm uses a more biologically relevant way to define the targets of a TF. (Step 2) Measure the cooperativity of a TF pair by the statistical significance of the overlap of the targets of these two TFs using the hypergeometric test. By adopting four existing performance indices, we show that the fourth proposed algorithm (PA4) significantly outperforms the other three proposed algorithms.
This suggests that the computational identification of cooperative TF pairs is indeed improved when using a more biologically relevant way to define the targets of a TF. Strikingly, the prediction results of our simple PA4 are more biologically meaningful than those of the 12 existing sophisticated algorithms in the literature, all of which used ChIP-chip data to define the targets of a TF. This suggests that properly defining the targets of a TF may be more important than designing sophisticated cooperativity measures. In addition, our PA4 has the power to predict several experimentally validated cooperative TF pairs, which have not been successfully predicted by any existing algorithms in the literature. Conclusions This study shows that the performance of computational identification of cooperative TF pairs could be improved by using a more biologically relevant way to define the targets of a TF. The main contribution of this study is not to propose another new algorithm but to offer a new way of thinking about the computational identification of cooperative TF pairs. Researchers should put more effort into properly defining the targets of a TF (i.e. Step 1) rather than focusing entirely on designing sophisticated cooperativity measures (i.e. Step 2). The lists of TF target genes, the Matlab codes and the prediction results of the four proposed algorithms can be downloaded from our companion website http://cosbi3.ee.ncku.edu.tw/TFI/
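The two steps described in this abstract (targets defined as the intersection of binding and perturbation data, then a hypergeometric test on the target overlap of a TF pair) can be illustrated with a minimal Python sketch. This is not the authors' Matlab code: the function names `regulatory_targets` and `overlap_pvalue`, the gene labels, and the genome size of 10 are all hypothetical and chosen only for illustration.

```python
from math import comb


def regulatory_targets(binding_targets, perturbation_targets):
    """Step 1 (PA4-style, hypothetical helper): keep only genes supported by
    both TF binding evidence and TF perturbation evidence."""
    return set(binding_targets) & set(perturbation_targets)


def overlap_pvalue(n_genes, targets_a, targets_b):
    """Step 2: upper-tail hypergeometric p-value P(X >= k), where k is the
    observed number of shared targets and n_genes is the assumed size of
    the gene universe."""
    a, b = set(targets_a), set(targets_b)
    k = len(a & b)            # observed overlap
    m, n = len(a), len(b)     # target counts of the two TFs
    # Sum the hypergeometric probabilities for overlaps of size k or larger.
    tail = sum(comb(m, i) * comb(n_genes - m, n - i)
               for i in range(k, min(m, n) + 1))
    return tail / comb(n_genes, n)


# Toy data (invented): two TFs in a universe of 10 genes.
tf1 = regulatory_targets({"g1", "g2", "g3", "g4"}, {"g1", "g2", "g3", "g9"})
tf2 = regulatory_targets({"g1", "g2", "g3", "g7"}, {"g1", "g2", "g3", "g8"})
p_value = overlap_pvalue(10, tf1, tf2)
print(f"shared targets: {sorted(tf1 & tf2)}, p = {p_value:.4f}")
```

A small p-value indicates that the two TFs share more targets than expected by chance, which is the sense in which the abstract uses overlap significance as a cooperativity measure. In a real analysis the gene universe would be the full set of yeast genes, and the p-values would need correction for testing many TF pairs.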
|
18
|
Nicolle R, Radvanyi F, Elati M. CoRegNet: reconstruction and integrated analysis of co-regulatory networks. Bioinformatics 2015; 31:3066-8. [PMID: 25979476 PMCID: PMC4565029 DOI: 10.1093/bioinformatics/btv305] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 11/21/2014] [Accepted: 05/08/2015] [Indexed: 11/13/2022] Open
Abstract
CoRegNet is an R/Bioconductor package to analyze large-scale transcriptomic data by highlighting sets of co-regulators. Based on a transcriptomic dataset, CoRegNet can be used to: reconstruct a large-scale co-regulatory network, integrate regulatory evidence such as transcription factor binding sites and ChIP data, estimate sample-specific regulator activity, identify cooperative transcription factors and analyze the sample-specific combinations of active regulators through an interactive visualization tool. In this study CoRegNet was used to identify driver regulators of bladder cancer. AVAILABILITY CoRegNet is available at http://bioconductor.org/packages/CoRegNet. CONTACT remy.nicolle@issb.genopole.fr or mohamed.elati@issb.genopole.fr. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rémy Nicolle
- iSSB, CNRS, University of Evry, Genopole, 91030 Evry Cedex, France, Institut Curie, PSL Research University, 75248 Cedex 05, France and CNRS UMR144, 75248 Cedex 05, France
- François Radvanyi
- iSSB, CNRS, University of Evry, Genopole, 91030 Evry Cedex, France, Institut Curie, PSL Research University, 75248 Cedex 05, France and CNRS UMR144, 75248 Cedex 05, France
- Mohamed Elati
- iSSB, CNRS, University of Evry, Genopole, 91030 Evry Cedex, France, Institut Curie, PSL Research University, 75248 Cedex 05, France and CNRS UMR144, 75248 Cedex 05, France
|
19
|
Lai FJ, Jhu MH, Chiu CC, Huang YM, Wu WS. Identifying cooperative transcription factors in yeast using multiple data sources. BMC Syst Biol 2014; 8 Suppl 5:S2. [PMID: 25559499 PMCID: PMC4305981 DOI: 10.1186/1752-0509-8-s5-s2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
Abstract
BACKGROUND Transcriptional regulation of gene expression is usually accomplished by multiple interactive transcription factors (TFs). Therefore, it is crucial to understand the precise cooperative interactions among TFs. Various kinds of experimental data including ChIP-chip, TF binding site (TFBS), gene expression, TF knockout and protein-protein interaction data have been used to identify cooperative TF pairs in existing methods. Nucleosome occupancy data have not yet been used for this research topic, even though several studies have revealed an association between nucleosomes and TFBSs. RESULTS In this study, we developed a novel method to infer the cooperativity between two TFs by integrating the TF-gene documented regulation, TFBS and nucleosome occupancy data. TF-gene documented regulation and TFBS data were used to determine the target genes of a TF, and the genome-wide nucleosome occupancy data were used to assess the nucleosome occupancy on TFBSs. Our method identifies cooperative TF pairs based on two biologically plausible assumptions. If two TFs cooperate, then (i) they should have a significantly higher number of common target genes than random expectation and (ii) their binding sites (in the promoters of their common target genes) should tend to be co-depleted of nucleosomes in order to make these binding sites simultaneously accessible to TF binding. Each TF pair is given a cooperativity score by our method. The higher the score, the more likely the TF pair is to be cooperative. Finally, a list of 27 cooperative TF pairs has been predicted by our method. Among these 27 TF pairs, 19 pairs are also predicted by existing methods. The other 8 pairs are novel cooperative TF pairs predicted by our method. The biological relevance of these 8 novel cooperative TF pairs is justified by the existence of protein-protein interactions and co-annotation in the same MIPS functional categories.
Moreover, we adopted three performance indices to compare our predictions with 11 existing methods' predictions. We show that our method performs better than these 11 existing methods in identifying cooperative TF pairs in yeast. Finally, the cooperative TF network constructed from the 27 predicted cooperative TF pairs shows that our method has the power to find cooperative TF pairs of different biological processes. CONCLUSION Our method is effective in identifying cooperative TF pairs in yeast. Many of our predictions are validated by the literature, and our method outperforms 11 existing methods. We believe that our study will help biologists to understand the mechanisms of transcriptional regulation in eukaryotic cells.
|
20
|
Lai FJ, Chang HT, Huang YM, Wu WS. A comprehensive performance evaluation on the prediction results of existing cooperative transcription factors identification algorithms. BMC Syst Biol 2014; 8 Suppl 4:S9. [PMID: 25521604 PMCID: PMC4290732 DOI: 10.1186/1752-0509-8-s4-s9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/18/2022]
Abstract
Background Eukaryotic transcriptional regulation is known to be highly connected through the networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. Recent advances in computational techniques have led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on different rationales, each had its own merits and claimed to outperform the others. However, such claims were prone to subjectivity because each algorithm was compared with only a few other algorithms and used only a small set of performance indices for comparison. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms. Based on the proposed performance indices, we conducted a comprehensive performance evaluation. Results We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted/proposed, the cooperativity of each PCTFP was measured and a ranking score according to the mean cooperativity of the set was given to each set of PCTFPs under evaluation for each performance index. The ranking scores of a set of PCTFPs vary with different performance indices, implying that an algorithm for predicting cooperative TF pairs may be strong on some indices but weak on others. We finally made a comprehensive ranking for these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. Conclusions In this study, we adopted/proposed eight performance indices to make a comprehensive performance evaluation on the prediction results of 14 existing cooperative TFs identification algorithms.
Most importantly, these proposed indices can be easily applied to measure the performance of new algorithms developed in the future, thus expediting progress in this research field.
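The per-index ranking scores and the comprehensive ranking described in this abstract can be sketched as follows. The abstract does not give the exact scoring formula, so this Python sketch assumes a simple scheme: within each index, the set with the highest mean cooperativity receives N points (N being the number of compared sets), the next receives N - 1, and so on, and the comprehensive ranking sums the points across indices. The function name `comprehensive_ranking`, the index and algorithm names, and all numbers are hypothetical.

```python
def comprehensive_ranking(mean_scores):
    """mean_scores maps index name -> {algorithm: mean cooperativity of its
    predicted TF pairs under that index}. Returns (algorithm, total ranking
    score) pairs, best first. Ties are not handled specially (assumption)."""
    totals = {}
    for per_algorithm in mean_scores.values():
        # Order algorithms from best to worst on this performance index.
        ordered = sorted(per_algorithm, key=per_algorithm.get, reverse=True)
        n = len(ordered)
        for position, algorithm in enumerate(ordered):
            # Best gets n points, second best n - 1, and so on.
            totals[algorithm] = totals.get(algorithm, 0) + (n - position)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented mean cooperativity values for three algorithms on two indices.
example = {
    "index_1": {"A": 0.9, "B": 0.5, "C": 0.1},
    "index_2": {"A": 0.2, "B": 0.8, "C": 0.5},
}
ranking = comprehensive_ranking(example)
print(ranking)
```

In this toy example algorithm A is best on one index but worst on the other, which mirrors the abstract's observation that ranking scores vary with the performance index and motivates aggregating over several indices.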
|
21
|
Chebil I, Nicolle R, Santini G, Rouveirol C, Elati M. Hybrid Method Inference for the Construction of Cooperative Regulatory Network in Human. IEEE Trans Nanobioscience 2014; 13:97-103. [DOI: 10.1109/tnb.2014.2316920] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/07/2022]
|
22
|
Report on the "Mathematics and Cancer Biology" day, Institut Curie, Paris, 12 June 2013 (article in French). Bull Cancer 2013. [DOI: 10.1684/bdc.2013.1813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
|
23
|
Elati M, Nicolle R, Junier I, Fernández D, Fekih R, Font J, Képès F. PreCisIon: PREdiction of CIS-regulatory elements improved by gene's positION. Nucleic Acids Res 2012; 41:1406-15. [PMID: 23241390 PMCID: PMC3561985 DOI: 10.1093/nar/gks1286] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/28/2023] Open
Abstract
Conventional approaches to predict transcriptional regulatory interactions usually rely on the definition of a shared motif sequence on the target genes of a transcription factor (TF). These efforts have been frustrated by the limited availability and accuracy of TF binding site motifs, usually represented as position-specific scoring matrices, which may match large numbers of sites and produce an unreliable list of target genes. To improve the prediction of binding sites, we propose to additionally use the unrelated knowledge of the genome layout. Indeed, it has been shown that co-regulated genes tend to be either neighbors or periodically spaced along the whole chromosome. This study demonstrates that respective gene positioning carries significant information. This novel type of information is combined with traditional sequence information by a machine learning algorithm called PreCisIon. To optimize this combination, PreCisIon builds a strong gene target classifier by adaptively combining weak classifiers based on either local binding sequence or global gene position. This strategy generically paves the way to the optimized incorporation of any future advances in gene target prediction based on local sequence, genome layout or on novel criteria. With the current state of the art, PreCisIon consistently improves methods based on sequence information only. This is shown by implementing a cross-validation analysis of the 20 major TFs from two phylogenetically remote model organisms. For Bacillus subtilis and Escherichia coli, respectively, PreCisIon achieves on average an area under the receiver operating characteristic curve of 70 and 60%, a sensitivity of 80 and 70% and a specificity of 60 and 56%. The newly predicted gene targets are demonstrated to be functionally consistent with previously known targets, as assessed by analysis of Gene Ontology enrichment or of the relevant literature and databases.
Affiliation(s)
- Mohamed Elati
- Institute of Systems and Synthetic Biology, CNRS, University of Evry, Genopole, 91030 Evry, France.
|
24
|
van Hijum SAFT, Medema MH, Kuipers OP. Mechanisms and evolution of control logic in prokaryotic transcriptional regulation. Microbiol Mol Biol Rev 2009; 73:481-509, Table of Contents. [PMID: 19721087 PMCID: PMC2738135 DOI: 10.1128/mmbr.00037-08] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Indexed: 11/20/2022] Open
Abstract
A major part of organismal complexity and versatility of prokaryotes resides in their ability to fine-tune gene expression to adequately respond to internal and external stimuli. Evolution has been very innovative in creating intricate mechanisms by which different regulatory signals operate and interact at promoters to drive gene expression. The regulation of target gene expression by transcription factors (TFs) is governed by control logic brought about by the interaction of regulators with TF binding sites (TFBSs) in cis-regulatory regions. A factor that in large part determines the strength of the response of a target to a given TF is motif stringency, the extent to which the TFBS fits the optimal TFBS sequence for a given TF. Advances in high-throughput technologies and computational genomics allow reconstruction of transcriptional regulatory networks in silico. To optimize the prediction of transcriptional regulatory networks, i.e., to separate direct regulation from indirect regulation, a thorough understanding of the control logic underlying the regulation of gene expression is required. This review summarizes the state of the art of the elements that determine the functionality of TFBSs by focusing on the molecular biological mechanisms and evolutionary origins of cis-regulatory regions.
Affiliation(s)
- Sacha A F T van Hijum
- Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Kerklaan 30, 9751 NN Haren, The Netherlands.
|
25
|
Lee WP, Tzou WS. Computational methods for discovering gene networks from expression data. Brief Bioinform 2009; 10:408-23. [PMID: 19505889 DOI: 10.1093/bib/bbp028] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.8] [Indexed: 11/12/2022] Open
Abstract
Designing and conducting experiments are routine practices for modern biologists. The real challenge, especially in the post-genome era, usually comes not from acquiring data, but from subsequent activities such as data processing, analysis, knowledge generation and gaining insight into the research question of interest. The approach of inferring gene regulatory networks (GRNs) has been flourishing for many years, and new methods from mathematics, information science, engineering and social sciences have been applied. We review different kinds of computational methods biologists use to infer networks of varying levels of accuracy and complexity. The primary concern of biologists is how to translate the inferred network into hypotheses that can be tested with real-life experiments. Taking the biologists' viewpoint, we scrutinized several methods for predicting GRNs in mammalian cells, and more importantly show how the power of different knowledge databases of different types can be used to identify modules and subnetworks, thereby reducing complexity and facilitating the generation of testable hypotheses.
Affiliation(s)
- Wei-Po Lee
- Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan.
|
26
|
Birmelé E, Elati M, Rouveirol C, Ambroise C. Identification of functional modules based on transcriptional regulation structure. BMC Proc 2008; 2 Suppl 4:S4. [PMID: 19091051 PMCID: PMC2654972 DOI: 10.1186/1753-6561-2-s4-s4] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022] Open
Abstract
Background Identifying gene functional modules is an important step towards elucidating gene functions at a global scale. Clustering algorithms mostly rely on co-expression of genes, that is, they group together genes having similar expression profiles. Results We propose to cluster genes by co-regulation rather than by co-expression. We therefore present an inference algorithm for detecting co-regulated groups from gene expression data and introduce a method to cluster genes given that inferred regulatory structure. Finally, we propose to validate the clustering through a score based on the GO enrichment of the obtained groups of genes. Conclusion We evaluate the methods on S. cerevisiae stress response data and obtain better scores than with clustering obtained directly from gene expression.
Affiliation(s)
- Etienne Birmelé
- Laboratoire Statistique et Génome, UMR CNRS 8071, INRA 1152, Tour Evry 2, F-91000 Evry, France.
|