101
|
Zamal FA, Ruths D. On the contributions of topological features to transcriptional regulatory network robustness. BMC Bioinformatics 2012. [PMID: 23194062 PMCID: PMC3541983 DOI: 10.1186/1471-2105-13-318] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
Abstract
BACKGROUND Because biological networks exhibit a high degree of robustness, a systemic understanding of their architecture and function requires an appraisal of the network design principles that confer robustness. In this project, we conduct a computational study of the contribution of three degree-based topological properties (transcription factor-target ratio, degree distribution, cross-talk suppression) and their combinations to the robustness of transcriptional regulatory networks. We seek to quantify the relative degree of robustness conferred by each property (and combination) and also to determine the extent to which these properties alone can explain the robustness observed in transcriptional networks. RESULTS To study individual properties and their combinations, we generated synthetic, random networks that retained one or more of the three properties with values derived from either the yeast or E. coli gene regulatory networks. Robustness of these networks was estimated through simulation. Our results indicate that the combination of the three properties we considered explains the majority of the structural robustness observed in the real transcriptional networks. Surprisingly, scale-free degree distribution is, overall, a minor contributor to robustness. Instead, most robustness is gained through topological features that limit the complexity of the overall network and increase the transcription factor subnetwork sparsity. CONCLUSIONS Our work demonstrates that (i) different types of robustness are implemented by different topological aspects of the network and (ii) size and sparsity of the transcription factor subnetwork play an important role for robustness induction. Our results are conserved across yeast and E. coli, which suggests that the design principles examined are present within an array of living systems.
Affiliation(s)
- Faiyaz Al Zamal
- School of Computer Science, McGill University, Montreal, Canada.
|
102
|
Saadatpour A, Albert R. Boolean modeling of biological regulatory networks: a methodology tutorial. Methods 2012; 62:3-12. [PMID: 23142247 DOI: 10.1016/j.ymeth.2012.10.012] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.3] [Received: 01/26/2012] [Accepted: 10/31/2012] [Indexed: 12/14/2022]
Abstract
Given the complexity and interactive nature of biological systems, constructing informative and coherent network models of these systems and subsequently developing efficient approaches to analyze the assembled networks is of immense importance. The integration of network analysis and dynamic modeling enables one to investigate the behavior of the underlying system as a whole and to make experimentally testable predictions about less-understood aspects of the processes involved. In this paper, we present a tutorial on the fundamental steps of Boolean modeling of biological regulatory networks. We demonstrate how to infer a Boolean network model from the available experimental data, analyze the network using graph-theoretical measures, and convert it into a predictive dynamic model. For each step, the pitfalls one may encounter and possible ways to circumvent them are also discussed. We illustrate these steps on a toy network as well as in the context of the Drosophila melanogaster segment polarity gene network.
Affiliation(s)
- Assieh Saadatpour
- Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA
|
103
|
Graham JH, Robb DT, Poe AR. Random phenotypic variation of yeast (Saccharomyces cerevisiae) single-gene knockouts fits a double pareto-lognormal distribution. PLoS One 2012; 7:e48964. [PMID: 23139826 PMCID: PMC3490920 DOI: 10.1371/journal.pone.0048964] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/22/2012] [Accepted: 10/08/2012] [Indexed: 11/19/2022]
Abstract
BACKGROUND Distributed robustness is thought to influence the buffering of random phenotypic variation through the scale-free topology of gene regulatory, metabolic, and protein-protein interaction networks. If this hypothesis is true, then the phenotypic response to the perturbation of particular nodes in such a network should be proportional to the number of links those nodes make with neighboring nodes. This suggests a probability distribution approximating an inverse power-law of random phenotypic variation. Zero phenotypic variation, however, is impossible, because random molecular and cellular processes are essential to normal development. Consequently, a more realistic distribution should have a y-intercept close to zero in the lower tail, a mode greater than zero, and a long (fat) upper tail. The double Pareto-lognormal (DPLN) distribution is an ideal candidate distribution. It consists of a mixture of a lognormal body and upper and lower power-law tails. OBJECTIVE AND METHODS If our assumptions are true, the DPLN distribution should provide a better fit to random phenotypic variation in a large series of single-gene knockout lines than other skewed or symmetrical distributions. We fit a large published data set of single-gene knockout lines in Saccharomyces cerevisiae to seven different probability distributions: DPLN, right Pareto-lognormal (RPLN), left Pareto-lognormal (LPLN), normal, lognormal, exponential, and Pareto. The best model was judged by the Akaike Information Criterion (AIC). RESULTS Phenotypic variation among gene knockouts in S. cerevisiae fits a double Pareto-lognormal (DPLN) distribution better than any of the alternative distributions, including the right Pareto-lognormal and lognormal distributions. 
CONCLUSIONS AND SIGNIFICANCE A DPLN distribution is consistent with the hypothesis that developmental stability is mediated, in part, by distributed robustness, the resilience of gene regulatory, metabolic, and protein-protein interaction networks. Alternatively, multiplicative cell growth, and the mixing of lognormal distributions having different variances, may generate a DPLN distribution.
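The model-selection step described in this abstract (fit several candidate distributions, then rank them by AIC) can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: it covers three of the simpler candidates (normal, lognormal, exponential) using their closed-form maximum-likelihood fits, and it omits the Pareto-lognormal family, which needs numerical optimization. The function names and the synthetic sample are assumptions made for the example.

```python
import math
import random


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def loglik_normal(xs):
    """Maximized log-likelihood of a normal distribution (MLE mean and variance)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def loglik_lognormal(xs):
    """Maximized log-likelihood of a lognormal: normal fit on log(x) plus Jacobian term."""
    logs = [math.log(x) for x in xs]
    return loglik_normal(logs) - sum(logs)


def loglik_exponential(xs):
    """Maximized log-likelihood of an exponential with rate 1 / mean(xs)."""
    n = len(xs)
    mean = sum(xs) / n
    return -n * (math.log(mean) + 1)


def rank_by_aic(xs):
    """Return {model name: AIC} for the three candidate models (positive data only)."""
    return {
        "normal": aic(loglik_normal(xs), 2),
        "lognormal": aic(loglik_lognormal(xs), 2),
        "exponential": aic(loglik_exponential(xs), 1),
    }


# Hypothetical sample: synthetic positive "phenotypic variation" values.
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 0.5) for _ in range(1000)]
scores = rank_by_aic(sample)
best_model = min(scores, key=scores.get)
```

The candidate with the smallest AIC is preferred; differences between AIC values, rather than their absolute sizes, carry the information.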
Affiliation(s)
- John H Graham
- Department of Biology, Berry College, Mount Berry, Georgia, USA.
|
104
|
Construction of gene regulatory networks with colored noise. Neural Comput Appl 2012. [DOI: 10.1007/s00521-011-0584-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
105
|
Abstract
Background Cancer and other gene-related diseases are usually caused by a failure in the signaling pathway between genes and cells. These failures can occur in different areas of the gene regulatory network, but can be abstracted as faults in the regulatory function. For effective cancer treatment, it is imperative to identify faults and select appropriate drugs to treat the faults. In this paper, we present an extensible Max-SAT based automatic test pattern generation (ATPG) algorithm for cancer therapy. This ATPG algorithm is based on Boolean Satisfiability (SAT) and utilizes the stuck-at fault model for representing signaling faults. A weighted partial Max-SAT formulation is used to enable efficient selection of the most effective drug. Results Several usage cases are presented for fault identification and drug selection. These cases include the identification of testable faults, optimal drug selection for single/multiple known faults, and optimal drug selection for overall fault coverage. Experimental results on growth factor (GF) signaling pathways demonstrate that our algorithm is flexible, and can yield an exact solution for each feature in much less than 1 second.
|
106
|
Chen C, Fushing H. Multiscale community geometry in a network and its application. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics 2012; 86:041120. [PMID: 23214542 DOI: 10.1103/physreve.86.041120] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/26/2012] [Revised: 08/17/2012] [Indexed: 06/01/2023]
Abstract
We introduce a betweenness-based distance metric to extract local and global information for each pair of nodes (or "vertices," used interchangeably) located in a binary network. Since this distance then superimposes a weighted graph upon such a binary network, a multiscale clustering mechanism, called data cloud geometry, is applicable to discover hierarchical communities within a binary network. This approach resolves many shortcomings of community finding approaches, which are primarily based on modularity optimization. Using several contrived and real binary networks, we show that our community hierarchies compare favorably with results derived from a recently proposed approach that is based on time-scale differences of random walks and has already demonstrated significant improvements over module-based approaches, especially in multiscale structure and in the determination of the number of communities.
Affiliation(s)
- Chen Chen
- University of California, Davis, California 95616, USA
|
107
|
Liang J, Han J. Stochastic Boolean networks: an efficient approach to modeling gene regulatory networks. BMC Systems Biology 2012; 6:113. [PMID: 22929591 PMCID: PMC3532238 DOI: 10.1186/1752-0509-6-113] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 05/14/2012] [Accepted: 08/06/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Various computational models have been of interest due to their use in the modelling of gene regulatory networks (GRNs). As a logical model, probabilistic Boolean networks (PBNs) consider molecular and genetic noise, so the study of PBNs provides significant insights into the understanding of the dynamics of GRNs. This will ultimately lead to advances in developing therapeutic methods that intervene in the process of disease development and progression. The applications of PBNs, however, are hindered by the complexities involved in the computation of the state transition matrix and the steady-state distribution of a PBN. For a PBN with n genes and N Boolean networks, the complexity to compute the state transition matrix is $O(nN2^{2n})$, or $O(nN2^{n})$ for a sparse matrix. RESULTS This paper presents a novel implementation of PBNs based on the notions of stochastic logic and stochastic computation. This stochastic implementation of a PBN is referred to as a stochastic Boolean network (SBN). An SBN provides an accurate and efficient simulation of a PBN without and with random gene perturbation. The state transition matrix is computed in an SBN with a complexity of $O(nL2^{n})$, where L is a factor related to the stochastic sequence length. Since the minimum sequence length required for obtaining an evaluation accuracy approximately increases in a polynomial order with the number of genes, n, and the number of Boolean networks, N, usually increases exponentially with n, L is typically smaller than N, especially in a network with a large number of genes. Hence, the computational efficiency of an SBN is primarily limited by the number of genes, but not directly by the total possible number of Boolean networks. Furthermore, a time-frame expanded SBN enables an efficient analysis of the steady-state distribution of a PBN.
These findings are supported by the simulation results of a simplified p53 network, several randomly generated networks and a network inferred from a T cell immune response dataset. An SBN can also implement the function of an asynchronous PBN and is potentially useful in a hybrid approach in combination with a continuous or single-molecule level stochastic model. CONCLUSIONS Stochastic Boolean networks (SBNs) are proposed as an efficient approach to modelling gene regulatory networks (GRNs). The SBN approach is able to recover biologically-proven regulatory behaviours, such as the oscillatory dynamics of the p53-Mdm2 network and the dynamic attractors in a T cell immune response network. The proposed approach can further predict the network dynamics when the genes are under perturbation, thus providing biologically meaningful insights for a better understanding of the dynamics of GRNs. The algorithms and methods described in this paper have been implemented in Matlab packages, which are attached as Additional files.
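To make the cost that the abstract refers to concrete, the following is a minimal sketch of the direct (non-stochastic) construction of a PBN state transition matrix: every one of the 2^n states is pushed through each of the N constituent Boolean networks, and the network's selection probability is added to the matching matrix entry. It is not the SBN method of the paper. The function name, the representation of a network as a list of Python callables, and the two-gene example are all assumptions made for illustration.

```python
from itertools import product


def pbn_transition_matrix(networks, probs, n):
    """Build the 2^n x 2^n state transition matrix of a PBN by direct enumeration.

    networks: list of N constituent Boolean networks; each is a list of n
              functions mapping the current state tuple to 0 or 1.
    probs:    selection probability of each constituent network (sums to 1).
    """
    states = list(product((0, 1), repeat=n))
    index = {state: i for i, state in enumerate(states)}
    size = len(states)
    matrix = [[0.0] * size for _ in range(size)]
    for state in states:                            # 2^n states
        for funcs, p in zip(networks, probs):       # N networks
            nxt = tuple(int(f(state)) for f in funcs)  # n gene updates
            matrix[index[state]][index[nxt]] += p
    return states, matrix


# Hypothetical two-gene PBN with two equally likely constituent networks.
swap_net = [lambda s: s[1], lambda s: s[0]]
logic_net = [lambda s: s[0] and s[1], lambda s: s[0] or s[1]]
states, A = pbn_transition_matrix([swap_net, logic_net], [0.5, 0.5], n=2)
```

The nested loops perform n function evaluations for each of the N networks and each of the 2^n states, which is where the exponential dependence on the number of genes comes from; the dense list-of-lists used here additionally stores all 2^n x 2^n entries.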
Affiliation(s)
- Jinghang Liang
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada.
|
108
|
Lo K, Raftery AE, Dombek KM, Zhu J, Schadt EE, Bumgarner RE, Yeung KY. Integrating external biological knowledge in the construction of regulatory networks from time-series expression data. BMC Systems Biology 2012; 6:101. [PMID: 22898396 PMCID: PMC3465231 DOI: 10.1186/1752-0509-6-101] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 01/25/2012] [Accepted: 07/24/2012] [Indexed: 01/27/2023]
Abstract
BACKGROUND Inference about regulatory networks from high-throughput genomics data is of great interest in systems biology. We present a Bayesian approach to infer gene regulatory networks from time series expression data by integrating various types of biological knowledge. RESULTS We formulate network construction as a series of variable selection problems and use linear regression to model the data. Our method summarizes additional data sources with an informative prior probability distribution over candidate regression models. We extend the Bayesian model averaging (BMA) variable selection method to select regulators in the regression framework. We summarize the external biological knowledge by an informative prior probability distribution over the candidate regression models. CONCLUSIONS We demonstrate our method on simulated data and a set of time-series microarray experiments measuring the effect of a drug perturbation on gene expression levels, and show that it outperforms leading regression-based methods in the literature.
Affiliation(s)
- Kenneth Lo
- Department of Microbiology, University of Washington, Box 358070, Seattle, WA, 98195, USA
- Adrian E Raftery
- Department of Statistics, University of Washington, Box 354320, Seattle, WA, 98195, USA
- Kenneth M Dombek
- Department of Biochemistry, University of Washington, Box 357350, Seattle, WA, 98195, USA
- Jun Zhu
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY, 10029, USA
- Eric E Schadt
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY, 10029, USA
- Roger E Bumgarner
- Department of Microbiology, University of Washington, Box 358070, Seattle, WA, 98195, USA
- Ka Yee Yeung
- Department of Microbiology, University of Washington, Box 358070, Seattle, WA, 98195, USA
|
109
|
Kosti I, Radivojac P, Mandel-Gutfreund Y. An integrated regulatory network reveals pervasive cross-regulation among transcription and splicing factors. PLoS Comput Biol 2012; 8:e1002603. [PMID: 22844237 PMCID: PMC3405991 DOI: 10.1371/journal.pcbi.1002603] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 01/23/2012] [Accepted: 05/24/2012] [Indexed: 11/19/2022]
Abstract
Traditionally the gene expression pathway has been regarded as being comprised of independent steps, from RNA transcription to protein translation. To date there is increasing evidence of coupling between the different processes of the pathway, specifically between transcription and splicing. To study the interplay between these processes we derived a transcription-splicing integrated network. The nodes of the network included experimentally verified human proteins belonging to three groups of regulators: transcription factors, splicing factors and kinases. The nodes were wired by instances of predicted transcriptional and alternative splicing regulation. Analysis of the network indicated a pervasive cross-regulation among the nodes; specifically, splicing factors are significantly more connected by alternative splicing regulatory edges relative to the two other subgroups, while transcription factors are more extensively controlled by transcriptional regulation. Furthermore, we found that splicing factors are the most regulated of the three regulatory groups and are subject to extensive combinatorial control by alternative splicing and transcriptional regulation. Consistent with the network results, our bioinformatics analyses showed that the subgroup of kinases has the highest density of predicted phosphorylation sites. Overall, our systematic study reveals that an organizing principle in the logic of integrated networks favors the regulation of regulatory proteins by the specific regulation they conduct. Based on these results, we propose a new regulatory paradigm postulating that gene expression regulation of the master regulators in the cell is predominantly achieved by cross-regulation.
Affiliation(s)
- Idit Kosti
- Faculty of Biology, Technion – Israel Institute of Technology, Haifa, Israel
- Predrag Radivojac
- School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America
- Yael Mandel-Gutfreund
- Faculty of Biology, Technion – Israel Institute of Technology, Haifa, Israel
|
110
|
Xuan NV, Chetty M, Coppel R, Wangikar PP. Gene regulatory network modeling via global optimization of high-order dynamic Bayesian network. BMC Bioinformatics 2012; 13:131. [PMID: 22694481 PMCID: PMC3433362 DOI: 10.1186/1471-2105-13-131] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Received: 10/16/2011] [Accepted: 06/13/2012] [Indexed: 11/11/2022]
Abstract
Background Dynamic Bayesian network (DBN) is among the mainstream approaches for modeling various biological networks, including the gene regulatory network (GRN). Most current methods for learning DBN employ either local search such as hill-climbing, or a meta stochastic global optimization framework such as genetic algorithm or simulated annealing, which are only able to locate sub-optimal solutions. Further, current DBN applications have essentially been limited to small sized networks. Results To overcome the above difficulties, we introduce here a deterministic global optimization based DBN approach for reverse engineering genetic networks from time course gene expression data. For such DBN models that consist only of inter time slice arcs, we show that there exists a polynomial time algorithm for learning the globally optimal network structure. The proposed approach, named GlobalMIT+, employs the recently proposed information theoretic scoring metric named mutual information test (MIT). GlobalMIT+ is able to learn high-order time delayed genetic interactions, which are common to most biological systems. Evaluation of the approach using both synthetic and real data sets, including a 733 cyanobacterial gene expression data set, shows significantly improved performance over other techniques. Conclusions Our studies demonstrate that deterministic global optimization approaches can infer large scale genetic networks.
Affiliation(s)
- Nguyen Vinh Xuan
- Gippsland School of Information Technology, Monash University, Melbourne, Australia.
|
111
|
Morshed N, Chetty M, Nguyen XV. Simultaneous learning of instantaneous and time-delayed genetic interactions using novel information theoretic scoring technique. BMC Systems Biology 2012; 6:62. [PMID: 22691450 PMCID: PMC3529704 DOI: 10.1186/1752-0509-6-62] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 12/17/2011] [Accepted: 06/06/2012] [Indexed: 11/10/2022]
Abstract
Background Understanding gene interactions is a fundamental question in systems biology. Currently, modeling of gene regulations using the Bayesian Network (BN) formalism assumes that genes interact either instantaneously or with a certain amount of time delay. However, in reality, biological regulations, both instantaneous and time-delayed, occur simultaneously. A framework that can detect and model both types of interactions simultaneously would represent gene regulatory networks more accurately. Results In this paper, we introduce a framework based on the Bayesian Network (BN) formalism that can represent both instantaneous and time-delayed interactions between genes simultaneously. A novel scoring metric having firm mathematical underpinnings is also proposed that, unlike other recent methods, can score both interactions concurrently and takes into account the reality that multiple regulators can regulate a gene jointly, rather than in an isolated pair-wise manner. Further, a gene regulatory network (GRN) inference method employing an evolutionary search that makes use of the framework and the scoring metric is also presented. Conclusion By taking into consideration the biological fact that both instantaneous and time-delayed regulations can occur among genes, our approach models gene interactions with greater accuracy. The proposed framework is efficient and can be used to infer gene networks having multiple orders of instantaneous and time-delayed regulations simultaneously. Experiments are carried out using three different synthetic networks (with three different mechanisms for generating synthetic data) as well as real life networks of Saccharomyces cerevisiae, E. coli and cyanobacteria gene expression data. The results show the effectiveness of our approach.
Affiliation(s)
- Nizamul Morshed
- Gippsland School of Information Technology, Faculty of Information Technology, Monash University, VIC 3842, Northways Road, Australia.
|
112
|
Issues impacting genetic network reverse engineering algorithm validation using small networks. Biochimica et Biophysica Acta - Proteins and Proteomics 2012; 1824:1434-41. [PMID: 22683439 DOI: 10.1016/j.bbapap.2012.05.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/23/2012] [Revised: 05/15/2012] [Accepted: 05/31/2012] [Indexed: 11/22/2022]
Abstract
Genetic network reverse engineering has been an area of intensive research within the systems biology community during the last decade. With many techniques currently available, the task of validating them and choosing the best one for a certain problem is a complex issue. Current practice has been to validate an approach on in-silico synthetic data sets, and, wherever possible, on real data sets with known ground-truth. In this study, we highlight a major issue: the validation of reverse engineering algorithms on small benchmark networks very often results in networks that are not statistically better than a randomly picked network. Another important issue highlighted is that with short time series, a small variation in the pre-processing procedure might yield large differences in the inferred networks. To demonstrate these issues, we have selected as our case study the IRMA in-vivo synthetic yeast network recently published in Cell. Using Fisher's exact test, we show that many results reported in the literature on reverse-engineering this network are not significantly better than random. The discussion is further extended to some other networks commonly used for validation purposes in the literature. The results presented in this study emphasize that studies carried out using small genetic networks are likely to be trivial, making it imperative that larger real networks be used for validating and benchmarking purposes. If smaller networks are considered, then the results should be interpreted carefully to avoid overconfidence. This article is part of a Special Issue entitled: Computational Methods for Protein Interaction and Structural Prediction.
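The kind of significance check described in this abstract can be illustrated with a one-sided Fisher's exact test on the 2x2 table of predicted versus true edges. The sketch below is an assumption about how such a check might be set up, not the authors' procedure; the function name and the edge counts (a five-gene network with 20 possible directed edges, 8 of them true, and a method that predicts 8 edges of which 5 are correct) are hypothetical.

```python
from math import comb


def fisher_exact_greater(tp, fp, fn, tn):
    """One-sided Fisher's exact test for a 2x2 edge-prediction table.

    Returns the probability of recovering at least `tp` true edges when the
    same number of edges is predicted uniformly at random (hypergeometric tail).
    """
    total = tp + fp + fn + tn
    true_edges = tp + fn
    predicted = tp + fp
    denom = comb(total, predicted)
    upper = min(true_edges, predicted)
    tail = sum(
        comb(true_edges, k) * comb(total - true_edges, predicted - k)
        for k in range(tp, upper + 1)
    )
    return tail / denom


# Hypothetical small benchmark: 20 candidate edges, 8 true, 8 predicted, 5 correct.
p_value = fisher_exact_greater(tp=5, fp=3, fn=3, tn=9)
```

With these made-up counts the p-value is about 0.11, so a method that gets five of its eight predicted edges right would still not be distinguishable from random guessing at the 0.05 level, which is the kind of outcome the abstract warns about for small networks.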
|
113
|
Altaf-Ul-Amin M, Wada M, Kanaya S. Partitioning a PPI Network into Overlapping Modules Constrained by High-Density and Periphery Tracking. ACTA ACUST UNITED AC 2012. [DOI: 10.5402/2012/726429] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022]
Abstract
This paper presents an algorithm called DPClusO for partitioning simple graphs into overlapping modules, that is, clusters constrained by density and periphery tracking. The major advantages of DPClusO over the related and previously published algorithm DPClus are shorter running time and ensuring coverage, that is, each node goes to at least one module. DPClusO is a general-purpose clustering algorithm and useful for finding overlapping cohesive groups in a simple graph for any type of application. This work shows that the modules generated by DPClusO from several PPI networks of yeast with high-density constraint match with more known complexes compared to some other recently published complex generating algorithms. Furthermore, the biological significance of the high density modules has been demonstrated by comparing their P values in the context of Gene Ontology (GO) terms with those of the randomly generated modules having the same size, distribution, and zero density. As a consequence, it was also learnt that a PPI network is a combination of mainly high-density and star-like modules.
Affiliation(s)
- Md. Altaf-Ul-Amin
- Computational Systems Biology Lab, Nara Institute of Science and Technology, Ikoma, Nara 630-0101, Japan
- Masayoshi Wada
- Computational Systems Biology Lab, Nara Institute of Science and Technology, Ikoma, Nara 630-0101, Japan
- Shigehiko Kanaya
- Computational Systems Biology Lab, Nara Institute of Science and Technology, Ikoma, Nara 630-0101, Japan
|
114
|
de Matos Simoes R, Tripathi S, Emmert-Streib F. Organizational structure and the periphery of the gene regulatory network in B-cell lymphoma. BMC Systems Biology 2012; 6:38. [PMID: 22583750 PMCID: PMC3476434 DOI: 10.1186/1752-0509-6-38] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 11/24/2011] [Accepted: 05/14/2012] [Indexed: 12/22/2022]
Abstract
Background The physical periphery of a biological cell is mainly described by signaling pathways which are triggered by transmembrane proteins and receptors that are sentinels to control the whole gene regulatory network of a cell. However, our current knowledge about the gene regulatory mechanisms that are governed by extracellular signals is severely limited. Results The purpose of this paper is threefold. First, we infer a gene regulatory network from a large-scale B-cell lymphoma expression data set using the C3NET algorithm. Second, we provide a functional and structural analysis of the largest connected component of this network, revealing that this network component corresponds to the peripheral region of a cell. Third, we analyze the hierarchical organization of network components of the whole inferred B-cell gene regulatory network by introducing a new approach which exploits the variability within the data as well as the inferential characteristics of C3NET. As a result, we find a functional bisection of the network corresponding to different cellular components. Conclusions Overall, our study highlights the peripheral gene regulatory network of B-cells and shows that it is centered around hub transmembrane proteins located at the physical periphery of the cell. In addition, we identify a variety of novel pathological transmembrane proteins such as ion channel complexes and signaling receptors in B-cell lymphoma.
Affiliation(s)
- Ricardo de Matos Simoes
- Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, UK
|
115
|
Data simulation and regulatory network reconstruction from time-series microarray data using stepwise multiple linear regression. ACTA ACUST UNITED AC 2012. [DOI: 10.1007/s13721-012-0008-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 01/09/2023]
|
116
|
Pinho R, Borenstein E, Feldman MW. Most networks in Wagner's model are cycling. PLoS One 2012; 7:e34285. [PMID: 22511935 PMCID: PMC3325246 DOI: 10.1371/journal.pone.0034285] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 02/10/2012] [Accepted: 02/25/2012] [Indexed: 11/18/2022]
Abstract
In this paper we study a model of gene networks introduced by Andreas Wagner in the 1990s that has been used extensively to study the evolution of mutational robustness. We investigate a range of model features and parameters and evaluate the extent to which they influence the probability that a random gene network will produce a fixed point steady state expression pattern. There are many different types of models used in the literature (discrete/continuous, sparse/dense, small/large network) and we attempt to put some order into this diversity, motivated by the fact that many properties are qualitatively the same in all the models. Our main result is that random networks in all models give rise to cyclic behavior more often than fixed points. And although periodic orbits seem to dominate network dynamics, they are usually considered unstable and not allowed to survive in previous evolutionary studies. Defining stability as the probability of fixed points, we show that the stability distribution of these networks is highly robust to changes in its parameters. We also find sparser networks to be more stable, which may help to explain why they seem to be favored by evolution. We have unified several disconnected previous studies of this class of models under the framework of stability, in a way that had not been systematically explored before.
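A discrete variant of this class of models can be sketched as follows: expression states are vectors of +1/-1 values, the update rule is s(t+1) = sign(W s(t)) for an interaction matrix W, and a trajectory is followed until a state repeats, which reveals whether it ended in a fixed point (period 1) or a cycle. This is a hedged illustration rather than the paper's implementation; the tie-breaking convention sign(0) = +1, the Gaussian weights, and all function names and parameters are assumptions.

```python
import random


def step(w, s):
    """One synchronous update s' = sign(W s); sign(0) is taken as +1 (assumption)."""
    return tuple(
        1 if sum(wij * sj for wij, sj in zip(row, s)) >= 0 else -1
        for row in w
    )


def attractor_period(w, s0, max_steps=10_000):
    """Length of the attractor reached from s0 (1 means a fixed point)."""
    seen = {}
    s = tuple(s0)
    for t in range(max_steps):
        if s in seen:
            return t - seen[s]
        seen[s] = t
        s = step(w, s)
    return None  # no repeat found within max_steps


def fixed_point_fraction(n, density, trials, seed=0):
    """Fraction of random networks whose random initial state ends in a fixed point."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each interaction is nonzero with probability `density` (hypothetical setup).
        w = [
            [rng.gauss(0, 1) if rng.random() < density else 0.0 for _ in range(n)]
            for _ in range(n)
        ]
        s0 = [rng.choice((-1, 1)) for _ in range(n)]
        if attractor_period(w, s0) == 1:
            hits += 1
    return hits / trials


rotation = [[0, -1], [1, 0]]   # hand-built example that cycles with period 4
identity = [[1, 0], [0, 1]]    # every state is a fixed point
stability = fixed_point_fraction(n=6, density=0.5, trials=200)
```

Calling `fixed_point_fraction` over a range of `density` values gives a simple way to explore how sparsity relates to the probability of fixed points under these assumptions.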
Affiliation(s)
- Ricardo Pinho
- Department of Biology, Stanford University, Stanford, California, United States of America.
|
117
|
The layout of a bacterial genome. FEBS Lett 2012; 586:2043-8. [DOI: 10.1016/j.febslet.2012.03.051] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 03/02/2012] [Revised: 03/25/2012] [Accepted: 03/26/2012] [Indexed: 12/25/2022]
|
118
|
Marbach D, Roy S, Ay F, Meyer PE, Candeias R, Kahveci T, Bristow CA, Kellis M. Predictive regulatory models in Drosophila melanogaster by integrative inference of transcriptional networks. Genome Res 2012; 22:1334-49. [PMID: 22456606 PMCID: PMC3396374 DOI: 10.1101/gr.127191.111] [Citation(s) in RCA: 89] [Impact Index Per Article: 7.4] [Indexed: 01/21/2023]
Abstract
Gaining insights on gene regulation from large-scale functional data sets is a grand challenge in systems biology. In this article, we develop and apply methods for transcriptional regulatory network inference from diverse functional genomics data sets and demonstrate their value for gene function and gene expression prediction. We formulate the network inference problem in a machine-learning framework and use both supervised and unsupervised methods to predict regulatory edges by integrating transcription factor (TF) binding, evolutionarily conserved sequence motifs, gene expression, and chromatin modification data sets as input features. Applying these methods to Drosophila melanogaster, we predict ∼300,000 regulatory edges in a network of ∼600 TFs and 12,000 target genes. We validate our predictions using known regulatory interactions, gene functional annotations, tissue-specific expression, protein–protein interactions, and three-dimensional maps of chromosome conformation. We use the inferred network to identify putative functions for hundreds of previously uncharacterized genes, including many in nervous system development, which are independently confirmed based on their tissue-specific expression patterns. Last, we use the regulatory network to predict target gene expression levels as a function of TF expression, and find significantly higher predictive power for integrative networks than for motif or ChIP-based networks. Our work reveals the complementarity between physical evidence of regulatory interactions (TF binding, motif conservation) and functional evidence (coordinated expression or chromatin patterns) and demonstrates the power of data integration for network inference and studies of gene regulation at the systems level.
Affiliation(s)
- Daniel Marbach
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
119
|
Mok J, Zhu X, Snyder M. Dissecting phosphorylation networks: lessons learned from yeast. Expert Rev Proteomics 2012; 8:775-86. [PMID: 22087660 DOI: 10.1586/epr.11.64] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
Protein phosphorylation continues to be regarded as one of the most important post-translational modifications found in eukaryotes and has been implicated in key roles in the development of a number of human diseases. In order to elucidate roles for the 518 human kinases, phosphorylation has routinely been studied using the budding yeast Saccharomyces cerevisiae as a model system. In recent years, a number of technologies have emerged to globally map phosphorylation in yeast. In this article, we review these technologies and discuss how these phosphorylation mapping efforts have shed light on our understanding of kinase signaling pathways and eukaryotic proteomic networks in general.
Affiliation(s)
- Janine Mok
- Stanford Genome Technology Center, Department of Biochemistry, Stanford School of Medicine, 855 S. California Avenue, Palo Alto, CA 94304, USA
|
120
|
van Dijk ADJ, van Mourik S, van Ham RCHJ. Mutational robustness of gene regulatory networks. PLoS One 2012; 7:e30591. [PMID: 22295094 PMCID: PMC3266278 DOI: 10.1371/journal.pone.0030591] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 08/04/2011] [Accepted: 12/19/2011] [Indexed: 11/18/2022] Open
Abstract
Mutational robustness of gene regulatory networks refers to their ability to generate constant biological output upon mutations that change network structure. Such networks contain regulatory interactions (transcription factor – target gene interactions) but often also protein-protein interactions between transcription factors. Using computational modeling, we study factors that influence robustness and we infer several network properties governing it. These include the type of mutation, i.e. whether a regulatory interaction or a protein-protein interaction is mutated, and in the case of mutation of a regulatory interaction, the sign of the interaction (activating vs. repressive). In addition, we analyze the effect of combinations of mutations and we compare networks containing monomeric with those containing dimeric transcription factors. Our results are consistent with available data on biological networks, for example based on evolutionary conservation of network features. As a novel and remarkable property, we predict that networks are more robust against mutations in monomer than in dimer transcription factors, a prediction for which analysis of conservation of DNA binding residues in monomeric vs. dimeric transcription factors provides indirect evidence.
Affiliation(s)
- Aalt D J van Dijk
- Applied Bioinformatics, PRI, Wageningen UR, Wageningen, The Netherlands.
|
121
|
Aittokallio T, Kurki M, Nevalainen O, Nikula T, West A, Lahesmaa R. Computational Strategies for Analyzing Data in Gene Expression Microarray Experiments. J Bioinform Comput Biol 2012; 1:541-86. [PMID: 15290769 DOI: 10.1142/s0219720003000319] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/13/2003] [Revised: 07/02/2003] [Indexed: 11/18/2022]
Abstract
Microarray analysis has become a widely used method for generating gene expression data on a genomic scale. Microarrays have been enthusiastically applied in many fields of biological research, even though several open questions remain about the analysis of such data. A wide range of approaches are available for computational analysis, but no general consensus exists as to a standard protocol for microarray data analysis. Consequently, the choice of data analysis technique is a crucial element depending both on the data and on the goals of the experiment. Therefore, a basic understanding of bioinformatics is required for optimal experimental design and meaningful interpretation of the results. This review summarizes some of the common themes in DNA microarray data analysis, including data normalization and detection of differential expression. Algorithms are demonstrated by analyzing cDNA microarray data from an experiment monitoring gene expression in T helper cells. Several computational biology strategies, along with their relative merits, are overviewed, and potential areas for additional research are discussed. The goal of the review is to provide a computational framework for applying and evaluating such bioinformatics strategies. Solid knowledge of microarray informatics contributes to the implementation of more efficient computational protocols for the given data obtained through microarray experiments.
Affiliation(s)
- Tero Aittokallio
- Department of Computational Biology, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-Shi, Chiba 277-8562, Japan.
|
122
|
Steinacher A, Soyer OS. Evolutionary principles underlying structure and response dynamics of cellular networks. Advances in Experimental Medicine and Biology 2012; 751:225-47. [PMID: 22821461 DOI: 10.1007/978-1-4614-3567-9_11] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/29/2022]
Abstract
The network view in systems biology, in conjunction with the continuing development of experimental technologies, is providing us with the key structural and dynamical features of both cell-wide and pathway-level regulatory, signaling and metabolic systems. These include, for example, modularity and the presence of hub proteins at the structural level, and ultrasensitivity and feedback control at the level of dynamics. The uncovering of such features, and the seeming commonality of some of them, makes many systems biologists believe that these could represent design principles that underpin cellular systems across organisms. Here, we argue that such claims about any observed feature require an understanding of how it has emerged in evolution and how it can shape subsequent evolution. We review recent and past studies that aim to achieve such evolutionary understanding for observed features of cellular networks. We argue that this evolutionary framework could lead to deciphering the evolutionary origin and relevance of proposed design principles, thereby allowing us to predict their presence or absence in an organism based on its environment and biochemistry, and their effect on its future evolution.
Affiliation(s)
- Arno Steinacher
- College of Engineering, Mathematics and Physical Sciences, University of Exeter, Exeter, UK.
|
123
|
Yamamoto Y, Yokoyama K. Common and unique network dynamics in football games. PLoS One 2011; 6:e29638. [PMID: 22216336 PMCID: PMC3247158 DOI: 10.1371/journal.pone.0029638] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Received: 12/16/2010] [Accepted: 12/02/2011] [Indexed: 11/17/2022] Open
Abstract
The sport of football is played between two teams of eleven players each using a spherical ball. Each team strives to score by driving the ball into the opposing goal as the result of skillful interactions among players. Football can be regarded from the network perspective as a competitive relationship between two cooperative networks with a dynamic network topology and dynamic network node. Many complex large-scale networks have been shown to have topological properties in common, based on a small-world network and scale-free network models. However, the human dynamic movement pattern of this network has never been investigated in a real-world setting. Here, we show that the power law in degree distribution emerged in the passing behavior in the 2006 FIFA World Cup Final and an international “A” match in Japan, by describing players as vertices connected by links representing passes. The exponent values are similar to the typical values that occur in many real-world networks, which are in the range of , and are larger than that of a gene transcription network, . Furthermore, we reveal the stochastically switched dynamics of the hub player throughout the game as a unique feature in football games. It suggests that this feature could result not only in securing vulnerability against intentional attack, but also in a power law for self-organization. Our results suggest common and unique network dynamics of two competitive networks, compared with the large-scale networks that have previously been investigated in numerous works. Our findings may lead to improved resilience and survivability not only in biological networks, but also in communication networks.
Affiliation(s)
- Yuji Yamamoto
- Research Center of Health, Physical Fitness and Sports, Nagoya University, Chikusa, Nagoya, Japan.
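The construction described in the abstract above, players as vertices and passes as links, can be sketched in a few lines. The pass list below is a hypothetical toy example, not data from the study, and counting both passes made and passes received toward a player's degree is an assumption; the abstract does not say how degree was defined.

```python
from collections import Counter


def degree_counts(passes):
    """Degree of each player, counting passes made and passes received."""
    degree = Counter()
    for passer, receiver in passes:
        degree[passer] += 1
        degree[receiver] += 1
    return degree


def degree_distribution(degree):
    """Fraction of players having each degree value, P(k)."""
    n_players = len(degree)
    histogram = Counter(degree.values())
    return {k: count / n_players for k, count in sorted(histogram.items())}


# Hypothetical toy pass list: (passer, receiver) pairs.
toy_passes = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
toy_degree = degree_counts(toy_passes)
print(degree_distribution(toy_degree))  # {1: 0.25, 2: 0.5, 3: 0.25}
```

A power law in the degree distribution would show up as an approximately straight line when P(k) is plotted against k on log-log axes; fitting the exponent is outside the scope of this sketch.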
|
124
|
Iyer LM, Aravind L. Insights from the architecture of the bacterial transcription apparatus. J Struct Biol 2011; 179:299-319. [PMID: 22210308 DOI: 10.1016/j.jsb.2011.12.013] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 10/19/2011] [Revised: 12/14/2011] [Accepted: 12/18/2011] [Indexed: 10/14/2022]
Abstract
We provide a portrait of the bacterial transcription apparatus in light of the data emerging from structural studies, sequence analysis and comparative genomics to bring out important but underappreciated features. We first describe the key structural highlights and evolutionary implications emerging from comparison of the cellular RNA polymerase subunits with the RNA-dependent RNA polymerase involved in RNAi in eukaryotes and their homologs from newly identified bacterial selfish elements. We describe some previously unnoticed domains and the possible evolutionary stages leading to the RNA polymerases of extant life forms. We then present the case for the ancient orthology of the basal transcription factors, the sigma factor and TFIIB, in the bacterial and the archaeo-eukaryotic lineages. We also present a synopsis of the structural and architectural taxonomy of specific transcription factors and their genome-scale demography. In this context, we present certain notable deviations from the otherwise invariant proteome-wide trends in transcription factor distribution and use it to predict the presence of an unusual lineage-specifically expanded signaling system in certain firmicutes like Paenibacillus. We then discuss the intersection between functional properties of transcription factors and the organization of transcriptional networks. Finally, we present some of the interesting evolutionary conundrums posed by our newly gained understanding of the bacterial transcription apparatus and potential areas for future explorations.
Affiliation(s)
- Lakshminarayan M Iyer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Room 5N50, Bethesda, MD 20894, USA
|
125
|
Topological structure of the space of phenotypes: the case of RNA neutral networks. PLoS One 2011; 6:e26324. [PMID: 22028856 PMCID: PMC3196570 DOI: 10.1371/journal.pone.0026324] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 07/29/2011] [Accepted: 09/23/2011] [Indexed: 11/19/2022] Open
Abstract
The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragments into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived attending to structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.
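The sequence-space quantities mentioned in the abstract above can be made concrete with a short sketch: the size of the space of 12-nucleotide sequences, the single-point mutants of a sequence (its potential neighbors in a neutral network), and the Hamming distance. The example sequence is hypothetical, and no folding is performed here; deciding which neighbors are actually neutral would require a secondary-structure predictor, which this sketch does not include.

```python
ALPHABET = "ACGU"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def point_mutants(seq):
    """All sequences that differ from `seq` by exactly one nucleotide."""
    mutants = []
    for i, base in enumerate(seq):
        for other in ALPHABET:
            if other != base:
                mutants.append(seq[:i] + other + seq[i + 1:])
    return mutants


seq = "GGGAAAUCCCAU"  # hypothetical 12-nucleotide sequence
print(len(ALPHABET) ** len(seq))  # 16777216 sequences of length 12
print(len(point_mutants(seq)))    # 36 single-point mutants (12 positions x 3 alternatives)
```

In a neutral network, two sequences are linked only if they are single-point mutants of each other and fold into the same secondary structure, so 36 is an upper bound on the degree of any node in the 12-nucleotide case.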
|
126
|
Erb I, van Nimwegen E. Transcription factor binding site positioning in yeast: proximal promoter motifs characterize TATA-less promoters. PLoS One 2011; 6:e24279. [PMID: 21931670 PMCID: PMC3170328 DOI: 10.1371/journal.pone.0024279] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 05/10/2011] [Accepted: 08/09/2011] [Indexed: 12/26/2022] Open
Abstract
The availability of sequence specificities for a substantial fraction of yeast's transcription factors and comparative genomic algorithms for binding site prediction has made it possible to comprehensively annotate transcription factor binding sites genome-wide. Here we use such a genome-wide annotation for comprehensively studying promoter architecture in yeast, focusing on the distribution of transcription factor binding sites relative to transcription start sites, and the architecture of TATA and TATA-less promoters. For most transcription factors, binding sites are positioned further upstream and vary over a wider range in TATA promoters than in TATA-less promoters. In contrast, a group of ‘proximal promoter motifs’ (GAT1/GLN3/DAL80, FKH1/2, PBF1/2, RPN4, NDT80, and ROX1) occur preferentially in TATA-less promoters and show a strong preference for binding close to the transcription start site in these promoters. We provide evidence that suggests that pre-initiation complexes are recruited at TATA sites in TATA promoters and at the sites of the other proximal promoter motifs in TATA-less promoters. TATA-less promoters can generally be classified by the proximal promoter motif they contain, with different classes of TATA-less promoters showing different patterns of transcription factor binding site positioning and nucleosome coverage. These observations suggest that different modes of regulation of transcription initiation may be operating in the different promoter classes. In addition we show that, across all promoter classes, there is a close match between nucleosome free regions and regions of highest transcription factor binding site density. This close agreement between transcription factor binding site density and nucleosome depletion suggests a direct and general competition between transcription factors and nucleosomes for binding to promoters.
Affiliation(s)
- Ionas Erb
- Bioinformatics and Genomics program, Center for Genomic Regulation and Pompeu Fabra University, Barcelona, Spain
- Erik van Nimwegen
- Biozentrum, University of Basel, and Swiss Institute of Bioinformatics, Basel, Switzerland
|
127
|
Xie X, Jin J, Mao Y. Evolutionary versatility of eukaryotic protein domains revealed by their bigram networks. BMC Evol Biol 2011; 11:242. [PMID: 21849086 PMCID: PMC3167776 DOI: 10.1186/1471-2148-11-242] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 03/02/2011] [Accepted: 08/18/2011] [Indexed: 11/21/2022] Open
Abstract
Background Protein domains are globular structures of independently folded polypeptides that exert catalytic or binding activities. Their sequences are recognized as evolutionary units that, through genome recombination, constitute protein repertoires of linkage patterns. Via mutations, domains acquire modified functions that contribute to the fitness of cells and organisms. Recent studies have addressed the evolutionary selection that may have shaped the functions of individual domains and the emergence of particular domain combinations, which led to new cellular functions in multi-cellular animals. This study focuses on modeling domain linkage globally and investigates evolutionary implications that may be revealed by novel computational analysis. Results A survey of 77 completely sequenced eukaryotic genomes implies a potential hierarchical and modular organization of biological functions in most living organisms. Domains in a genome or multiple genomes are modeled as a network of hetero-duplex covalent linkages, termed bigrams. A novel computational technique is introduced to decompose such networks, whereby the notion of domain "networking versatility" is derived and measured. The most and least "versatile" domains (termed "core domains" and "peripheral domains" respectively) are examined both computationally via sequence conservation measures and experimentally using selected domains. Our study suggests that such a versatility measure extracted from the bigram networks correlates with the adaptivity of domains during evolution, where the network core domains are highly adaptive, significantly contrasting the network peripheral domains. Conclusions Domain recombination has played a major part in the evolution of eukaryotes, contributing to genome complexity. From a system point of view, as the results of selection and constant refinement, networks of domain linkage are structured in a hierarchical modular fashion. Domains with a high degree of networking versatility appear to be evolutionarily adaptive, potentially through functional innovations. Domain bigram networks are informative as a model of biological functions. The networking versatility indices extracted from such networks for individual domains reflect the strength of evolutionary selection that the domains have experienced.
Affiliation(s)
- Xueying Xie
- Research Center for Learning Science, Southeast University, Sipai Lou 2, Nanjing 210096 China.
|
128
|
Huang S. Systems biology of stem cells: three useful perspectives to help overcome the paradigm of linear pathways. Philos Trans R Soc Lond B Biol Sci 2011; 366:2247-59. [PMID: 21727130 PMCID: PMC3130416 DOI: 10.1098/rstb.2011.0008] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Indexed: 11/12/2022] Open
Abstract
Stem cell behaviours, such as stabilization of the undecided state of pluripotency or multipotency, the priming towards a prospective fate, binary fate decisions and irreversible commitment, must all somehow emerge from a genome-wide gene-regulatory network. Its unfathomable complexity defies the standard mode of explanation that is deeply rooted in molecular biology thinking: the reduction of observables to linear deterministic molecular pathways that are tacitly taken as chains of causation. Such a culture of proximate explanation that uses qualitative arguments, simple arrow-arrow schemes or metaphors persists despite the ceaseless accumulation of 'omics' data and the rise of systems biology that now offers precise conceptual tools to explain emergent cell behaviours from gene networks. To facilitate the embrace of the principles of physics and mathematics that underlie such systems and help to bridge the gap between the formal description of theorists and the intuition of experimental biologists, we discuss in qualitative terms three perspectives outside the realm of their familiar linear-deterministic view: (i) state space, (ii) high-dimensionality and (iii) heterogeneity. These concepts jointly offer a new vista on stem cell regulation that naturally explains many novel, counterintuitive observations and their inherent inevitability, obviating the need for ad hoc explanations of their existence based on natural selection. Hopefully, this expanded view will stimulate novel experimental designs.
Affiliation(s)
- Sui Huang
- Institute for Biocomplexity and Informatics, University of Calgary, 2500 University Drive NW, Calgary, Alberta T2N 1N4, Canada.
|
129
|
Emmert-Streib F, Dehmer M. Networks for systems biology: conceptual connection of data and function. IET Syst Biol 2011; 5:185-207. [PMID: 21639592 DOI: 10.1049/iet-syb.2010.0025] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.6] [Indexed: 01/07/2023] Open
Abstract
The purpose of this study is to survey the use of networks and network-based methods in systems biology. This study starts with an introduction to graph theory and basic measures that allow structural properties of networks to be quantified. Then, the authors present important network classes and gene networks as well as methods for their analysis. In the last part of this study, the authors review approaches that aim at analysing the functional organisation of gene networks and the use of networks in medicine. In addition to this, the authors advocate networks as a systematic approach to general problems in systems biology, because networks are capable of assuming multiple roles that are very beneficial for connecting experimental data with a functional interpretation in biological terms.
Affiliation(s)
- F Emmert-Streib
- Queen's University Belfast, Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Belfast, UK
|
130
|
Pinna A, Soranzo N, Hoeschele I, de la Fuente A. Simulating systems genetics data with SysGenSIM. Bioinformatics 2011; 27:2459-62. [PMID: 21737438 DOI: 10.1093/bioinformatics/btr407] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022] Open
Abstract
SUMMARY SysGenSIM is a software package to simulate Systems Genetics (SG) experiments in model organisms, for the purpose of evaluating and comparing statistical and computational methods and their implementations for analyses of SG data [e.g. methods for expression quantitative trait loci (eQTL) mapping and network inference]. SysGenSIM allows the user to select a variety of network topologies, genetic and kinetic parameters to simulate SG data (genotyping, gene expression and phenotyping) with large gene networks with thousands of nodes. The software is encoded in MATLAB, and a user-friendly graphical user interface is provided. AVAILABILITY The open-source software code and user manual can be downloaded at: http://sysgensim.sourceforge.net/ CONTACT alf@crs4.it.
|
131
|
Altay G, Emmert-Streib F. Structural influence of gene networks on their inference: analysis of C3NET. Biol Direct 2011; 6:31. [PMID: 21696592 PMCID: PMC3136421 DOI: 10.1186/1745-6150-6-31] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 01/10/2011] [Accepted: 06/22/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The availability of large-scale high-throughput data poses considerable challenges for their functional analysis. For this reason gene network inference methods have gained considerable interest. However, our current knowledge, especially about the influence of the structure of a gene network on its inference, is limited. RESULTS In this paper we present a comprehensive investigation of the structural influence of gene networks on the inferential characteristics of C3NET - a recently introduced gene network inference algorithm. We employ local as well as global performance metrics in combination with an ensemble approach. The results from our numerical study for various biological and synthetic network structures and simulation conditions, also comparing C3NET with other inference algorithms, lead to a multitude of theoretical and practical insights into the working behavior of C3NET. In addition, in order to facilitate the practical usage of C3NET we provide a user-friendly R package, called c3net, and describe its functionality. It is available from https://r-forge.r-project.org/projects/c3net and from the CRAN package repository. CONCLUSIONS The availability of gene network inference algorithms with known inferential properties opens a new era of large-scale screening experiments that could be equally beneficial for basic biological and biomedical research with auspicious prospects. The availability of our easy to use software package c3net may contribute to the popularization of such methods.
Affiliation(s)
- Gökmen Altay
- School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, Belfast, BT9 7BL, UK
|
132
|
Lopes FM, Cesar RM, Costa LDF. Gene expression complex networks: synthesis, identification, and analysis. J Comput Biol 2011; 18:1353-67. [PMID: 21548810 DOI: 10.1089/cmb.2010.0118] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 01/25/2023] Open
Abstract
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdös-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to variation in the average degree <k>, with its network recovery rate decreasing as <k> increased. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.
Affiliation(s)
- Fabrício M Lopes
- Federal University of Technology-Paraná and Institute of Mathematics and Statistics, University of São Paulo, Brazil.
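The abstract above varies the average degree <k> of artificial networks before testing an inference method. As a rough illustration only, the sketch below generates an Erdős-Rényi G(n, p) graph in plain Python and measures its average degree. The function names, the network size and the seed are assumptions made for this example; it is not the authors' AGN generator and it covers only the ER model.

```python
import random


def erdos_renyi(n, p, seed=0):
    """Return the undirected edge set of a G(n, p) random graph.

    Each of the n * (n - 1) / 2 possible edges is included
    independently with probability p.
    """
    rng = random.Random(seed)
    return {
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if rng.random() < p
    }


def average_degree(n, edges):
    """Average degree <k> = 2 * |E| / n for an undirected graph."""
    return 2 * len(edges) / n


if __name__ == "__main__":
    n = 200  # hypothetical network size, chosen for illustration
    for target_k in (2, 4, 6):
        # In G(n, p) the expected average degree is p * (n - 1).
        p = target_k / (n - 1)
        edges = erdos_renyi(n, p, seed=42)
        print(target_k, round(average_degree(n, edges), 2))
```

The measured <k> fluctuates around the target because edges are sampled independently; a validation study would typically average over many generated networks per setting.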
|
133
|
Structure–function relations are subtle in genetic regulatory networks. Math Biosci 2011; 231:61-8. [DOI: 10.1016/j.mbs.2011.02.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/15/2010] [Revised: 01/10/2011] [Accepted: 02/09/2011] [Indexed: 02/01/2023]
|
134
|
Kentzoglanakis K, Poole M. A swarm intelligence framework for reconstructing gene networks: searching for biologically plausible architectures. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 9:358-371. [PMID: 21576756 DOI: 10.1109/tcbb.2011.87] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 05/30/2023]
Abstract
In this paper, we investigate the problem of reverse engineering the topology of gene regulatory networks from temporal gene expression data. We adopt a computational intelligence approach comprising swarm intelligence techniques, namely particle swarm optimization (PSO) and ant colony optimization (ACO). In addition, the recurrent neural network (RNN) formalism is employed for modeling the dynamical behavior of gene regulatory systems. More specifically, ACO is used for searching the discrete space of network architectures and PSO for searching the corresponding continuous space of RNN model parameters. We propose a novel solution construction process in the context of ACO for generating biologically plausible candidate architectures. The objective is to concentrate the search effort into areas of the structure space that contain architectures which are feasible in terms of their topological resemblance to real-world networks. The proposed framework is initially applied to the reconstruction of a small artificial network that has previously been studied in the context of gene network reverse engineering. Subsequently, we consider an artificial data set with added noise for reconstructing a subnetwork of the genetic interaction network of S. cerevisiae (yeast). Finally, the framework is applied to a real-world data set for reverse engineering the SOS response system of the bacterium Escherichia coli. Results demonstrate the relative advantage of utilizing problem-specific knowledge regarding biologically plausible structural properties of gene networks over conducting a problem-agnostic search in the vast space of network architectures.
|
135
|
Pavlopoulos GA, Secrier M, Moschopoulos CN, Soldatos TG, Kossida S, Aerts J, Schneider R, Bagos PG. Using graph theory to analyze biological networks. BioData Min 2011; 4:10. [PMID: 21527005 PMCID: PMC3101653 DOI: 10.1186/1756-0381-4-10] [Citation(s) in RCA: 311] [Impact Index Per Article: 23.9] [Received: 11/02/2010] [Accepted: 04/28/2011] [Indexed: 11/10/2022] Open
Abstract
Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need to investigate a system, not only as individual components but as a whole, emerges. This can be done by examining the elementary constituents individually and then how these are connected. The myriad components of a system and their interactions are best characterized as networks and they are mainly represented as graphs where thousands of nodes are connected with thousands of edges. In this article we demonstrate approaches, models and methods from the graph theory universe and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling combined with knowledge extraction will help us to better understand the biological significance of the system.
Affiliation(s)
- Georgios A Pavlopoulos
- Department of Computer Science and Biomedical Informatics, University of Central Greece, Lamia, 35100, Greece
- Faculty of Engineering - ESAT/SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001, Leuven-Heverlee, Belgium
- Maria Secrier
- Structural and Computational Biology Unit, EMBL, Meyerhofstrasse 1, 69117, Heidelberg, Germany
- Charalampos N Moschopoulos
- Department of Computer Engineering & Informatics, University of Patras, Rio, 6500, Patras, Greece
- Bioinformatics & Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, 11527, Athens, Greece
- Sophia Kossida
- Bioinformatics & Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, 11527, Athens, Greece
- Jan Aerts
- Faculty of Engineering - ESAT/SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, 3001, Leuven-Heverlee, Belgium
- Reinhard Schneider
- Structural and Computational Biology Unit, EMBL, Meyerhofstrasse 1, 69117, Heidelberg, Germany
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Campus Limpertsberg, 162 A, avenue de la Faïencerie, L-1511 Luxembourg
- Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Central Greece, Lamia, 35100, Greece
|
136
|
Mittal N, Scherrer T, Gerber AP, Janga SC. Interplay between posttranscriptional and posttranslational interactions of RNA-binding proteins. J Mol Biol 2011; 409:466-79. [PMID: 21501624 DOI: 10.1016/j.jmb.2011.03.064] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 01/27/2011] [Revised: 03/02/2011] [Accepted: 03/29/2011] [Indexed: 11/17/2022]
Abstract
RNA-binding proteins (RBPs) play important roles in the posttranscriptional control of gene expression. However, our understanding of how RBPs interact with each other at different regulatory levels to coordinate the RNA metabolism of the cell is rather limited. Here, we construct the posttranscriptional regulatory network among 69 experimentally studied RBPs in yeast to show that more than one-third of the RBPs autoregulate their expression at the posttranscriptional level and demonstrate that autoregulatory RBPs show reduced protein noise with a tendency to encode for hubs in this network. We note that in- and outdegrees in the posttranscriptional RBP-RBP regulatory network exhibit gaussian and scale-free distributions, respectively. This network was also densely interconnected with extensive cross-talk between RBPs belonging to different posttranscriptional steps, regulating varying numbers of cellular RNA targets. We show that feed-forward loops and superposed feed-forward/feedback loops are the most significant three-node subgraphs in this network. Analysis of the corresponding protein-protein interaction (posttranslational) network revealed that it is more modular than the posttranscriptional regulatory network. There is significant overlap between the regulatory and protein-protein interaction networks, with RBPs that potentially control each other at the posttranscriptional level tending to physically interact and being part of the same ribonucleoprotein (RNP) complex. Our observations put forward a model wherein RBPs could be classified into those that can stably interact with a limited number of protein partners, forming stable RNP complexes, and others that form transient hubs, having the ability to interact with multiple RBPs forming many RNPs in the cell.
Affiliation(s)
- Nitish Mittal
- Biozentrum, University of Basel, Klingelbergstrasse, Switzerland
|
137
|
Kim TM, Park PJ. Advances in analysis of transcriptional regulatory networks. Wiley Interdisciplinary Reviews: Systems Biology and Medicine 2011; 3:21-35. [PMID: 21069662 DOI: 10.1002/wsbm.105] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 02/05/2023]
Abstract
A transcriptional regulatory network represents a molecular framework in which developmental or environmental cues are transformed into differential expression of genes. Transcriptional regulation is mediated by the combinatorial interplay between cis-regulatory DNA elements and trans-acting transcription factors, and is perhaps the most important mechanism for controlling gene expression. Recent innovations, most notably the method for detecting protein-DNA interactions genome-wide, can help provide a comprehensive catalog of cis-regulatory elements and their interaction with given trans-acting factors in a given condition. A transcriptional regulatory network that integrates such information can lead to a systems-level understanding of regulatory mechanisms. In this review, we will highlight the key aspects of current knowledge on eukaryotic transcriptional regulation, especially on known transcription factors and their interacting regulatory elements. Then we will review some recent technical advances for genome-wide mapping of DNA-protein interactions based on high-throughput sequencing. Finally, we will discuss the types of biological insights that can be obtained from a network-level understanding of transcription regulation as well as future challenges in the field.
Affiliation(s)
- Tae-Min Kim
- Center for Biomedical Informatics, Harvard Medical School, Boston, MA, USA
|
138
|
Ram R, Chetty M. A Markov-blanket-based model for gene regulatory network inference. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:353-367. [PMID: 21233520 DOI: 10.1109/tcbb.2009.70] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 05/30/2023]
Abstract
An efficient two-step Markov blanket method for modeling and inferring complex regulatory networks from large-scale microarray data sets is presented. The inferred gene regulatory network (GRN) is based on the time series gene expression data capturing the underlying gene interactions. For constructing a highly accurate GRN, the proposed method performs: 1) discovery of a gene's Markov Blanket (MB), 2) formulation of a flexible measure to determine the network's quality, 3) efficient searching with the aid of a guided genetic algorithm, and 4) pruning to obtain a minimal set of correct interactions. Investigations are carried out using both synthetic as well as yeast cell cycle gene expression data sets. The realistic synthetic data sets validate the robustness of the method by varying topology, sample size, time delay, noise, vertex in-degree, and the presence of hidden nodes. It is shown that the proposed approach has excellent inferential capabilities and high accuracy even in the presence of noise. The gene network inferred from yeast cell cycle data is investigated for its biological relevance using well-known interactions, sequence analysis, motif patterns, and GO data. Further, novel interactions are predicted for the unknown genes of the network and their influence on other genes is also discussed.
Affiliation(s)
- Ramesh Ram
- Gippsland School of Information Technology, Monash University, Gippsland Campus, VIC 3842, Australia.
|
139
|
Pflieger D, Gonnet F, de la Fuente van Bentem S, Hirt H, de la Fuente A. Linking the proteins--elucidation of proteome-scale networks using mass spectrometry. Mass Spectrometry Reviews 2011; 30:268-297. [PMID: 21337599 DOI: 10.1002/mas.20278] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 03/31/2009] [Revised: 10/05/2009] [Accepted: 10/05/2009] [Indexed: 05/30/2023]
Abstract
Proteomes are intricate. Typically, thousands of proteins interact through physical association and post-translational modifications (PTMs) to give rise to the emergent functions of cells. Understanding these functions requires one to study proteomes as "systems" rather than collections of individual protein molecules. The abstraction of the interacting proteome to "protein networks" has recently gained much attention, as networks are effective representations that lose specific molecular details but provide the ability to see the proteome as a whole. Mostly two aspects of the proteome have been represented by network models: proteome-wide physical protein-protein-binding interactions organized into Protein Interaction Networks (PINs), and proteome-wide PTM relations organized into Protein Signaling Networks (PSNs). Mass spectrometry (MS) techniques have been shown to be essential to reveal both of these aspects on a proteome-wide scale. Techniques such as affinity purification followed by MS have been used to elucidate protein-protein interactions, and MS-based quantitative phosphoproteomics is critical to understand the structure and dynamics of signaling through the proteome. We here review the current state-of-the-art MS-based analytical pipelines for characterizing proteome-scale networks.
Affiliation(s)
- Delphine Pflieger
- Laboratoire Analyse et Modélisation pour la Biologie et l'Environnement, Université d'Evry Val d'Essonne, CNRS UMR 8587, Evry, France
|
140
|
Early Career Research Award Lecture. Structure, evolution and dynamics of transcriptional regulatory networks. Biochem Soc Trans 2011; 38:1155-78. [PMID: 20863280 DOI: 10.1042/bst0381155] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 01/09/2023]
Abstract
The availability of entire genome sequences and the wealth of literature on gene regulation have enabled researchers to model an organism's transcriptional regulation system in the form of a network. In such a network, TFs (transcription factors) and TGs (target genes) are represented as nodes and regulatory interactions between TFs and TGs are represented as directed links. In the present review, I address the following topics pertaining to transcriptional regulatory networks. (i) Structure and organization: first, I introduce the concept of networks and discuss our understanding of the structure and organization of transcriptional networks. (ii) Evolution: I then describe the different mechanisms and forces that influence network evolution and shape network structure. (iii) Dynamics: I discuss studies that have integrated information on dynamics such as mRNA abundance or half-life, with data on transcriptional network in order to elucidate general principles of regulatory network dynamics. In particular, I discuss how cell-to-cell variability in the expression level of TFs could permit differential utilization of the same underlying network by distinct members of a genetically identical cell population. Finally, I conclude by discussing open questions for future research and highlighting the implications for evolution, development, disease and applications such as genetic engineering.
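The representation described in this abstract (TFs and TGs as nodes, regulatory interactions as directed links) can be sketched as a plain edge list. The gene names and interactions below are invented for illustration and do not come from the review; the helper `degree_tables` is likewise a hypothetical name.

```python
from collections import defaultdict

# Hypothetical regulatory interactions, written as (regulator, target).
# A TF may itself be a target of another TF, as with TF1 -> TF2.
edges = [
    ("TF1", "geneA"),
    ("TF1", "geneB"),
    ("TF1", "TF2"),
    ("TF2", "geneB"),
    ("TF2", "geneC"),
]


def degree_tables(edge_list):
    """Return (out_degree, in_degree) dictionaries for a directed edge list."""
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for regulator, target in edge_list:
        out_deg[regulator] += 1  # number of genes this node regulates
        in_deg[target] += 1      # number of regulators acting on this node
    return dict(out_deg), dict(in_deg)


out_deg, in_deg = degree_tables(edges)

# Nodes with at least one outgoing link act as TFs in this toy network;
# the remaining nodes are pure target genes.
tfs = set(out_deg)
nodes = {node for edge in edges for node in edge}
targets_only = nodes - tfs

if __name__ == "__main__":
    print("out-degree:", out_deg)
    print("in-degree:", in_deg)
    print("target-only genes:", sorted(targets_only))
```

Keeping direction explicit is what lets in-degree and out-degree be analysed separately, which several abstracts in this list rely on.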
|
141
|
Abstract
The availability of high-throughput methods to detect protein interactions made construction of comprehensive protein interaction networks for several important model organisms possible. Many studies have since focused on uncovering the structural principles of these networks and relating these structures to biological processes. On a global scale, there are striking similarities in the structure of different protein interaction networks, even when distantly related species, such as the yeast Saccharomyces cerevisiae and the fruit fly Drosophila melanogaster, are compared. However, there is also considerable variance in network structures caused by the gain and loss of genes and mutations which alter the interaction behavior of the encoded proteins. Here, we focus on the current state of knowledge on the structure of protein interaction networks and the evolutionary processes that shaped these structures.
Affiliation(s)
- Andreas Schüler
- Bioinformatics Division, School of Biological Sciences, Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
|
142
|
Ho JWK, Charleston MA. Network modelling of gene regulation. Biophys Rev 2010; 3:1-13. [PMID: 28510232 DOI: 10.1007/s12551-010-0041-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/01/2010] [Accepted: 11/04/2010] [Indexed: 11/28/2022] Open
Abstract
Gene regulatory network (GRN) modelling has gained increasing attention in the past decade. Many computational modelling techniques have been proposed to facilitate the inference and analysis of GRN. However, there is often confusion about the aim of GRN modelling, and how a gene network model can be fully utilised as a tool for systems biology. The aim of the present article is to provide an overview of this rapidly expanding subject. In particular, we review some fundamental concepts of systems biology and discuss the role of network modelling in understanding complex biological systems. Several commonly used network modelling paradigms are surveyed with emphasis on their practical use in systems biology research.
Affiliation(s)
- Joshua W K Ho
- School of Information Technologies, The University of Sydney, Sydney, NSW, 2006, Australia.
- Michael A Charleston
- School of Information Technologies, The University of Sydney, Sydney, NSW, 2006, Australia
- Centre for Mathematical Biology, The University of Sydney, Sydney, NSW, 2006, Australia
|
143
|
Reverse engineering gene regulatory networks related to quorum sensing in the plant pathogen Pectobacterium atrosepticum. Methods Mol Biol 2010; 673:253-81. [PMID: 20835805 DOI: 10.1007/978-1-60761-842-3_17] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 04/19/2023]
Abstract
The objective of the project reported in the present chapter was the reverse engineering of gene regulatory networks related to quorum sensing in the plant pathogen Pectobacterium atrosepticum from microarray gene expression profiles, obtained from the wild-type and eight knockout strains. To this end, we have applied various recent methods from multivariate statistics and machine learning: graphical Gaussian models, sparse Bayesian regression, LASSO (least absolute shrinkage and selection operator), Bayesian networks, and nested effects models. We have investigated the degree of similarity between the predictions obtained with the different approaches, and we have assessed the consistency of the reconstructed networks in terms of global topological network properties, based on the node degree distribution. The chapter concludes with a biological evaluation of the predicted network structures.
|
144
|
Lelandais G, Devaux F. Comparative Functional Genomics of Stress Responses in Yeasts. OMICS: A Journal of Integrative Biology 2010; 14:501-15. [DOI: 10.1089/omi.2010.0029] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/04/2023]
Affiliation(s)
- Gaëlle Lelandais
- Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), INSERM UMR-S 665, Université Paris Diderot, Paris, France
- Frédéric Devaux
- Laboratoire de génomique des microorganismes, CNRS FRE3214, Université Pierre et Marie Curie, Institut des Cordeliers, Paris, France
|
145
|
Altay G, Emmert-Streib F. Inferring the conservative causal core of gene regulatory networks. BMC Systems Biology 2010; 4:132. [PMID: 20920161 PMCID: PMC2955605 DOI: 10.1186/1752-0509-4-132] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.6] [Received: 05/04/2010] [Accepted: 09/28/2010] [Indexed: 11/18/2022]
Abstract
Background Inferring gene regulatory networks from large-scale expression data is an important problem that has received much attention in recent years. These networks have the potential to provide insights into causal molecular interactions of biological processes. Hence, from a methodological point of view, reliable estimation methods based on observational data are needed to approach this problem practically. Results In this paper, we introduce a novel gene regulatory network inference (GRNI) algorithm, called C3NET. We compare C3NET with four well known methods, ARACNE, CLR, MRNET and RN, conducting in-depth numerical ensemble simulations, and also demonstrate for biological expression data from E. coli that C3NET performs consistently better than the best known GRNI methods in the literature. In addition, it also has a low computational complexity. Since C3NET is based on estimates of mutual information values in conjunction with a maximization step, our numerical investigations demonstrate that our inference algorithm exploits causal structural information in the data efficiently. Conclusions For systems biology to succeed in the long run, it is of crucial importance to establish methods that extract large-scale gene networks from high-throughput data that reflect the underlying causal interactions among genes or gene products. Our method can contribute to this endeavor by demonstrating that an inference algorithm with a neat design permits not only a more intuitive and possibly biological interpretation of its working mechanism but can also yield superior results.
Affiliation(s)
- Gökmen Altay
- Computational Biology and Machine Learning, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast, BT9 7BL, UK
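The abstract above says only that C3NET combines mutual information estimates with a maximization step. One plausible reading, sketched here under stated assumptions, is that each gene keeps at most one edge: the one to the partner with which it shares the highest mutual information, provided that value passes a cutoff. The mutual information values are invented, the fixed `threshold` stands in for whatever significance test the authors use, and `max_mi_core` is a hypothetical name; this is not the published implementation.

```python
def max_mi_core(genes, mi, threshold):
    """Keep, for each gene, only its strongest mutual-information edge.

    genes     -- list of gene identifiers
    mi        -- dict mapping frozenset({g, h}) to a mutual information value
    threshold -- assumed cutoff; edges at or below it are ignored
    Returns a set of undirected edges, each a frozenset of two genes.
    """
    edges = set()
    for g in genes:
        best_partner, best_value = None, threshold
        for h in genes:
            if h == g:
                continue
            value = mi.get(frozenset((g, h)), 0.0)
            if value > best_value:
                best_partner, best_value = h, value
        if best_partner is not None:
            # Two genes that pick each other contribute a single edge.
            edges.add(frozenset((g, best_partner)))
    return edges


# Made-up pairwise mutual information values for four hypothetical genes.
genes = ["a", "b", "c", "d"]
mi = {
    frozenset(("a", "b")): 0.90,
    frozenset(("a", "c")): 0.20,
    frozenset(("a", "d")): 0.10,
    frozenset(("b", "c")): 0.60,
    frozenset(("b", "d")): 0.05,
    frozenset(("c", "d")): 0.30,
}

if __name__ == "__main__":
    core = max_mi_core(genes, mi, threshold=0.25)
    print(sorted(tuple(sorted(edge)) for edge in core))
```

Because every gene contributes at most one edge, the result has at most as many edges as genes, which is consistent with the "conservative core" in the paper's title.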
|
146
|
Zhang M, Lu LJ. Investigating the validity of current network analysis on static conglomerate networks by protein network stratification. BMC Bioinformatics 2010; 11:466. [PMID: 20846443 PMCID: PMC2949894 DOI: 10.1186/1471-2105-11-466] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/02/2010] [Accepted: 09/16/2010] [Indexed: 01/25/2023] Open
Abstract
Background A molecular network perspective forms the foundation of systems biology. A common practice in analyzing protein-protein interaction (PPI) networks is to perform network analysis on a conglomerate network that is an assembly of all available binary interactions in a given organism from diverse data sources. Recent studies on network dynamics suggested that this approach might have ignored the dynamic nature of context-dependent molecular systems. Results In this study, we employed a network stratification strategy to investigate the validity of the current network analysis on conglomerate PPI networks. Using the genome-scale tissue- and condition-specific proteomics data in Arabidopsis thaliana, we present here the first systematic investigation into this question. We stratified a conglomerate A. thaliana PPI network into three levels of context-dependent subnetworks. We then focused on three types of most commonly conducted network analyses, i.e., topological, functional and modular analyses, and compared the results from these network analyses on the conglomerate network and five stratified context-dependent subnetworks corresponding to specific tissues. Conclusions We found that the results based on the conglomerate PPI network are often significantly different from those of context-dependent subnetworks corresponding to specific tissues or conditions. This conclusion depends neither on relatively arbitrary cutoffs (such as those defining network hubs or bottlenecks), nor on specific network clustering algorithms for module extraction, nor on the possible high false positive rates of binary interactions in PPI networks. We also found that our conclusions are likely to be valid in human PPI networks. Furthermore, network stratification may help resolve many controversies in current research of systems biology.
Affiliation(s)
- Minlu Zhang
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH 45229, USA
|
147
|
Evolution of gene regulatory networks by fluctuating selection and intrinsic constraints. PLoS Comput Biol 2010; 6. [PMID: 20700492 PMCID: PMC2916849 DOI: 10.1371/journal.pcbi.1000873] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 02/26/2009] [Accepted: 06/30/2010] [Indexed: 11/23/2022] Open
Abstract
Various characteristics of complex gene regulatory networks (GRNs) have been discovered during the last decade, e.g., redundancy, exponential indegree distributions, scale-free outdegree distributions, mutational robustness, and evolvability. Although progress has been made in this field, it is not well understood whether these characteristics are the direct products of selection or those of other evolutionary forces such as mutational biases and biophysical constraints. To elucidate the causal factors that promoted the evolution of complex GRNs, we examined the effect of fluctuating environmental selection and some intrinsic constraining factors on GRN evolution by using an individual-based model. We found that the evolution of complex GRNs is remarkably promoted by fixation of beneficial gene duplications under unpredictably fluctuating environmental conditions and that some internal factors inherent in organisms, such as mutational bias, gene expression costs, and constraints on expression dynamics, are also important for the evolution of GRNs. The results indicate that various biological properties observed in GRNs could evolve as a result of not only adaptation to unpredictable environmental changes but also non-adaptive processes owing to the properties of the organisms themselves. Our study emphasizes that evolutionary models considering such intrinsic constraining factors should be used as null models to analyze the effect of selection on GRN evolution.
Various organismal traits, including the morphology of multicellular species and metabolism in unicellular species, are determined by the amount and combinations of proteins in the cell. The complex regulatory network plays an important role in controlling the protein profiles in a cell. Recent studies have revealed that gene regulatory networks have many interesting structural and mutational features such as their scale-free structure, mutational robustness, and evolvability. However, why and how these features have emerged from evolution is unknown. In this paper, we constructed an evolutionary model of gene regulatory networks and simulated its evolution under various environmental conditions. The results show that most features of known gene regulatory networks evolve as a result of adaptation to unpredictable environmental fluctuations. In addition, some internal organismal factors, such as mutational bias, gene expression costs, and constraints on expression dynamics, are also important for GRN evolution observed in real organisms. Thus, these GRN features appear to evolve as a result of not only adaptation to unpredictable environmental changes but also non-adaptive processes owing to the properties of the organisms themselves.
|
148
|
Chen L, Wang H, Zhang L, Li W, Wang Q, Shang Y, He Y, He W, Li X, Tai J, Li X. Uncovering packaging features of co-regulated modules based on human protein interaction and transcriptional regulatory networks. BMC Bioinformatics 2010; 11:392. [PMID: 20649980 PMCID: PMC2914056 DOI: 10.1186/1471-2105-11-392] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 03/09/2010] [Accepted: 07/22/2010] [Indexed: 01/23/2023] Open
Abstract
Background Network co-regulated modules are believed to have the functionality of packaging multiple biological entities, and can thus be assumed to coordinate many biological functions in their network neighbouring regions. Results Here, we weighted edges of a human protein interaction network and a transcriptional regulatory network to construct an integrated network, and introduced a probabilistic model and a bipartite graph framework to exploit human co-regulated modules and uncover their specific features in packaging different biological entities (genes, protein complexes or metabolic pathways). Finally, we identified 96 human co-regulated modules based on this method, and evaluated its effectiveness by comparing it with four other methods. Conclusions Dysfunctions in co-regulated interactions often occur in the development of cancer. Therefore, we focussed on an example co-regulated module and found that it could integrate a number of cancer-related genes. This was extended to causal dysfunctions of some complexes maintained by several physically interacting proteins, thus coordinating several metabolic pathways that directly underlie cancer.
Affiliation(s)
- Lina Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Hei Longjiang Province, China.
|
149
|
Paixão T, Azevedo RBR. Redundancy and the evolution of cis-regulatory element multiplicity. PLoS Comput Biol 2010; 6:e1000848. [PMID: 20628617 PMCID: PMC2900288 DOI: 10.1371/journal.pcbi.1000848] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 10/22/2009] [Accepted: 06/02/2010] [Indexed: 01/10/2023] Open
Abstract
The promoter regions of many genes contain multiple binding sites for the same transcription factor (TF). One possibility is that this multiplicity evolved through transitional forms showing redundant cis-regulation. To evaluate this hypothesis, we must disentangle the relative contributions of different evolutionary mechanisms to the evolution of binding site multiplicity. Here, we attempt to do this using a model of binding site evolution. Our model considers binding sequences and their interactions with TFs explicitly, and allows us to cast the evolution of gene networks into a neutral network framework. We then test some of the model's predictions using data from yeast. Analysis of the model suggested three candidate nonadaptive processes favoring the evolution of cis-regulatory element redundancy and multiplicity: neutral evolution in long promoters, recombination and TF promiscuity. We find that recombination rate is positively associated with binding site multiplicity in yeast. Our model also indicated that weak direct selection for multiplicity (partial redundancy) can play a major role in organisms with large populations. Our data suggest that selection for changes in gene expression level may have contributed to the evolution of multiple binding sites in yeast. We conclude that the evolution of cis-regulatory element redundancy and multiplicity is impacted by many aspects of the biology of an organism: both adaptive and nonadaptive processes, both changes in cis to binding sites and in trans to the TFs that interact with them, both the functional setting of the promoter and the population genetic context of the individuals carrying them.
TFs regulate gene expression by binding to specific sequences in the promoter regions of their target genes. Promoters often contain multiple copies of the same TF binding sites. How does this multiplicity evolve? One possibility is that individuals with multiple, redundant binding sites have higher fitness. However, nonadaptive processes are also likely to be important. Here, we develop a mathematical model of the evolution of TF binding sites to help us disentangle how different evolutionary mechanisms contribute to the evolution of binding site redundancy and multiplicity. We show that recombination is expected to promote the evolution of multiple binding sites. This prediction is corroborated by genome-wide data from yeast. Another important factor in the evolution of multiplicity predicted in our analysis is TF promiscuity, that is, the ability of a TF to bind to multiple sequences. In addition, our analysis indicated that direct selection can have large effects on the evolution of redundancy and multiplicity. Data from yeast identified selection for changes in expression level as a candidate mechanism for the evolution of multiple binding sites. We conclude that, although selection may play a major role in the evolution of multiplicity in regulatory regions, nonadaptive forces can also lead to high levels of multiplicity.
Affiliation(s)
- Tiago Paixão
- Department of Biology and Biochemistry, University of Houston, Houston, Texas, United States of America
- Ricardo B. R. Azevedo
- Department of Biology and Biochemistry, University of Houston, Houston, Texas, United States of America
|
150
|
Nacher J, Araki N. Structural characterization and modeling of ncRNA–protein interactions. Biosystems 2010; 101:10-9. [DOI: 10.1016/j.biosystems.2010.02.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/13/2009] [Revised: 02/12/2010] [Accepted: 02/15/2010] [Indexed: 12/25/2022]
|