151
Nacher J, Araki N. Structural characterization and modeling of ncRNA–protein interactions. Biosystems 2010; 101:10-9. [DOI: 10.1016/j.biosystems.2010.02.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/13/2009] [Revised: 02/12/2010] [Accepted: 02/15/2010] [Indexed: 12/25/2022]
152
Bailly-Bechet M, Braunstein A, Pagnani A, Weigt M, Zecchina R. Inference of sparse combinatorial-control networks from gene-expression data: a message passing approach. BMC Bioinformatics 2010; 11:355. [PMID: 20587029 PMCID: PMC2909222 DOI: 10.1186/1471-2105-11-355] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 10/30/2009] [Accepted: 06/29/2010] [Indexed: 11/18/2022]
Abstract
Background: Transcriptional gene regulation is one of the most important mechanisms controlling many essential cellular processes, including cell development, cell-cycle control, and the cellular response to variations in environmental conditions. Genes are regulated by transcription factors and other genes/proteins via a complex interconnection network. Such regulatory links may be predicted using microarray expression data, but most regulation models assume transcription factor independence, which leads to spurious links when many genes have highly correlated expression levels.
Results: We propose a new algorithm to infer combinatorial control networks from gene-expression data. Based on a simple model of combinatorial gene regulation, it includes a message-passing approach which avoids explicit sampling over putative gene-regulatory networks. This algorithm is shown to recover the structure of a simple artificial cell-cycle network model for baker's yeast. It is then applied to a large-scale yeast gene expression dataset in order to identify combinatorial regulations, and to a data set of direct medical interest, namely the Pleiotropic Drug Resistance (PDR) network.
Conclusions: The algorithm we designed is able to recover biologically meaningful interactions, as shown by recent experimental results [1]. Moreover, new cases of combinatorial control are predicted, showing how simple models that take this phenomenon into account can lead to informative predictions and allow more putative regulatory interactions to be extracted from microarray databases.
Affiliation(s)
- Marc Bailly-Bechet
- ISI Foundation Viale Settimio Severo 65, Villa Gualino, I-10133 Torino, Italy
153
Algorithms and complexity analyses for control of singleton attractors in Boolean networks. EURASIP Journal on Bioinformatics & Systems Biology 2010:521407. [PMID: 18795107 DOI: 10.1155/2008/521407] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 01/12/2008] [Revised: 04/04/2008] [Accepted: 06/02/2008] [Indexed: 11/17/2022]
Abstract
A Boolean network (BN) is a mathematical model of genetic networks. We propose several algorithms for control of singleton attractors in BNs. We theoretically estimate the average-case time complexities of the proposed algorithms, and confirm them by computer experiments. The results suggest the importance of gene ordering. In particular, setting internal nodes ahead yields shorter computational time than setting external nodes ahead in various types of algorithms. We also present a heuristic algorithm which does not look for the optimal solution but for a solution whose computational time is shorter than that of the exact algorithms.
154
Algorithms for finding small attractors in Boolean networks. EURASIP Journal on Bioinformatics & Systems Biology 2010:20180. [PMID: 18253467 DOI: 10.1155/2007/20180] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.5] [Received: 06/29/2006] [Revised: 11/24/2006] [Accepted: 02/13/2007] [Indexed: 11/18/2022]
Abstract
A Boolean network is a model used to study the interactions between different genes in genetic regulatory networks. In this paper, we present several algorithms using gene ordering and feedback vertex sets to identify singleton attractors and small attractors in Boolean networks. We analyze the average-case time complexities of some of the proposed algorithms. For instance, it is shown that the outdegree-based ordering algorithm for finding singleton attractors works in O(1.19^n) time for K = 2, which is much faster than the naive O(2^n) time algorithm, where n is the number of genes and K is the maximum indegree. We performed extensive computational experiments on these algorithms, which resulted in good agreement with theoretical results. In contrast, we give a simple and complete proof showing that finding an attractor with the shortest period is NP-hard.
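For orientation, the naive O(2^n) baseline mentioned in this abstract can be sketched as follows. This is only an illustration of exhaustive enumeration, not the ordering-based algorithm of the paper; the function name `singleton_attractors` and the three-gene `toy_network` are hypothetical and were chosen for this example.

```python
from itertools import product


def singleton_attractors(update_functions):
    """Return every state x with f(x) == x by checking all 2^n states.

    update_functions[i] maps the full state tuple to the next value (0 or 1)
    of gene i. This is the naive exhaustive search, not the paper's method.
    """
    n = len(update_functions)
    fixed_points = []
    for state in product((0, 1), repeat=n):
        next_state = tuple(f(state) for f in update_functions)
        if next_state == state:
            fixed_points.append(state)
    return fixed_points


# Hypothetical 3-gene network; every gene has at most K = 2 inputs.
toy_network = [
    lambda s: s[1] & s[2],  # gene 0 is on iff genes 1 and 2 are both on
    lambda s: s[0] | s[2],  # gene 1 is on iff gene 0 or gene 2 is on
    lambda s: s[2],         # gene 2 keeps its current value
]

print(singleton_attractors(toy_network))
```

The loop visits all 2^n states, which is what makes this baseline impractical beyond a few dozen genes and motivates the faster average-case algorithms the abstract describes.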
155
The carbon assimilation network in Escherichia coli is densely connected and largely sign-determined by directions of metabolic fluxes. PLoS Comput Biol 2010; 6:e1000812. [PMID: 20548959 PMCID: PMC2883603 DOI: 10.1371/journal.pcbi.1000812] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Received: 12/08/2009] [Accepted: 05/07/2010] [Indexed: 11/30/2022]
Abstract
Gene regulatory networks consist of direct interactions but also include indirect interactions mediated by metabolites and signaling molecules. We describe how these indirect interactions can be derived from a model of the underlying biochemical reaction network, using weak time-scale assumptions in combination with sensitivity criteria from metabolic control analysis. We apply this approach to a model of the carbon assimilation network in Escherichia coli. Our results show that the derived gene regulatory network is densely connected, contrary to what is usually assumed. Moreover, the network is largely sign-determined, meaning that the signs of the indirect interactions are fixed by the flux directions of biochemical reactions, independently of specific parameter values and rate laws. An inversion of the fluxes following a change in growth conditions may, however, affect the signs of the indirect interactions. This leads to a feedback structure that is robust to changes in the kinetic properties of enzymes and at the same time flexible enough to accommodate radical changes in the environment.

The regulation of gene expression is tightly interwoven with metabolism and signal transduction. A realistic view of gene regulatory networks should therefore include not only direct interactions resulting from transcription regulation, but also indirect regulatory interactions mediated by metabolic effectors and signaling molecules. Ignoring these indirect interactions during the analysis of the network dynamics may cause crucial feedback loops to be missed. We present a method for systematically deriving indirect interactions from a model of the underlying biochemical reaction network, using weak time-scale assumptions in combination with sensitivity criteria from metabolic control analysis. This approach leads to novel insights, as exemplified here on the carbon assimilation network of E. coli. We show that the derived gene regulatory network is densely connected, that the signs of the indirect interactions are largely fixed by the direction of metabolic fluxes, and that a change in flux direction may invert the sign of indirect interactions. Therefore the feedback structure of the network is much more complex than usually assumed; it appears robust to changes in the kinetic properties of its components and it can be flexibly rewired when the environment changes.
156
Bhardwaj N, Carson MB, Abyzov A, Yan KK, Lu H, Gerstein MB. Analysis of combinatorial regulation: scaling of partnerships between regulators with the number of governed targets. PLoS Comput Biol 2010; 6:e1000755. [PMID: 20523742 PMCID: PMC2877725 DOI: 10.1371/journal.pcbi.1000755] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 09/21/2009] [Accepted: 03/22/2010] [Indexed: 12/17/2022]
Abstract
Through combinatorial regulation, regulators partner with each other to control common targets and this allows a small number of regulators to govern many targets. One interesting question is that given this combinatorial regulation, how does the number of regulators scale with the number of targets? Here, we address this question by building and analyzing co-regulation (co-transcription and co-phosphorylation) networks that describe partnerships between regulators controlling common genes. We carry out analyses across five diverse species: Escherichia coli to human. These reveal many properties of partnership networks, such as the absence of a classical power-law degree distribution despite the existence of nodes with many partners. We also find that the number of co-regulatory partnerships follows an exponential saturation curve in relation to the number of targets. (For E. coli and Bacillus subtilis, only the beginning linear part of this curve is evident due to arrangement of genes into operons.) To gain intuition into the saturation process, we relate the biological regulation to more commonplace social contexts where a small number of individuals can form an intricate web of connections on the internet. Indeed, we find that the size of partnership networks saturates even as the complexity of their output increases. We also present a variety of models to account for the saturation phenomenon. In particular, we develop a simple analytical model to show how new partnerships are acquired with an increasing number of target genes; with certain assumptions, it reproduces the observed saturation. Then, we build a more general simulation of network growth and find agreement with a wide range of real networks. Finally, we perform various down-sampling calculations on the observed data to illustrate the robustness of our conclusions. 
A regulatory network consists of regulators such as transcription factors or kinases that control the expression or activity of their target genes. Almost always, there are multiple regulators partnering together to control their targets. Compared to more commonplace contexts, these regulators can be thought of as managers in a social or corporate setting controlling their common subordinates. One interesting question that we address here in this study is how the number of governing regulators scales with the number of governed targets. We build and analyze co-regulation (co-transcription and co-phosphorylation) networks that describe partnerships between regulators controlling common genes. We use a simple framework across five species that demonstrate a wide range of evolution: Escherichia coli to human. The analysis reveals many properties of partnership networks and shows that the number of co-regulatory partnerships follows an exponential saturation curve with the number of targets. To gain more intuition, we explore more commonplace contexts and find that exponential saturation relationship also exists in several social networks. Finally, we propose a simple model to explain this relationship that also exists in a simulated evolutionary environment.
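The abstract does not state the functional form of the exponential saturation curve. One common parametrization, assumed here purely for illustration, is the following, where P is the number of co-regulatory partnerships, T is the number of targets, and P_max and T_0 are hypothetical fit parameters (symbols not taken from the source):

```latex
P(T) = P_{\max}\left(1 - e^{-T/T_0}\right),
\qquad
P(T) \approx \frac{P_{\max}}{T_0}\,T \quad \text{for } T \ll T_0 .
```

Under this assumed form, the small-T limit is linear, which is consistent with the remark that only the initial linear part of the curve is visible for E. coli and Bacillus subtilis.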
Affiliation(s)
- Nitin Bhardwaj
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Matthew B. Carson
  - Bioinformatics Program, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Alexej Abyzov
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Koon-Kiu Yan
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Hui Lu
  - Bioinformatics Program, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Mark B. Gerstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
  - Department of Computer Science, Yale University, New Haven, Connecticut, United States of America
157
Altay G, Emmert-Streib F. Revealing differences in gene network inference algorithms on the network level by ensemble methods. Bioinformatics 2010; 26:1738-44. [PMID: 20501553 DOI: 10.1093/bioinformatics/btq259] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 11/15/2022]
Abstract
Motivation: The inference of regulatory networks from large-scale expression data holds great promise because of the potentially causal interpretation of these networks. However, owing to the difficulty of establishing reliable methods based on observational data, knowledge about the possibilities and limitations of such inference methods in this context is so far incomplete.
Results: In this article, we conduct a statistical analysis investigating differences and similarities of four network inference algorithms, ARACNE, CLR, MRNET and RN, with respect to local network-based measures. We employ ensemble methods that allow us to assess the inferability down to the level of individual edges. Our analysis reveals the bias of these inference methods with respect to the inference of various network components and, hence, provides guidance in the interpretation of regulatory networks inferred from expression data. Further, as an application, we predict the total number of regulatory interactions in human B cells and hypothesize about the role of Myc and its targets in molecular information processing.
Affiliation(s)
- Gökmen Altay
- Computational Biology and Machine Learning, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast BT9 7BL, UK
158
Reimand J, Vaquerizas JM, Todd AE, Vilo J, Luscombe NM. Comprehensive reanalysis of transcription factor knockout expression data in Saccharomyces cerevisiae reveals many new targets. Nucleic Acids Res 2010; 38:4768-77. [PMID: 20385592 PMCID: PMC2919724 DOI: 10.1093/nar/gkq232] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.6] [Indexed: 01/22/2023]
Abstract
Transcription factor (TF) perturbation experiments give valuable insights into gene regulation. Genome-scale evidence from microarray measurements may be used to identify regulatory interactions between TFs and targets. Recently, Hu and colleagues published a comprehensive study covering 269 TF knockout mutants for the yeast Saccharomyces cerevisiae. However, the information that can be extracted from this valuable dataset is limited by the method employed to process the microarray data. Here, we present a reanalysis of the original data using improved statistical techniques freely available from the BioConductor project. We identify over 100,000 differentially expressed genes, nine times the total reported by Hu et al. We validate the biological significance of these genes by assessing their functions, the occurrence of upstream TF-binding sites, and the prevalence of protein-protein interactions. The reanalysed dataset outperforms the original across all measures, indicating that we have uncovered a vastly expanded list of relevant targets. In summary, this work presents a high-quality reanalysis that maximizes the information contained in the Hu et al. compendium. The dataset is available from ArrayExpress (accession: E-MTAB-109) and it will be invaluable to any scientist interested in the yeast transcriptional regulatory system.
Affiliation(s)
- Jüri Reimand
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK.
159
Analysis of diverse regulatory networks in a hierarchical context shows consistent tendencies for collaboration in the middle levels. Proc Natl Acad Sci U S A 2010; 107:6841-6. [PMID: 20351254 DOI: 10.1073/pnas.0910867107] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 12/29/2022]
Abstract
Gene regulatory networks have been shown to share some common aspects with commonplace social governance structures. Thus, we can get some intuition into their organization by arranging them into well-known hierarchical layouts. These hierarchies, in turn, can be placed between the extremes of autocracies, with well-defined levels and clear chains of command, and democracies, without such defined levels and with more co-regulatory partnerships between regulators. In general, the presence of partnerships decreases the variation in information flow amongst nodes within a level, more evenly distributing stress. Here we study various regulatory networks (transcriptional, modification, and phosphorylation) for five diverse species, Escherichia coli to human. We specify three levels of regulators (top, middle, and bottom), which collectively govern the non-regulator targets lying in the lowest fourth level. We define quantities for nodes, levels, and entire networks that measure their degree of collaboration and autocratic vs. democratic character. We show individual regulators have a range of partnership tendencies: Some regulate their targets in combination with other regulators in local instantiations of democratic structure, whereas others regulate mostly in isolation, in more autocratic fashion. Overall, we show that in all networks studied the middle level has the highest collaborative propensity and coregulatory partnerships occur most frequently amongst midlevel regulators, an observation that has parallels in corporate settings where middle managers must interact most to ensure organizational effectiveness. There is, however, one notable difference between networks in different species: The amount of collaborative regulation and democratic character increases markedly with overall genomic complexity.
160
Nowick K, Stubbs L. Lineage-specific transcription factors and the evolution of gene regulatory networks. Brief Funct Genomics 2010; 9:65-78. [PMID: 20081217 DOI: 10.1093/bfgp/elp056] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]
Abstract
Nature is replete with examples of diverse cell types, tissues and body plans, forming very different creatures from genomes with similar gene complements. However, while the genes and the structures of proteins they encode can be highly conserved, the production of those proteins in specific cell types and at specific developmental time points might differ considerably between species. A full understanding of the factors that orchestrate gene expression will be essential to fully understand evolutionary variety. Transcription factor (TF) proteins, which form gene regulatory networks (GRNs) to act in cooperative or competitive partnerships to regulate gene expression, are key components of these unique regulatory programs. Although many TFs are conserved in structure and function, certain classes of TFs display extensive levels of species diversity. In this review, we highlight families of TFs that have expanded through gene duplication events to create species-unique repertoires in different evolutionary lineages. We discuss how the hierarchical structures of GRNs allow for flexible small to large-scale phenotypic changes. We survey evidence that explains how newly evolved TFs may be integrated into an existing GRN and how molecular changes in TFs might impact the GRNs. Finally, we review examples of traits that evolved due to lineage-specific TFs and species differences in GRNs.
Affiliation(s)
- Katja Nowick
- Department of Cell and Developmental Biology, Institute for Genomic Biology, University of Illinois, 1206 W. Gregory Drive, Urbana, IL 61802, USA
161
Tuğrul M, Kabakçioğlu A. Anomalies in the transcriptional regulatory network of the yeast Saccharomyces cerevisiae. J Theor Biol 2009; 263:328-36. [PMID: 20004671 DOI: 10.1016/j.jtbi.2009.12.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/21/2009] [Revised: 11/25/2009] [Accepted: 12/02/2009] [Indexed: 10/20/2022]
Abstract
We investigate the structural and dynamical properties of the transcriptional regulatory network of the yeast Saccharomyces cerevisiae and compare it with two "unbiased" ensembles: one obtained by reshuffling the edges and the other generated by mimicking the transcriptional regulation mechanism within the cell. Both ensembles reproduce the degree distributions (the first exactly, by construction, and the second approximately), degree-degree correlations and the k-core structure observed in yeast. An exceptionally large dynamically relevant core network found in yeast in comparison with the second ensemble points to a strong bias towards a collective organization which is achieved by subtle modifications in the network's degree distributions. We use a Boolean model of regulatory dynamics with various classes of update functions to represent in vivo regulatory interactions. We find that the yeast core network behaves qualitatively differently, accommodating on average multiple attractors, unlike typical members of both reference ensembles, which converge to a single dominant attractor. Finally, we investigate the robustness of the networks and find that the stability depends strongly on the function class used. The robustness measure is squeezed into a narrower band around the order-chaos boundary when Boolean inputs are required to be nonredundant on each node. However, the difference between the reference models and the yeast core is marginal, suggesting that the dynamically stable network elements are located mostly on the periphery of the regulatory network. Consistently, the statistically significant three-node motifs in the dynamical core of yeast turn out to be different from and less stable than those found in the full transcriptional regulatory network.
Affiliation(s)
- M Tuğrul
- Department of Physics, Koç University, Sariyer 34450 Istanbul, Turkey
162
Zhang SQ, Ching WK, Tsing NK, Leung HY, Guo D. A new multiple regression approach for the construction of genetic regulatory networks. Artif Intell Med 2009; 48:153-60. [PMID: 19963359 DOI: 10.1016/j.artmed.2009.11.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 08/12/2008] [Revised: 09/04/2009] [Accepted: 09/08/2009] [Indexed: 11/15/2022]
Abstract
Objective: Reconstructing a genetic regulatory network from given time-series gene expression data is an important research topic in systems biology. One of the main difficulties in building a genetic regulatory network is that practical data sets have a huge number of genes but a small number of sampling time points. In this paper, we propose a new linear regression model that may overcome this difficulty in uncovering the regulatory relationships in a genetic network.
Methods: The proposed multiple regression model makes use of the scale-free property of real biological networks. In particular, a filter is constructed by using this scale-free property and some appropriate statistical tests to remove redundant interactions among the genes. A model is then constructed by minimizing the gap between the observed and the predicted data.
Results: Numerical examples based on yeast gene expression data are given to demonstrate that the proposed model fits the practical data very well. Some interesting properties of the genes and the underlying network are also observed.
Conclusions: We propose a new multiple regression model based on the scale-free property of real biological networks for genetic regulatory network inference. Numerical results using a yeast cell cycle gene expression dataset show the effectiveness of our method. We expect that the proposed method can be widely used for genetic network inference using high-throughput gene expression data from various species for systems biology discovery.
Affiliation(s)
- Shu-Qin Zhang
- School of Mathematical Sciences, Fudan University, Shanghai, China
163
Information content based model for the topological properties of the gene regulatory network of Escherichia coli. J Theor Biol 2009; 263:281-94. [PMID: 19962388 DOI: 10.1016/j.jtbi.2009.11.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/03/2009] [Revised: 11/21/2009] [Accepted: 11/23/2009] [Indexed: 11/22/2022]
Abstract
Gene regulatory networks (GRN) are being studied with increasingly precise quantitative tools and can provide a testing ground for ideas regarding the emergence and evolution of complex biological networks. We analyze the global statistical properties of the transcriptional regulatory network of the prokaryote Escherichia coli, identifying each operon with a node of the network. We propose a null model for this network using the content-based approach applied earlier to the eukaryote Saccharomyces cerevisiae (Balcan et al., 2007). Random sequences that represent promoter regions and binding sequences are associated with the nodes. The length distributions of these sequences are extracted from the relevant databases. The network is constructed by testing for the occurrence of binding sequences within the promoter regions. The ensemble of emergent networks yields an exponentially decaying in-degree distribution and a putative power law dependence for the out-degree distribution with a flat tail, in agreement with the data. The clustering coefficient, degree-degree correlation, rich club coefficient and k-core visualization all agree qualitatively with the empirical network to an extent not yet achieved by any other computational model, to our knowledge. The significant statistical differences can point the way to further research into non-adaptive and adaptive processes in the evolution of the E. coli GRN.
164
Bassetti B, Zarei M, Cosentino Lagomarsino M, Bianconi G. Statistical mechanics of the "Chinese restaurant process": lack of self-averaging, anomalous finite-size effects, and condensation. Physical Review E, Statistical, Nonlinear, and Soft Matter Physics 2009; 80:066118. [PMID: 20365242 DOI: 10.1103/physreve.80.066118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/28/2009] [Revised: 09/22/2009] [Indexed: 05/29/2023]
Abstract
The Pitman-Yor, or Chinese restaurant process, is a stochastic process that generates distributions following a power law with exponents lower than 2, as found in numerous physical, biological, technological, and social systems. We discuss its rich behavior with the tools and viewpoint of statistical mechanics. We show that this process invariably gives rise to a condensation, i.e., a distribution dominated by a finite number of classes. We also evaluate thoroughly the finite-size effects, finding that the lack of stationary state and self-averaging of the process creates realization-dependent cutoffs and behavior of the distributions with no equivalent in other statistical mechanical models.
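As a rough illustration of the process this abstract refers to, the standard two-parameter (Pitman-Yor) seating rule can be simulated in a few lines. The sketch assumes the usual textbook formulation, in which the next customer joins an existing table of size n_k with weight n_k - alpha and opens a new table with weight theta + K*alpha (K being the current number of tables). The function name, parameter names, and the chosen values are illustrative and not taken from the paper.

```python
import random


def pitman_yor_tables(n_customers, alpha, theta, seed=None):
    """Simulate table sizes of a two-parameter Chinese restaurant process.

    Assumes 0 <= alpha < 1 and theta > -alpha (standard parameter range).
    Returns a list with the number of customers seated at each table.
    """
    rng = random.Random(seed)
    tables = []
    for _ in range(n_customers):
        if not tables:
            tables.append(1)  # the first customer always opens a table
            continue
        k = len(tables)
        # Weights for the k existing tables, then for opening a new one.
        weights = [size - alpha for size in tables] + [theta + k * alpha]
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            tables.append(1)
        else:
            tables[choice] += 1
    return tables


sizes = pitman_yor_tables(1000, alpha=0.5, theta=1.0, seed=0)
print(len(sizes), sorted(sizes, reverse=True)[:5])
```

Running the simulation with different seeds gives visibly different table-size profiles, which is one informal way to see the realization dependence discussed in the abstract.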
Affiliation(s)
- Bruno Bassetti
- Dipartimento di Fisica, Università degli Studi di Milano, via Celoria 16, Milano, Italy
165
Demongeot J, Ben Amor H, Elena A, Gillois P, Noual M, Sené S. Robustness in regulatory interaction networks. A generic approach with applications at different levels: physiologic, metabolic and genetic. Int J Mol Sci 2009; 10:4437-4473. [PMID: 20057955 PMCID: PMC2790118 DOI: 10.3390/ijms10104437] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 09/15/2009] [Revised: 10/02/2009] [Accepted: 10/14/2009] [Indexed: 12/26/2022]
Abstract
Regulatory interaction networks are often studied from a dynamical standpoint (existence of attractors, study of their stability). We focus here also on their robustness, that is, their ability to offer the same spatiotemporal patterns and to resist external perturbations such as losses of nodes or edges in the network's interaction architecture, changes in their environmental boundary conditions, and changes in the update schedule (or updating mode) of the states of their elements (e.g., if these elements are genes, their synchronous coexpression mode versus their sequential expression). We define the generic notions of boundary, core, and critical vertex or edge of the underlying interaction graph of the regulatory network, whose disappearance causes dramatic changes in the number and nature of attractors (e.g., passage from a bistable behaviour to a unique periodic regime) or in the range of their basins of stability. The dynamic transition of states will be presented in the framework of threshold Boolean automata rules. A panorama of applications at different levels will be given: brain and plant morphogenesis, bulbar cardio-respiratory regulation, glycolytic/oxidative metabolic coupling, and finally cell cycle and feather morphogenesis genetic control.
Affiliation(s)
- Jacques Demongeot
  - Université J. Fourier de Grenoble, TIMC-IMAG, CNRS UMR 5525, Faculté de Médecine, 38700 La Tronche, France
- Hedi Ben Amor
  - Université J. Fourier de Grenoble, TIMC-IMAG, CNRS UMR 5525, Faculté de Médecine, 38700 La Tronche, France
- Adrien Elena
  - Université J. Fourier de Grenoble, TIMC-IMAG, CNRS UMR 5525, Faculté de Médecine, 38700 La Tronche, France
- Pierre Gillois
  - Université J. Fourier de Grenoble, TIMC-IMAG, CNRS UMR 5525, Faculté de Médecine, 38700 La Tronche, France
- Mathilde Noual
  - Université de Lyon, École Normale Supérieure Lyon, LIP, CNRS UMR 5668, 69007 Lyon, France
  - IXXI, Institut rhône-alpin des systèmes complexes, 69007 Lyon, France
- Sylvain Sené
  - Université d’Evry Val d’Essonne, IBISC, CNRS FRE 3190, 91000 Evry, France
  - IXXI, Institut rhône-alpin des systèmes complexes, 69007 Lyon, France
166
Conant GC. Rapid reorganization of the transcriptional regulatory network after genome duplication in yeast. Proc Biol Sci 2009; 277:869-76. [PMID: 19923128 DOI: 10.1098/rspb.2009.1592] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
I study the reorganization of the yeast transcriptional regulatory network after whole-genome duplication (WGD). Individual transcription factors (TFs) were computationally removed from the regulatory network, and the resulting networks were analysed. TF gene pairs that survive in duplicate from WGD show detectable redundancy as a result of that duplication. However, in most other respects, these duplicated TFs are indistinguishable from other TFs in the genome, suggesting that the duplicate TFs produced by WGD were rapidly diverted to distinct functional roles in the regulatory network. Separately, I find that genes targeted by many TFs appear to be preferentially retained in duplicate after WGD, an effect I attribute to selection to maintain dosage balance in the regulatory network after WGD.
Affiliation(s)
- Gavin C Conant
- Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA.
|
167
|
Zeng T, Li J. Maximization of negative correlations in time-course gene expression data for enhancing understanding of molecular pathways. Nucleic Acids Res 2009; 38:e1. [PMID: 19854949 PMCID: PMC2800212 DOI: 10.1093/nar/gkp822] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 01/13/2023] Open
Abstract
Positive correlation can be diversely instantiated as shifting, scaling or geometric patterns, and it has been extensively explored for time-course gene expression data and pathway analysis. Recently, a trend has emerged in biological studies that focuses on the notion of negative correlations, such as opposite expression patterns, complementary patterns and self-negative regulation of transcription factors (TFs). These biological ideas and preliminary observations motivate us to formulate and investigate the problem of maximizing negative correlations. The objective is to discover all maximal negative correlations of statistical and biological significance from time-course gene expression data, in order to enhance our understanding of molecular pathways. Given a gene expression matrix, a maximal negative correlation is defined as an activation–inhibition two-way expression pattern (AIE pattern). We propose a parameter-free algorithm to enumerate the complete set of AIE patterns from a data set. This algorithm can identify significant negative correlations that cannot be identified by traditional clustering/biclustering methods. To demonstrate the biological usefulness of AIE patterns in the analysis of molecular pathways, we conducted in-depth case studies of AIE patterns identified from yeast cell cycle data sets. In particular, in the analysis of the lysine biosynthesis pathway, new regulation modules and pathway components were inferred from a significant negative correlation that is likely caused by co-regulation of the TFs at the higher layer of the biological network. We conjecture that maximal negative correlations between genes are a common characteristic of molecular pathways, one that can provide insights into cell stress response studies, drug response evaluation, etc.
Affiliation(s)
- Tao Zeng
- School of Computer Engineering & Bioinformatics Research Center, Nanyang Technological University, Singapore
|
168
|
Kim H, Lee JK, Park T. Inference of large-scale gene regulatory networks using regression-based network approach. J Bioinform Comput Biol 2009; 7:717-35. [PMID: 19634200 DOI: 10.1142/s0219720009004278] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 11/13/2008] [Revised: 03/16/2009] [Accepted: 03/17/2009] [Indexed: 11/18/2022]
Abstract
Gene regulatory network modeling plays a key role in the search for relationships among genes. Many modeling approaches have been introduced to find causal relationships between genes using time-series microarray data. However, they suffer from high dimensionality, overfitting, and heavy computation time. Further, there is no guarantee that the model selected from several competing models is actually the best one. In this study, we propose a simple procedure for constructing large-scale gene regulatory networks using a regression-based network approach. We determine the optimal out-degree of the network structure by using the sum of squared coefficients obtained from all appropriate regression models. Using simulated data, accuracy of estimation and robustness against noise are computed and compared with those of the vector autoregressive regression model. Our method shows high accuracy and robustness for inferring large-scale gene networks. It is also applied to Caulobacter crescentus cell cycle data consisting of 1472 genes. The results show that many genes are regulated by two transcription factors, ctrA and gcrA, that are known to be global regulators.
Affiliation(s)
- Haseong Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, San 56-1, Shilim-dong, Korea.
|
169
|
Acencio ML, Lemke N. Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information. BMC Bioinformatics 2009; 10:290. [PMID: 19758426 PMCID: PMC2753850 DOI: 10.1186/1471-2105-10-290] [Citation(s) in RCA: 101] [Impact Index Per Article: 6.7] [Received: 10/31/2008] [Accepted: 09/16/2009] [Indexed: 11/21/2022] Open
Abstract
Background The identification of essential genes is important for understanding the minimal requirements for cellular life and for practical purposes, such as drug design. However, the experimental techniques for essential gene discovery are labor-intensive and time-consuming. Considering these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present here a machine learning-based computational approach relying on network topological features, cellular localization and biological process information for the prediction of essential genes. Results We constructed a decision tree-based meta-classifier and trained it on datasets with individual and grouped attributes (network topological features, cellular compartments and biological processes) to generate various predictors of essential genes. We showed that the predictors with better performance are those generated from datasets with integrated attributes. Using all attributes, i.e., network topological features, cellular compartments and biological processes, we obtained the best predictor of essential genes, which was then used to classify yeast genes with unknown essentiality status. Finally, we generated decision trees by training the J48 algorithm on datasets with all network topological features, cellular localization and biological process information to discover cellular rules for essentiality. We found that the number of protein physical interactions, the nuclear localization of proteins and the number of regulating transcription factors are the most important factors determining gene essentiality. Conclusion We were able to demonstrate that network topological features, cellular localization and biological process information are reliable predictors of essential genes. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing essentiality.
Affiliation(s)
- Marcio L Acencio
- Department of Physics and Biophysics, São Paulo State University, Botucatu, São Paulo, Brazil.
|
170
|
Ciandrini L, Maffi C, Motta A, Bassetti B, Cosentino Lagomarsino M. Feedback topology and XOR-dynamics in Boolean networks with varying input structure. Phys Rev E Stat Nonlin Soft Matter Phys 2009; 80:026122. [PMID: 19792215 DOI: 10.1103/physreve.80.026122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2008] [Revised: 05/28/2009] [Indexed: 05/28/2023]
Abstract
We analyze a model of fixed in-degree random Boolean networks in which the fraction of input-receiving nodes is controlled by the parameter gamma. We investigate analytically and numerically the dynamics of graphs under a parallel XOR updating scheme. This scheme is interesting because it is accessible analytically and its phenomenology is at the same time under control and as rich as that of general Boolean networks. We give analytical formulas for the dynamics on general graphs, showing that with an XOR-type evolution rule, dynamic features are direct consequences of the topological feedback structure, in analogy with the role of relevant components in Kauffman networks. Considering graphs with fixed in-degree, we characterize analytically and numerically the feedback regions using graph decimation algorithms (Leaf Removal). As gamma varies, this graph ensemble shows a phase transition that separates a treelike graph region from one in which feedback components emerge. Networks near the transition point have feedback components made of disjoint loops, in which each node has exactly one incoming and one outgoing link. Using this fact, we provide analytical estimates of the maximum period starting from topological considerations.
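The link between disjoint feedback loops and the period of the dynamics can be checked on a toy case. The following is a minimal sketch, assuming that nodes without inputs simply keep their state; the wiring `inputs`, the initial state `start` and the function names are hypothetical and not taken from the paper.

```python
from functools import reduce
from math import gcd

def xor_step(state, inputs):
    """Parallel update: each node becomes the XOR of its input nodes.

    Assumption: a node with no inputs keeps its current state.
    """
    return tuple(
        reduce(lambda a, b: a ^ b, (state[j] for j in inputs[i])) if inputs[i] else state[i]
        for i in range(len(state))
    )

def period(state, inputs):
    """Length of the cycle eventually reached from the given initial state."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = xor_step(state, inputs)
        t += 1
    return t - seen[state]

def lcm(a, b):
    return a * b // gcd(a, b)

# inputs[i] lists the nodes feeding node i.
# Two disjoint loops: 0 -> 1 -> 0 (length 2) and 2 -> 3 -> 4 -> 2 (length 3).
inputs = [[1], [0], [4], [2], [3]]
start = (1, 0, 1, 0, 0)

observed = period(start, inputs)
```

With one input per node the XOR rule reduces to copying, so each loop rotates its states and the observed period here equals the least common multiple of the loop lengths, `lcm(2, 3)`.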
Affiliation(s)
- L Ciandrini
- Dip. di Fisica Nucleare e Teorica, Università di Pavia, Via Bassi 6, 27100 Pavia, Italy.
|
171
|
Nicolau M, Schoenauer M. On the evolution of scale-free topologies with a gene regulatory network model. Biosystems 2009; 98:137-48. [PMID: 19577613 DOI: 10.1016/j.biosystems.2009.06.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 01/01/2009] [Revised: 06/20/2009] [Accepted: 06/27/2009] [Indexed: 11/30/2022]
Abstract
A novel approach to generating scale-free network topologies is introduced, based on an existing artificial gene regulatory network model. From this model, different interaction networks can be extracted, based on an activation threshold. By using an evolutionary computation approach, the model is allowed to evolve, in order to reach specific network statistical measures. The results obtained show that, when the model uses a duplication and divergence initialisation, such as seen in nature, the resulting regulation networks not only are closer in topology to scale-free networks, but also require only a few evolutionary cycles to achieve a satisfactory error value.
Affiliation(s)
- Miguel Nicolau
- Projet TAO - INRIA Saclay - Ile-de-France, LRI - Université Paris-Sud, Orsay, France.
|
172
|
Bhardwaj N, Lu H. Co-expression among constituents of a motif in the protein-protein interaction network. J Bioinform Comput Biol 2009; 7:1-17. [PMID: 19226657 DOI: 10.1142/s0219720009003959] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/11/2008] [Revised: 09/19/2008] [Accepted: 09/22/2008] [Indexed: 11/18/2022]
Abstract
Almost all cellular functions are the results of well-coordinated interactions between various proteins. A more connected hub or motif in the interaction network is expected to be more important, and any perturbation in this motif would be more damaging to the smooth performance of the related functions. Thus, some coherent robustness of these hubs has to be derived. Here, we provide the global evidence that interaction hubs obtain their robustness against uneven protein concentrations through co-expression of the constituents, and that the degree of co-expression correlates strongly with the complexity of the embedded motif. We calculated the gene expression correlations between the proteins embedded in 3-, 4-, 5-, and 6-node interaction motifs of increasing complexities, and compared them to those between proteins from random motifs of similar complexities. We find that as the connectedness of these motifs increases, there is higher co-expression between the constituent proteins. For example, when the expression correlation is 0.7, the kernel density of the correlation increases from 0.152 for 4-node motifs with three edges to 0.403 for 4-node cliques. This implies that the robustness of the interaction system emerges from a proportionate synchronicity among the constituents of the motif via co-expression. We further show that such biological coherence via co-expression of component proteins can be reinforced by integrating conservation data in the analysis. For example, with addition of evolutionary information from other genomes, the ratio of kernel density for interaction and random data in the case of 5- and 6-node cliques in yeast increases from 37.8 to 123 and 98.4 to 1300, respectively, given that the expression correlation is 0.8. Our results show that genes whose products are involved in motifs have transcription and translation properties that minimize the noise in final protein concentrations, compared to random sets of genes.
Affiliation(s)
- Nitin Bhardwaj
- Bioinformatics Program, University of Illinois at Chicago, 820 S. Woods Street, Room 103, Chicago, IL 60607, USA.
|
173
|
Bachman P, Liu Y. Structure discovery in PPI networks using pattern-based network decomposition. Bioinformatics 2009; 25:1814-21. [PMID: 19447784 DOI: 10.1093/bioinformatics/btp297] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION The large, complex networks of interactions between proteins provide a lens through which one can examine the structure and function of biological systems. Previous analyses of these continually growing networks have primarily followed either of two approaches: large-scale statistical analysis of holistic network properties, or small-scale analysis of local topological features. Meanwhile, investigation of meso-scale network structure (above that of individual functional modules, while maintaining the significance of individual proteins) has been hindered by the computational complexity of structural search in networks. Examining protein-protein interaction (PPI) networks at the meso-scale may provide insights into the presence and form of relationships between individual protein complexes and functional modules. RESULTS In this article, we present an efficient algorithm for performing sub-graph isomorphism queries on a network and show its computational advantage over previous methods. We also present a novel application of this form of topological search which permits analysis of a network's structure at a scale between that of individual functional modules and that of network-wide properties. This analysis provides support for the presence of hierarchical modularity in the PPI network of Saccharomyces cerevisiae.
Affiliation(s)
- Philip Bachman
- Department of Computer Science and Department of Molecular Biology, University of Texas at Dallas, Richardson, TX 75083-0688, USA
|
174
|
Evolutionary rates and centrality in the yeast gene regulatory network. Genome Biol 2009; 10:R35. [PMID: 19358738 PMCID: PMC2688926 DOI: 10.1186/gb-2009-10-4-r35] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Received: 10/09/2008] [Accepted: 04/09/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Transcription factors play a fundamental role in regulating physiological responses and developmental processes. Here we examine the evolution of the yeast transcription factors in the context of the structure of the gene regulatory network. RESULTS In contrast to previous results for the protein-protein interaction and metabolic networks, we find that the position of a gene within the transcription network affects the rate of protein evolution such that more central transcription factors tend to evolve faster. Centrality is also positively correlated with expression variability, suggesting that the higher rate of divergence among central transcription factors may be due to their role in controlling information flow and may be the result of adaptation to changing environmental conditions. Alternatively, more central transcription factors could be more buffered against environmental perturbations and, therefore, less subject to strong purifying selection. Importantly, the relationship between centrality and evolutionary rates is independent of expression level, expression variability and gene essentiality. CONCLUSIONS Our analysis of the transcription network highlights the role of network structure on protein evolutionary rate. Further, the effect of network centrality on nucleotide divergence is different among the metabolic, protein-protein and transcriptional networks, suggesting that the effect of gene position is dependent on the function of the specific network under study. A better understanding of how these three cellular networks interact with one another may be needed to fully examine the impact of network structure on the function and evolution of biological systems.
|
175
|
Stewart AJ, Seymour RM, Pomiankowski A. Degree dependence in rates of transcription factor evolution explains the unusual structure of transcription networks. Proc Biol Sci 2009; 276:2493-501. [PMID: 19364737 DOI: 10.1098/rspb.2009.0210] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
Transcription networks have an unusual structure. In both prokaryotes and eukaryotes, the number of target genes regulated by each transcription factor, its out-degree, follows a broad tailed distribution. By contrast, the number of transcription factors regulating a target gene, its in-degree, follows a much narrower distribution, which has no broad tail. We constructed a model of transcription network evolution through trans- and cis-mutations, gene duplication and deletion. The effects of these different evolutionary processes on the network structure are enough to produce an asymmetrical in- and out-degree distribution. However, the parameter values required to replicate known in- and out-degree distributions are unrealistic. We then considered variation in the rate of evolution of a gene dependent upon its position in the network. When transcription factors with many regulatory interactions are constrained to evolve more slowly than those with few interactions, the details of the in- and out-degree distributions of transcription networks can be fully reproduced over a range of plausible parameter values. The networks produced by our model depend on the relative rates of the different evolutionary processes. By determining the circumstances under which the networks with the correct degree distributions are produced, we are able to assess the relative importance of the different evolutionary processes in our model during evolution.
Affiliation(s)
- Alexander J Stewart
- CoMPLEX, University College London, Physics Building, Gower Street, London WC1E 6BT, UK.
|
176
|
Baralla A, Mentzen WI, De La Fuente A. Inferring Gene Networks: Dream or Nightmare? Ann N Y Acad Sci 2009; 1158:246-56. [DOI: 10.1111/j.1749-6632.2008.04099.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]
|
177
|
Balaji S, Iyer LM, Babu MM, Aravind L. Comparison of transcription regulatory interactions inferred from high-throughput methods: what do they reveal? Trends Genet 2008; 24:319-23. [PMID: 18514968 DOI: 10.1016/j.tig.2008.04.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 12/11/2007] [Revised: 04/15/2008] [Accepted: 04/16/2008] [Indexed: 11/17/2022]
Abstract
We compared the transcription regulatory interactions inferred from three high-throughput methods. Because these methods use different principles, they have few interactions in common, suggesting they capture distinct facets of the transcription regulatory program. We show that these methods uncover disparate biological phenomena: long-range interactions between telomeres and transcription factors, downstream effects of interference with ribosome biogenesis and a protein-aggregation response. Through a detailed analysis of the latter, we predict components of the system responding to protein-aggregation stress.
Affiliation(s)
- S Balaji
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894, USA.
|
178
|
Gerlee P, Lundh T, Zhang B, Anderson ARA. Gene divergence and pathway duplication in the metabolic network of yeast and digital organisms. J R Soc Interface 2009; 6:1233-45. [PMID: 19324678 DOI: 10.1098/rsif.2008.0514] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
We have studied the metabolic gene-function network in yeast and digital organisms evolved in the artificial life platform Avida. The gene-function network is a bipartite network in which a link exists between a gene and a function (pathway) if that function depends on that gene, and can also be viewed as a decomposition of the more traditional functional gene networks, where two genes are linked if they share any function. We show that the gene-function network exhibits two distinct degree distributions: the gene degree distribution is scale-free while the pathway distribution is exponential. This is true for both yeast and digital organisms, which suggests that this is a general property of evolving systems, and we propose that the scale-free gene degree distribution is due to pathway duplication, i.e. the development of a new pathway where the original function is still retained. Pathway duplication would serve as preferential attachment for the genes, and the experiments with Avida revealed precisely this; genes involved in many pathways are more likely to increase their connectivity. Measuring the overlap between different pathways, in terms of the genes that constitute them, showed that pathway duplication also is a likely mechanism in yeast evolution. This analysis sheds new light on the evolution of genes and functionality, and suggests that function duplication could be an important mechanism in evolution.
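The bipartite gene-function representation described above can be sketched in a few lines. The pathway names and gene labels are hypothetical toy data, and the Jaccard index is only one possible way to quantify pathway overlap; the abstract does not state which measure was used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy gene-function network: pathway -> genes it depends on.
pathway_genes = {
    "glycolysis": {"g1", "g2", "g3"},
    "tca": {"g2", "g4"},
    "ppp": {"g2", "g3"},
}

def gene_degrees(network):
    """Number of pathways each gene participates in."""
    deg = defaultdict(int)
    for genes in network.values():
        for g in genes:
            deg[g] += 1
    return dict(deg)

def pathway_degrees(network):
    """Number of genes each pathway depends on."""
    return {p: len(genes) for p, genes in network.items()}

def project_to_genes(network):
    """Functional gene network: two genes are linked if they share any pathway."""
    edges = set()
    for genes in network.values():
        edges.update(combinations(sorted(genes), 2))
    return edges

def pathway_overlap(network, a, b):
    """Jaccard index of the gene sets of two pathways (an assumed overlap measure)."""
    shared = network[a] & network[b]
    union = network[a] | network[b]
    return len(shared) / len(union)

g_deg = gene_degrees(pathway_genes)
p_deg = pathway_degrees(pathway_genes)
gene_edges = project_to_genes(pathway_genes)
overlap = pathway_overlap(pathway_genes, "glycolysis", "ppp")
```

Keeping the bipartite form separates the two degree distributions (per gene and per pathway), which the projected gene-gene network mixes together.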
Affiliation(s)
- P Gerlee
- Center for Models of Life, Niels Bohr Institute, Blegdamsvej 17, 2100 Copenhagen, Denmark.
|
179
|
Negative autoregulation linearizes the dose-response and suppresses the heterogeneity of gene expression. Proc Natl Acad Sci U S A 2009; 106:5123-8. [PMID: 19279212 DOI: 10.1073/pnas.0809901106] [Citation(s) in RCA: 218] [Impact Index Per Article: 14.5] [Indexed: 11/18/2022] Open
Abstract
Although several recent studies have focused on gene autoregulation, the effects of negative feedback (NF) on gene expression are not fully understood. Our purpose here was to determine how the strength of NF regulation affects the characteristics of gene expression in yeast cells harboring chromosomally integrated transcriptional cascades that consist of the yEGFP reporter controlled by (i) the constitutively expressed tetracycline repressor TetR or (ii) TetR repressing its own expression. Reporter gene expression in the cascade without feedback showed a steep (sigmoidal) dose-response and a wide, nearly bimodal yEGFP distribution, giving rise to a noise peak at intermediate levels of induction. We developed computational models that reproduced the steep dose-response and the noise peak and predicted that negative autoregulation changes reporter expression from bimodal to unimodal and transforms the dose-response from sigmoidal to linear. Prompted by these predictions, we constructed a "linearizer" circuit by adding TetR autoregulation to our original cascade and observed a massive (7-fold) reduction of noise at intermediate induction and linearization of dose-response before saturation. A simple mathematical argument explained these findings and indicated that linearization is highly robust to parameter variations. These findings have important implications for gene expression control in eukaryotic cells, including the design of synthetic expression systems.
|
180
|
Bhadra S, Bhattacharyya C, Chandra NR, Mian IS. A linear programming approach for estimating the structure of a sparse linear genetic network from transcript profiling data. Algorithms Mol Biol 2009; 4:5. [PMID: 19239685 PMCID: PMC2654898 DOI: 10.1186/1748-7188-4-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/30/2008] [Accepted: 02/24/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graph representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. RESULTS The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and finally solved by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs are assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisiae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law, suggesting that its degree distribution is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. CONCLUSION A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational-experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.
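For orientation, one standard way to turn an l1-type sparse regression into a linear program is sketched below. It assumes an absolute-error loss with slack variables and a trade-off constant C; the symbols y_k, x_k, w^+, w^-, xi_k and C are introduced here for illustration, and the exact formulation used in the paper may differ.

```latex
% Regress the profile y of one gene on the profiles x_k of the other genes,
% writing the weight vector as w = w^+ - w^- with w^+, w^- >= 0.
\begin{aligned}
\min_{w^{+},\, w^{-},\, \xi} \quad & \sum_{j} \left( w^{+}_{j} + w^{-}_{j} \right) + C \sum_{k} \xi_{k} \\
\text{subject to} \quad & -\xi_{k} \;\le\; y_{k} - x_{k}^{\top} \left( w^{+} - w^{-} \right) \;\le\; \xi_{k}, \qquad k = 1, \dots, m, \\
& w^{+} \ge 0, \quad w^{-} \ge 0, \quad \xi \ge 0 .
\end{aligned}
```

At an optimum at most one of w^+_j and w^-_j is nonzero, so the first sum equals the l1 norm of w, and both the objective and the constraints are linear. An undirected edge would then be drawn between the regressed gene and every gene j with a nonzero weight.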
|
181
|
Balleza E, López-Bojorquez LN, Martínez-Antonio A, Resendis-Antonio O, Lozada-Chávez I, Balderas-Martínez YI, Encarnación S, Collado-Vides J. Regulation by transcription factors in bacteria: beyond description. FEMS Microbiol Rev 2009; 33:133-51. [PMID: 19076632 PMCID: PMC2704942 DOI: 10.1111/j.1574-6976.2008.00145.x] [Citation(s) in RCA: 133] [Impact Index Per Article: 8.9] [Indexed: 01/07/2023] Open
Abstract
Transcription is an essential step in gene expression and its understanding has been one of the major interests in molecular and cellular biology. By precisely tuning gene expression, transcriptional regulation determines the molecular machinery for developmental plasticity, homeostasis and adaptation. In this review, we convey the main ideas or concepts behind regulation by transcription factors and give just enough examples to support these main ideas, thus avoiding a classical enumeration of facts. We review recent concepts and developments: cis elements and trans regulatory factors, chromosome organization and structure, transcriptional regulatory networks (TRNs) and transcriptomics. We also summarize new important discoveries that will probably affect the direction of research in gene regulation: epigenetics and stochasticity in transcriptional regulation, synthetic circuits and plasticity and evolution of TRNs. Many of the new discoveries in gene regulation are not extensively tested with wetlab approaches. Consequently, we review this broad area in Inference of TRNs and Dynamical Models of TRNs. Finally, we have stepped backwards to trace the origins of these modern concepts, synthesizing their history in a timeline schema.
Affiliation(s)
- Enrique Balleza
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
|
182
|
Lagomarsino MC, Bassetti B, Castellani G, Remondini D. Functional models for large-scale gene regulation networks: realism and fiction. Mol Biosyst 2009; 5:335-44. [PMID: 19396369 DOI: 10.1039/b816841p] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
Abstract
High-throughput experiments are shedding light on the topology of large regulatory networks and at the same time their functional states, namely the states of activation of the nodes (for example transcript or protein levels) in different conditions, times, environments. We now possess a certain amount of information about these two levels of description, stored in libraries, databases and ontologies. A current challenge is to bridge the gap between topology and function, i.e. developing quantitative models aimed at characterizing the expression patterns of large sets of genes. However, approaches that work well for small networks become impossible to master at large scales, mainly because parameters proliferate. In this review we discuss the state of the art of large-scale functional network models, addressing the issue of what can be considered as "realistic" and what the main limitations may be. We also show some directions for future work, trying to set the goals that future models should try to achieve. Finally, we will emphasize the possible benefits in the understanding of biological mechanisms underlying complex multifactorial diseases, and in the development of novel strategies for the description and the treatment of such pathologies.
|
183
|
Chen TY, Ho JWK, Liu H, Xie X. An innovative approach for testing bioinformatics programs using metamorphic testing. BMC Bioinformatics 2009; 10:24. [PMID: 19152705 PMCID: PMC2657898 DOI: 10.1186/1471-2105-10-24] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2008] [Accepted: 01/19/2009] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Recent advances in experimental and computational technologies have fueled the development of many sophisticated bioinformatics programs. The correctness of such programs is crucial, as incorrectly computed results may lead to wrong biological conclusions or misguided downstream experimentation. Common software testing procedures involve executing the target program with a set of test inputs and then verifying the correctness of the test outputs. However, due to the complexity of many bioinformatics programs, it is often difficult to verify the correctness of the test outputs. Therefore our ability to perform systematic software testing is greatly hindered. RESULTS We propose to use a novel software testing technique, metamorphic testing (MT), to test a range of bioinformatics programs. Instead of requiring a mechanism to verify whether an individual test output is correct, the MT technique verifies whether a pair of test outputs conforms to a set of domain-specific properties, called metamorphic relations (MRs), thus greatly increasing the number and variety of test cases that can be applied. To demonstrate how MT is used in practice, we applied MT to test two open-source bioinformatics programs, namely GNLab and SeqMap. In particular, we show that MT is simple to implement and is effective in detecting faults in a real-life program and some artificially fault-seeded programs. Further, we discuss how MT can be applied to test programs from various domains of bioinformatics. CONCLUSION This paper describes the application of a simple, effective and automated technique to systematically test a range of bioinformatics programs. We show how MT can be implemented in practice through two real-life case studies. Since many bioinformatics programs, particularly those for large-scale simulation and data analysis, are hard to test systematically, their developers may benefit from using MT as part of the testing strategy. Therefore our work represents a significant step towards software reliability in bioinformatics.
Affiliation(s)
- Tsong Yueh Chen
- School of Information Technologies, The University of Sydney, Sydney, NSW 2006, Australia.
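The idea of checking a pair of outputs against a metamorphic relation, rather than checking one output against a known answer, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `map_reads` is a toy exact-match mapper standing in for a real tool (it is not SeqMap's interface), and the two relations are plausible examples, not necessarily the ones used in the paper.

```python
import random


def map_reads(genome, reads):
    """Toy exact-match read mapper (hypothetical stand-in for a real mapper)."""
    hits = {}
    for read in reads:
        positions = []
        start = genome.find(read)
        while start != -1:
            positions.append(start)
            start = genome.find(read, start + 1)
        hits[read] = positions
    return hits


def check_permutation_relation(genome, reads, seed=0):
    """MR 1 (assumed): shuffling the input reads must not change the mapping."""
    shuffled = reads[:]
    random.Random(seed).shuffle(shuffled)
    return map_reads(genome, reads) == map_reads(genome, shuffled)


def check_extension_relation(genome, reads, extra_read):
    """MR 2 (assumed): adding one read must not change hits of the other reads."""
    before = map_reads(genome, reads)
    after = map_reads(genome, reads + [extra_read])
    return all(after[read] == before[read] for read in reads)


if __name__ == "__main__":
    genome = "ACGTACGTTAGC"
    reads = ["ACG", "TAG", "GGG"]
    print(check_permutation_relation(genome, reads))
    print(check_extension_relation(genome, reads, "CGT"))
```

Neither check needs to know the correct mapping positions in advance, which is the point of the technique: a violated relation signals a fault even when no test oracle is available.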
|
184
|
Andrecut M, Huang S, Kauffman SA. Heuristic approach to sparse approximation of gene regulatory networks. J Comput Biol 2009; 15:1173-86. [PMID: 18844584 DOI: 10.1089/cmb.2008.0087] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
Determining the structure of the gene regulatory network using the information in genome-wide profiles of mRNA abundance, such as microarray data, poses several challenges. Typically, "static" rather than dynamical profile measurements, such as those taken from steady-state tissues in various conditions, are the starting point. This makes the inference of causal relationships between genes difficult. Moreover, the paucity of samples relative to the number of genes leads to problems such as overfitting and underconstrained regression analysis. Here we present a novel method for the sparse approximation of gene regulatory networks that addresses these issues. It is formulated as a sparse combinatorial optimization problem which has a globally optimal solution in terms of l(0) norm error. In order to seek an approximate solution of the l(0) optimization problem, we consider a heuristic approach based on iterative greedy algorithms. We apply our method to a set of gene expression profiles comprising 24,102 genes measured over 79 human tissues. The inferred network is a signed directed graph and hence predicts causal relationships. It exhibits typical characteristics of regulatory networks of organisms with partially known network topology, such as the average number of inputs per gene as well as the in-degree and out-degree distribution.
Affiliation(s)
- M Andrecut
- Institute for Biocomplexity and Informatics, University of Calgary, Calgary, Alberta, Canada.
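As a rough illustration of what an iterative greedy approach to sparse approximation looks like, the sketch below uses plain matching pursuit, a generic textbook scheme swapped in here for illustration; it is not claimed to be the authors' algorithm. The function name, the column-wise data layout and the stopping rule (`max_regulators`) are all assumptions.

```python
def matching_pursuit(columns, target, max_regulators):
    """Greedily approximate `target` as a sparse signed combination of `columns`.

    columns: list of candidate regulator profiles (one list of floats each)
    target:  expression profile of the gene to explain
    Returns (weights, residual), where weights maps column index -> coefficient.
    """
    residual = list(target)
    weights = {}
    for _ in range(max_regulators):
        best, best_score = None, 0.0
        for j, col in enumerate(columns):
            norm_sq = sum(v * v for v in col)
            if norm_sq == 0.0:
                continue
            dot = sum(c * r for c, r in zip(col, residual))
            score = abs(dot) / norm_sq ** 0.5
            if score > best_score:
                best, best_score = j, score
        if best is None:  # nothing correlates with the residual any more
            break
        col = columns[best]
        step = sum(c * r for c, r in zip(col, residual)) / sum(v * v for v in col)
        weights[best] = weights.get(best, 0.0) + step
        residual = [r - step * c for r, c in zip(residual, col)]
    return weights, residual


if __name__ == "__main__":
    profiles = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    weights, residual = matching_pursuit(profiles, [0.0, 3.0, -2.0], 2)
    print(weights, residual)
```

The sign of each coefficient plays the role of an activating or repressing edge, which is how a sparse regression of this kind can yield a signed directed graph.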
|
185
|
Modelling Transcriptional Regulation with a Mixture of Factor Analyzers and Variational Bayesian Expectation Maximization. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2009:601068. [PMID: 19572011 PMCID: PMC3171433 DOI: 10.1155/2009/601068] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/02/2008] [Accepted: 02/27/2009] [Indexed: 11/17/2022]
Abstract
Understanding the mechanisms of gene transcriptional regulation through analysis of high-throughput postgenomic data is one of the central problems of computational systems biology. Various approaches have been proposed, but most of them fail to address at least one of the following objectives: (1) allow for the fact that transcription factors are potentially subject to posttranscriptional regulation; (2) allow for the fact that transcription factors cooperate as a functional complex in regulating gene expression, and (3) provide a model and a learning algorithm with manageable computational complexity. The objective of the present study is to propose and test a method that addresses these three issues. The model we employ is a mixture of factor analyzers, in which the latent variables correspond to different transcription factors, grouped into complexes or modules. We pursue inference in a Bayesian framework, using the Variational Bayesian Expectation Maximization (VBEM) algorithm for approximate inference of the posterior distributions of the model parameters, and estimation of a lower bound on the marginal likelihood for model selection. We have evaluated the performance of the proposed method on three criteria: activity profile reconstruction, gene clustering, and network inference.
|
186
|
Abstract
The availability of completely sequenced genomes and the wealth of literature on gene regulation have enabled researchers to model the transcriptional regulation system of some organisms in the form of a network. In order to reconstruct such networks in non-model organisms, three principal approaches have been taken. First, one can transfer the interactions between homologous components from a model organism to the organism of interest. Second, microarray experiments can be used to detect patterns in gene expression that stem from regulatory interactions. Finally, knowledge of experimentally characterized transcription factor binding sites can be used to analyze the promoter sequences in a genome in order to identify potential binding sites. In this chapter, we will focus in detail on the first approach and describe methods to reconstruct and analyze the transcriptional regulatory networks of uncharacterized organisms by using a known regulatory network as a template.
|
187
|
Sellerio A, Bassetti B, Isambert H, Cosentino Lagomarsino M. A comparative evolutionary study of transcription networks. The global role of feedback and hierarchical structures. MOLECULAR BIOSYSTEMS 2009; 5:170-9. [DOI: 10.1039/b815339f] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
|
188
|
Birmelé E, Elati M, Rouveirol C, Ambroise C. Identification of functional modules based on transcriptional regulation structure. BMC Proc 2008; 2 Suppl 4:S4. [PMID: 19091051 PMCID: PMC2654972 DOI: 10.1186/1753-6561-2-s4-s4] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
Background Identifying gene functional modules is an important step towards elucidating gene functions at a global scale. Clustering algorithms mostly rely on co-expression of genes, that is, they group together genes having similar expression profiles. Results We propose to cluster genes by co-regulation rather than by co-expression. We therefore present an inference algorithm for detecting co-regulated groups from gene expression data and introduce a method to cluster genes given that inferred regulatory structure. Finally, we propose to validate the clustering through a score based on the GO enrichment of the obtained groups of genes. Conclusion We evaluate the methods on S. cerevisiae stress-response data and obtain better scores than with clustering obtained directly from gene expression.
Affiliation(s)
- Etienne Birmelé
- Laboratoire Statistique et Génome, UMR CNRS 8071, INRA 1152, Tour Evry 2, F-91000 Evry, France.
|
189
|
Balázsi G, Heath AP, Shi L, Gennaro ML. The temporal response of the Mycobacterium tuberculosis gene regulatory network during growth arrest. Mol Syst Biol 2008; 4:225. [PMID: 18985025 PMCID: PMC2600667 DOI: 10.1038/msb.2008.63] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2008] [Accepted: 09/19/2008] [Indexed: 12/13/2022] Open
Abstract
The virulence of Mycobacterium tuberculosis depends on the ability of the bacilli to switch between replicative (growth) and non-replicative (dormancy) states in response to host immunity. However, the gene regulatory events associated with transition to dormancy are largely unknown. To address this question, we have assembled the largest M. tuberculosis transcriptional-regulatory network to date, and characterized the temporal response of this network during adaptation to stationary phase and hypoxia, using published microarray data. Distinct sets of transcriptional subnetworks (origons) were responsive at various stages of adaptation, showing a gradual progression of network response under both conditions. Most of the responsive origons were in common between the two conditions and may help define a general transcriptional signature of M. tuberculosis growth arrest. These results open the door for a systems-level understanding of transition to non-replicative persistence, a phenotypic state that prevents sterilization of infection by the host immune response and promotes the establishment of latent M. tuberculosis infection, a condition found in two billion people worldwide.
Affiliation(s)
- Gábor Balázsi
- Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77054, USA.
|
190
|
Eijssen LMT, Lindsey PJ, Peeters R, Westra RL, van Eijsden RGE, Bolotin-Fukuhara M, Smeets HJM, Vlietinck RFM. A novel stepwise analysis procedure of genome-wide expression profiles identifies transcript signatures of thiamine genes as classifiers of mitochondrial mutants. Yeast 2008; 25:129-40. [PMID: 18081196 DOI: 10.1002/yea.1573] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
To extract functional information on genes and processes from large expression datasets, analysis methods are required that can computationally deal with these amounts of data, are tunable to specific research questions, and construct classifiers that are not overspecific to the dataset at hand. To satisfy these requirements, a stepwise procedure that combines elements from principal component analysis and discriminant analysis, was developed to specifically retrieve genes involved in processes of interest and classify samples based upon those genes. In a global expression dataset of 300 gene knock-outs in Saccharomyces cerevisiae, the procedure successfully classified samples with similar 'cellular component' Gene Ontology annotations of the knock-out gene by expression signatures of limited numbers of genes. The genes discriminating 'mitochondrion' from the other subgroups were evaluated in more detail. The thiamine pathway turned out to be one of the processes involved and was successfully evaluated in a logistic model to predict whether yeast knock-outs were mitochondrial or not. Further, this pathway is biologically related to the mitochondrial system. Hence, this strongly indicates that our approach is effective and efficient in extracting meaningful information from large microarray experiments and assigning functions to yet uncharacterized genes.
Affiliation(s)
- L M T Eijssen
- Department of Genetics and Cell Biology, Maastricht University, PO Box 616, 6200 MD Maastricht, The Netherlands.
|
191
|
Calcott B, Balcan D, Hohenlohe PA. A publish-subscribe model of genetic networks. PLoS One 2008; 3:e3245. [PMID: 18802467 PMCID: PMC2531231 DOI: 10.1371/journal.pone.0003245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2008] [Accepted: 08/27/2008] [Indexed: 11/19/2022] Open
Abstract
We present a simple model of genetic regulatory networks in which regulatory connections among genes are mediated by a limited number of signaling molecules. Each gene in our model produces (publishes) a single gene product, which regulates the expression of other genes by binding to regulatory regions that correspond (subscribe) to that product. We explore the consequences of this publish-subscribe model of regulation for the properties of single networks and for the evolution of populations of networks. Degree distributions of randomly constructed networks, particularly multimodal in-degree distributions, which depend on the length of the regulatory sequences and the number of possible gene products, differed from simpler Boolean NK models. In simulated evolution of populations of networks, single mutations in regulatory or coding regions resulted in multiple changes in regulatory connections among genes, or alternatively in neutral change that had no effect on phenotype. This resulted in remarkable evolvability in both number and length of attractors, leading to evolved networks far beyond the expectation of these measures based on random distributions. Surprisingly, this rapid evolution was not accompanied by changes in degree distribution; degree distribution in the evolved networks was not substantially different from that of randomly generated networks. The publish-subscribe model also allows exogenous gene products to create an environment, which may be noisy or stable, in which dynamic behavior occurs. In simulations, networks were able to evolve moderate levels of both mutational and environmental robustness.
Affiliation(s)
- Brett Calcott
- Philosophy Program, RSSS, Australian National University, Canberra, Australia
- Centre for Macroevolution and Macroecology, Australian National University, Canberra, Australia
- Duygu Balcan
- School of Informatics, Indiana University, Bloomington, Indiana, United States of America
- Paul A. Hohenlohe
- Department of Zoology, Oregon State University, Corvallis, Oregon, United States of America
- Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, Oregon, United States of America
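The wiring rule described in the abstract, where each gene publishes one product and regulatory regions subscribe to products, can be sketched in a few lines of Python. The representation below (integer product IDs, one set of subscribed products per gene) is an assumption chosen for brevity and is not taken from the paper's implementation.

```python
def build_edges(products, subscriptions):
    """Return the set of (publisher, subscriber) regulatory edges.

    products[i]      : the single product published by gene i
    subscriptions[i] : the products that gene i's regulatory region responds to
    """
    edges = set()
    for subscriber, wanted in enumerate(subscriptions):
        for publisher, product in enumerate(products):
            if product in wanted:
                edges.add((publisher, subscriber))
    return edges


if __name__ == "__main__":
    products = [0, 1, 0]
    subscriptions = [{1}, {0}, {0, 1}]
    before = build_edges(products, subscriptions)

    # A single "coding" mutation: gene 2 now publishes product 1 instead of 0.
    mutated = products[:]
    mutated[2] = 1
    after = build_edges(mutated, subscriptions)

    # Edges gained or lost because of that one mutation.
    print(sorted(before ^ after))
```

In this toy case one mutation removes the edge from gene 2 to gene 1 and adds an edge from gene 2 to gene 0, which mirrors the abstract's observation that single mutations can change several regulatory connections at once.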
|
192
|
Dadon Z, Wagner N, Ashkenasy G. The Road to Non-Enzymatic Molecular Networks. Angew Chem Int Ed Engl 2008; 47:6128-36. [DOI: 10.1002/anie.200702552] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
|
193
|
Dadon Z, Wagner N, Ashkenasy G. Der Weg zu nichtenzymatischen molekularen Netzwerken [The Road to Non-Enzymatic Molecular Networks]. Angew Chem Int Ed Engl 2008. [DOI: 10.1002/ange.200702552] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
194
|
Recovering genetic regulatory networks from chromatin immunoprecipitation and steady-state microarray data. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2008:248747. [PMID: 18584039 PMCID: PMC3171391 DOI: 10.1155/2008/248747] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/28/2007] [Accepted: 05/20/2008] [Indexed: 01/29/2023]
Abstract
Recent advances in high-throughput DNA microarrays and chromatin immunoprecipitation (ChIP) assays have enabled the learning of the structure and functionality of genetic regulatory networks. In light of these heterogeneous data sets, this paper proposes a novel approach for the reconstruction of genetic regulatory networks based on the posterior probabilities of gene regulations. Built within the framework of Bayesian statistics and computational Monte Carlo techniques, the proposed approach avoids the dichotomy of classifying gene interactions as either connected or disconnected, thereby significantly reducing inference errors. Simulation results corroborate the superior performance of the proposed approach relative to existing state-of-the-art algorithms. A genetic regulatory network for Saccharomyces cerevisiae is inferred from published real data sets, and biologically meaningful results are discussed.
|
195
|
Swindell WR. Genes regulated by caloric restriction have unique roles within transcriptional networks. Mech Ageing Dev 2008; 129:580-92. [PMID: 18634819 DOI: 10.1016/j.mad.2008.06.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2008] [Revised: 06/09/2008] [Accepted: 06/15/2008] [Indexed: 02/06/2023]
Abstract
Caloric restriction (CR) has received much interest as an intervention that delays age-related disease and increases lifespan. Whole-genome microarrays have been used to identify specific genes underlying these effects, and in mice, this has led to the identification of genes with expression responses to CR that are shared across multiple tissue types. Such CR-regulated genes represent strong candidates for future investigation, but have been understood only as a list, without regard to their broader role within transcriptional networks. In this study, co-expression and network properties of CR-regulated genes were investigated using data generated by more than 600 Affymetrix microarrays. This analysis identified groups of co-expressed genes and regulatory factors associated with the mammalian CR response, and uncovered surprising network properties of CR-regulated genes. Genes downregulated by CR were highly connected and located in dense network regions. In contrast, CR-upregulated genes were weakly connected and positioned in sparse network regions. Some network properties were mirrored by CR-regulated genes from invertebrate models, suggesting an evolutionary basis for the observed patterns. These findings contribute to a systems-level picture of how CR influences transcription within mammalian cells, and point towards a comprehensive understanding of CR in terms of its influence on biological networks.
Affiliation(s)
- William R Swindell
- Department of Pathology, University of Michigan, Ann Arbor, MI 48109-2200, USA.
|
196
|
Veber P, Guziolowski C, Le Borgne M, Radulescu O, Siegel A. Inferring the role of transcription factors in regulatory networks. BMC Bioinformatics 2008; 9:228. [PMID: 18460200 PMCID: PMC2422845 DOI: 10.1186/1471-2105-9-228] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2007] [Accepted: 05/06/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Expression profiles obtained from multiple perturbation experiments are increasingly used to reconstruct transcriptional regulatory networks, from well-studied, simple organisms up to higher eukaryotes. Admittedly, a key ingredient in developing a reconstruction method is its ability to integrate heterogeneous sources of information, as well as to comply with practical observability issues: measurements can be scarce or noisy. In this work, we show how to combine a network of genetic regulations with a set of expression profiles in order to infer the functional effect of the regulations, as inducer or repressor. Our approach is based on a consistency rule between a network and the signs of variation given by expression arrays. RESULTS We evaluate our approach in several settings of increasing complexity. First, we generate artificial expression data on a transcriptional network of E. coli extracted from the literature (1529 nodes and 3802 edges), and we estimate that 30% of the regulations can be annotated with about 30 profiles. We additionally prove that at most 40.8% of the network can be inferred using our approach. Second, we use this network in order to validate the predictions obtained with a compendium of real expression profiles. We describe a filtering algorithm that generates particularly reliable predictions. Finally, we apply our inference approach to the S. cerevisiae transcriptional network (2419 nodes and 4344 interactions) by combining ChIP-chip data and 15 expression profiles. We are able to detect and isolate inconsistencies between the expression profiles and a significant portion of the model (15% of all the interactions). In addition, we report predictions for 14.5% of all interactions. CONCLUSION Our approach requires neither accurate expression levels nor time series. Nevertheless, we show on both real and artificial data that a relatively small number of perturbation experiments is enough to determine a significant portion of regulatory effects. This is a key practical asset compared to statistical methods for network reconstruction. We demonstrate that our approach is able to provide accurate predictions, even when the network is incomplete and the data are noisy.
Affiliation(s)
- Philippe Veber
- Centre INRIA Rennes Bretagne Atlantique, IRISA, Rennes, France.
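The abstract does not spell out the consistency rule, so the following Python sketch rests on one simple reading, stated here as an assumption: the observed sign of variation of a gene must be explained by at least one of its regulators, so for a gene with a single known regulator the edge sign is forced to be the product of the two observed signs. The function name, the data layout and the gene names are hypothetical.

```python
def infer_edge_signs(regulators, profiles):
    """Infer +1 (inducer) or -1 (repressor) for edges into single-regulator targets.

    regulators: dict target -> list of its regulators
    profiles:   list of dicts gene -> observed sign of variation (+1 or -1)
    Returns (inferred, conflicts): edges with a unique sign, and edges whose
    required sign differs between profiles.
    """
    inferred, conflicts = {}, set()
    for target, regs in regulators.items():
        if len(regs) != 1:
            continue  # several regulators: this simple rule cannot decide
        edge = (regs[0], target)
        for profile in profiles:
            if regs[0] not in profile or target not in profile:
                continue  # unobserved genes carry no information
            sign = profile[regs[0]] * profile[target]
            if inferred.get(edge, sign) != sign:
                conflicts.add(edge)
            inferred[edge] = sign
    for edge in conflicts:
        del inferred[edge]
    return inferred, conflicts


if __name__ == "__main__":
    network = {"geneA": ["tfA"], "geneB": ["tfB", "tfC"], "geneD": ["tfD"]}
    arrays = [
        {"tfA": 1, "geneA": -1, "tfD": 1, "geneD": 1},
        {"tfA": -1, "geneA": 1, "tfD": 1, "geneD": -1},
    ]
    print(infer_edge_signs(network, arrays))
```

Only signs of variation are used, never expression magnitudes, which matches the abstract's remark that accurate expression levels are not required. Edges listed in `conflicts` correspond to the kind of inconsistency between network and data that the abstract says can be detected and isolated.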
|
197
|
Isalan M, Lemerle C, Michalodimitrakis K, Horn C, Beltrao P, Raineri E, Garriga-Canut M, Serrano L. Evolvability and hierarchy in rewired bacterial gene networks. Nature 2008; 452:840-5. [PMID: 18421347 PMCID: PMC2666274 DOI: 10.1038/nature06847] [Citation(s) in RCA: 240] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2007] [Accepted: 02/22/2008] [Indexed: 11/09/2022]
Abstract
Sequencing DNA from several organisms has revealed that duplication and drift of existing genes have primarily moulded the contents of a given genome. Though the effect of knocking out or overexpressing a particular gene has been studied in many organisms, no study has systematically explored the effect of adding new links in a biological network. To explore network evolvability, we constructed 598 recombinations of promoters (including regulatory regions) with different transcription or sigma-factor genes in Escherichia coli, added over a wild-type genetic background. Here we show that approximately 95% of new networks are tolerated by the bacteria, that very few alter growth, and that expression level correlates with factor position in the wild-type network hierarchy. Most importantly, we find that certain networks consistently survive over the wild type under various selection pressures. Therefore new links in the network are rarely a barrier for evolution and can even confer a fitness advantage.
Affiliation(s)
- Mark Isalan
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), UPF, 08003 Barcelona, Spain.
|
198
|
Pieroni E, de la Fuente van Bentem S, Mancosu G, Capobianco E, Hirt H, de la Fuente A. Protein networking: insights into global functional organization of proteomes. Proteomics 2008; 8:799-816. [PMID: 18297653 DOI: 10.1002/pmic.200700767] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
The formulation of network models from global protein studies is essential to understand the functioning of organisms. Network models of the proteome enable the application of Complex Network Analysis, a quantitative framework to investigate large complex networks using techniques from graph theory, statistical physics, dynamical systems and other fields. This approach has provided many insights into the functional organization of the proteome so far and will likely continue to do so. Currently, several network concepts have emerged in the field of proteomics. It is important to highlight the differences between these concepts, since different representations allow different insights into functional organization. One such concept is the protein interaction network, which contains proteins as nodes and undirected edges representing the occurrence of binding in large-scale protein-protein interaction studies. A second concept is the protein-signaling network, in which the nodes correspond to levels of post-translationally modified forms of proteins and directed edges to causal effects through post-translational modification, such as phosphorylation. Several other network concepts were introduced for proteomics. Although all formulated as networks, the concepts represent widely different physical systems. Therefore caution should be taken when applying relevant topological analysis. We review recent literature formulating and analyzing such networks.
Affiliation(s)
- Enrico Pieroni
- CRS4 Bioinformatica, c/o Parco Tecnologico POLARIS, Pula, Italy
|
199
|
Lozada-Chávez I, Angarica VE, Collado-Vides J, Contreras-Moreira B. The role of DNA-binding specificity in the evolution of bacterial regulatory networks. J Mol Biol 2008; 379:627-43. [PMID: 18466918 DOI: 10.1016/j.jmb.2008.04.008] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2008] [Accepted: 04/02/2008] [Indexed: 11/25/2022]
Abstract
Understanding the mechanisms by which transcriptional regulatory networks (TRNs) change through evolution is a fundamental problem. Here, we analyze this question using data from Escherichia coli and Bacillus subtilis, and find that paralogy relationships are insufficient to explain the global or local role observed for transcription factors (TFs) within regulatory networks. Our results provide a picture in which DNA-binding specificity, a molecular property that can be measured in different ways, is a predictor of the role of transcription factors. In particular, we observe that global regulators consistently display low levels of binding specificity, while displaying comparatively higher expression values in microarray experiments. In addition, we find a strong negative correlation between binding specificity and the number of co-regulators that help coordinate genetic expression on a genomic scale. A close look at several orthologous TFs, including FNR, a regulator found to be global in E. coli and local in B. subtilis, confirms the diagnostic value of specificity in order to understand their regulatory function, and highlights the importance of evaluating the metabolic and ecological relevance of effectors as another variable in the evolutionary equation of regulatory networks. Finally, a general model is presented that integrates some evolutionary forces and molecular properties, aiming to explain how regulons grow and shrink as bacteria tune their regulation to increase adaptation.
Affiliation(s)
- Irma Lozada-Chávez
- Programa de Genómica Computacional, Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Av. Universidad s/n, Cuernavaca, 62210 Morelos, México.
|
200
|
Zhao W, Serpedin E, Dougherty ER. Inferring connectivity of genetic regulatory networks using information-theoretic criteria. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2008; 5:262-274. [PMID: 18451435 DOI: 10.1109/tcbb.2007.1067] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
Recently, the concept of mutual information has been proposed for inferring the structure of genetic regulatory networks from gene expression profiling. After analyzing the limitations of mutual information in inferring gene-to-gene interactions, this paper introduces the concept of conditional mutual information and, based on it, proposes two novel algorithms to infer the connectivity structure of genetic regulatory networks. One of the proposed algorithms exhibits better accuracy, while the other excels in simplicity and flexibility. By exploiting the mutual information and conditional mutual information, a practical metric is also proposed to assess the likelihood of direct connectivity between genes. This novel metric resolves a common limitation of current inference algorithms, namely that gene connectivity is established only as a dichotomy of being either connected or disconnected. On data sets generated by synthetic networks, the proposed algorithms compare favorably with existing state-of-the-art schemes. The proposed algorithms are also applied to realistic biological measurements, such as the cutaneous melanoma data set, and biologically meaningful results are inferred.
Affiliation(s)
- Wentao Zhao
- Department of Electrical and Computer Engineering, Texas A & M University, College Station, TX 77843-3128, USA.
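To make the contrast between mutual information and conditional mutual information concrete, here is a minimal Python sketch that estimates both from discretized samples using plug-in frequency counts. It shows only the standard definitions; the function names and the toy data are assumptions, and the paper's two inference algorithms and its connectivity metric are not reproduced.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(c / n * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X;Y|Z) in bits from discrete sample triples."""
    n = len(xs)
    pxyz = Counter(zip(xs, ys, zs))
    pxz, pyz, pz = Counter(zip(xs, zs)), Counter(zip(ys, zs)), Counter(zs)
    return sum(
        c / n * log2(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
        for (x, y, z), c in pxyz.items()
    )


if __name__ == "__main__":
    # Toy case: gene Z drives both X and Y, so X and Y look related
    # until the common regulator is conditioned on.
    z = [0, 0, 1, 1]
    x = list(z)
    y = list(z)
    print(mutual_information(x, y))
    print(conditional_mutual_information(x, y, z))
```

In the toy case the pairwise mutual information between X and Y is high, while the conditional value given Z drops to zero, which is the kind of indirect link that a pairwise criterion alone cannot rule out.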
|