151
|
Cases I, de Lorenzo V. Promoters in the environment: transcriptional regulation in its natural context. Nat Rev Microbiol 2005; 3:105-18. [PMID: 15685222 DOI: 10.1038/nrmicro1084] [Citation(s) in RCA: 167] [Impact Index Per Article: 8.8] [Indexed: 11/09/2022]
Abstract
Transcriptional activation of many bacterial promoters in their natural environment is not a simple on/off decision. The expression of cognate genes is integrated in layers of iterative regulatory networks that ensure the performance not only of the whole cell, but also of the bacterial population, and even the microbial community, in a changing environment. Unlike in vitro systems, where transcription initiation can be recreated with a handful of essential components, in vivo, promoters must process various physicochemical and metabolic signals to determine their output. This helps to achieve optimal bacterial fitness in extremely competitive niches. Promoters therefore merge specific responses to distinct signals with inclusive reactions to more general environmental changes.
Affiliation(s)
- Ildefonso Cases
- Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas, Campus de Cantoblanco, 28049 Madrid, Spain
|
152
|
Jacques PE, Gervais AL, Cantin M, Lucier JF, Dallaire G, Drouin G, Gaudreau L, Goulet J, Brzezinski R. MtbRegList, a database dedicated to the analysis of transcriptional regulation in Mycobacterium tuberculosis. Bioinformatics 2005; 21:2563-5. [PMID: 15722376 DOI: 10.1093/bioinformatics/bti321] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022] Open
Abstract
MtbRegList is a database dedicated to the analysis of gene expression and regulation data in Mycobacterium tuberculosis. It is designed to contain predicted and characterized regulatory DNA motifs cross-referenced with corresponding transcription factor(s), and experimentally identified transcription start sites. MtbRegList can also handle flexible and complex genomic search requests, besides having a noteworthy browsing capability. AVAILABILITY: MtbRegList is freely available at http://www.USherbrooke.ca/vers/MtbRegList
|
153
|
Liu M, Durfee T, Cabrera JE, Zhao K, Jin DJ, Blattner FR. Global transcriptional programs reveal a carbon source foraging strategy by Escherichia coli. J Biol Chem 2005; 280:15921-7. [PMID: 15705577 DOI: 10.1074/jbc.m414050200] [Citation(s) in RCA: 142] [Impact Index Per Article: 7.5] [Indexed: 11/06/2022] Open
Abstract
By exploring global gene expression of Escherichia coli growing on six different carbon sources, we discovered a striking genome transcription pattern: as carbon substrate quality declines, cells systematically increase the number of genes expressed. Gene induction occurs in a hierarchical manner and includes many factors for uptake and metabolism of better but currently unavailable carbon sources. Concomitantly, cells also increase their motility. Thus, as the growth potential of the environment decreases, cells appear to devote progressively more energy to the mere possibility of improving conditions. This adaptation is not what would be predicted by classic regulatory models alone. We also observe an inverse correlation between gene activation and rRNA synthesis, suggesting that reapportioning RNA polymerase (RNAP) contributes to the expanded genome activation. Significant differences in RNAP distribution in vivo, monitored using an RNAP-green fluorescent protein fusion, between energy-rich and energy-poor carbon source cultures support this hypothesis. Together, these findings represent the integration of both substrate-specific and global regulatory systems, and may be a bacterial approximation to metazoan risk-prone foraging behavior.
Affiliation(s)
- Mingzhu Liu
- Department of Genetics, University of Wisconsin, Madison, Wisconsin 53706, USA
|
154
|
Salmon KA, Hung SP, Steffen NR, Krupp R, Baldi P, Hatfield GW, Gunsalus RP. Global gene expression profiling in Escherichia coli K12: effects of oxygen availability and ArcA. J Biol Chem 2005; 280:15084-96. [PMID: 15699038 DOI: 10.1074/jbc.m414030200] [Citation(s) in RCA: 159] [Impact Index Per Article: 8.4] [Indexed: 11/06/2022] Open
Abstract
The ArcAB two-component system of Escherichia coli regulates the aerobic/anaerobic expression of genes that encode respiratory proteins whose synthesis is coordinated during aerobic/anaerobic cell growth. A genomic study of E. coli was undertaken to identify other potential targets of oxygen and ArcA regulation. A group of 175 genes generated from this study and our previous study on oxygen regulation (Salmon, K., Hung, S. P., Mekjian, K., Baldi, P., Hatfield, G. W., and Gunsalus, R. P. (2003) J. Biol. Chem. 278, 29837-29855), called our gold standard gene set, have p values <0.00013 and a posterior probability of differential expression value of 0.99. These 175 genes clustered into eight expression patterns and represent genes involved in a large number of cell processes, including small molecule biosynthesis, macromolecular synthesis, and aerobic/anaerobic respiration and fermentation. In addition, 119 of these 175 genes were also identified in our previous study of the fnr allele. A MEME/weight matrix method was used to identify a new putative ArcA-binding site for all genes of the E. coli genome. 16 new sites were identified upstream of genes in our gold standard set. The strict statistical analyses that we have performed on our data allow us to predict that 1139 genes in the E. coli genome are regulated either directly or indirectly by the ArcA protein with a 99% confidence level.
Affiliation(s)
- Kirsty A Salmon
- Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, California 90095-1489, USA
|
155
|
Sneppen K, Dodd IB, Shearwin KE, Palmer AC, Schubert RA, Callen BP, Egan JB. A Mathematical Model for Transcriptional Interference by RNA Polymerase Traffic in Escherichia coli. J Mol Biol 2005; 346:399-409. [PMID: 15670592 DOI: 10.1016/j.jmb.2004.11.075] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.9] [Received: 09/22/2004] [Revised: 11/25/2004] [Accepted: 11/29/2004] [Indexed: 11/17/2022]
Abstract
Interactions between RNA polymerases (RNAP) resulting from tandem or convergent arrangements of promoters can cause transcriptional interference, often with important consequences for gene expression. However, it is not known what factors determine the magnitude of interference and which mechanisms are likely to predominate in any situation. We therefore developed a mathematical model incorporating three mechanisms of transcriptional interference in bacteria: occlusion (in which passing RNAPs block access to the promoter), collisions between elongating RNAPs, and "sitting duck" interference (in which RNAP complexes waiting to fire at the promoter are removed by passing RNAP). The predictions of the model are in good agreement with a recent quantitative in vivo study of convergent promoters in E.coli. Our analysis predicts that strong occlusion requires the interfering promoter to be very strong. Collisions can also produce strong interference but only if the interfering promoter is very strong or if the convergent promoters are far apart (>200 bp). For moderate strength interfering promoters and short inter-promoter distances, strong interference is dependent on the sitting duck mechanism. Sitting duck interference is dependent on the relative strengths of the two promoters. However, it is also dependent on the "aspect ratio" (the relative rates of RNAP binding and firing) of the sensitive promoter, allowing promoters of equal strength to have very different sensitivities to transcriptional interference. The model provides a framework for using transcriptional interference to investigate various dynamic processes on DNA in vivo.
Affiliation(s)
- Kim Sneppen
- NORDITA, Nordic Institute for Theoretical Physics, Niels Bohr Institute, Blegdamsvej 17, DK-2100 Copenhagen, Denmark.
|
156
|
Zhang Z, Gosset G, Barabote R, Gonzalez CS, Cuevas WA, Saier MH. Functional interactions between the carbon and iron utilization regulators, Crp and Fur, in Escherichia coli. J Bacteriol 2005; 187:980-90. [PMID: 15659676 PMCID: PMC545712 DOI: 10.1128/jb.187.3.980-990.2005] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.5] [Received: 07/26/2004] [Accepted: 10/26/2004] [Indexed: 11/20/2022] Open
Abstract
In Escherichia coli, the ferric uptake regulator (Fur) controls expression of the iron regulon in response to iron availability while the cyclic AMP receptor protein (Crp) regulates expression of the carbon regulon in response to carbon availability. We here identify genes subject to significant changes in expression level in response to the loss of both Fur and Crp. Many iron transport genes and several carbon metabolic genes are subject to dual control, being repressed by the loss of Crp and activated by the loss of Fur. However, the sodB gene, encoding superoxide dismutase, and the aceBAK operon, encoding the glyoxylate shunt enzymes, show the opposite responses, being activated by the loss of Crp and repressed by the loss of Fur. Several other genes including the sdhA-D, sucA-D, and fumA genes, encoding key constituents of the Krebs cycle, proved to be repressed by the loss of both transcription factors. Finally, the loss of both Crp and Fur activated a heterogeneous group of genes under sigmaS control encoding, for example, the cyclopropane fatty acid synthase, Cfa, the glycogen synthesis protein, GlgS, the 30S ribosomal protein, S22, and the mechanosensitive channel protein, YggB. Many genes appeared to be regulated by the two transcription factors in an apparently additive fashion, but apparent positive or negative cooperativity characterized several putative Crp/Fur interactions. Relevant published data were evaluated, putative Crp and Fur binding sites were identified, and representative results were confirmed by real-time PCR. Molecular explanations for some, but not all, of these effects are provided.
MESH Headings
- Bacterial Proteins/genetics
- Bacterial Proteins/metabolism
- Base Sequence
- Binding Sites
- Carbon/metabolism
- Cyclic AMP Receptor Protein
- DNA, Bacterial/chemistry
- DNA, Bacterial/genetics
- DNA, Bacterial/metabolism
- Escherichia coli/genetics
- Escherichia coli/growth & development
- Escherichia coli/metabolism
- Escherichia coli Proteins/genetics
- Escherichia coli Proteins/metabolism
- Gene Expression Regulation, Bacterial
- Gene Expression Regulation, Enzymologic
- Glucose/metabolism
- Iron/metabolism
- Kinetics
- Nucleic Acid Hybridization
- Phenotype
- Polymerase Chain Reaction
- RNA, Bacterial/genetics
- RNA, Bacterial/isolation & purification
- Receptors, Cell Surface/genetics
- Receptors, Cell Surface/metabolism
- Regulatory Sequences, Nucleic Acid
- Repressor Proteins/genetics
- Repressor Proteins/metabolism
- Transcription Factors/genetics
- Transcription Factors/metabolism
Affiliation(s)
- Zhongge Zhang
- Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, USA
|
157
|
Tan K, McCue LA, Stormo GD. Making connections between novel transcription factors and their DNA motifs. Genome Res 2005; 15:312-20. [PMID: 15653829 PMCID: PMC546533 DOI: 10.1101/gr.3069205] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
Abstract
The key components of a transcriptional regulatory network are the connections between trans-acting transcription factors and cis-acting DNA-binding sites. In spite of several decades of intense research, only a fraction of the estimated approximately 300 transcription factors in Escherichia coli have been linked to some of their binding sites in the genome. In this paper, we present a computational method to connect novel transcription factors and DNA motifs in E. coli. Our method uses three types of mutually independent information, two of which are gleaned by comparative analysis of multiple genomes and the third one derived from similarities of transcription-factor-DNA-binding-site interactions. The different types of information are combined to calculate the probability of a given transcription-factor-DNA-motif pair being a true pair. Tested on a study set of transcription factors and their DNA motifs, our method has a prediction accuracy of 59% for the top predictions and 85% for the top three predictions. When applied to 99 novel transcription factors and 70 novel DNA motifs, our method predicted 64 transcription-factor-DNA-motif pairs. Supporting evidence for some of the predicted pairs is presented. Functional annotations are made for 23 novel transcription factors based on the predicted transcription-factor-DNA-motif connections.
Affiliation(s)
- Kai Tan
- Department of Genetics, Washington University School of Medicine, Saint Louis, Missouri 63110, USA
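The abstract of Tan et al. above describes combining three mutually independent types of information into a probability that a transcription-factor/DNA-motif pair is a true pair. The abstract does not give the formula, so the following is only a minimal sketch of one common way to combine independent evidence (multiplying likelihood ratios onto prior odds). The function name, the prior, and the likelihood ratios are hypothetical and are not taken from the paper.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence sources into a posterior probability.

    prior: prior probability that the pair is a true pair (0 < prior < 1).
    likelihood_ratios: one ratio P(evidence | true) / P(evidence | false)
    per evidence source; independence of the sources is assumed.
    """
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    odds = prior / (1 - prior)          # convert probability to odds
    for ratio in likelihood_ratios:
        odds *= ratio                   # independent evidence multiplies odds
    return odds / (1 + odds)            # convert odds back to probability


# Hypothetical example: three evidence sources with made-up ratios.
example = posterior_probability(0.5, [2.0, 3.0, 0.5])
print(example)
```

Working in odds keeps each evidence source as a single multiplicative factor, which is why the independence assumption matters: correlated sources would be counted twice.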
|
158
|
Fischer HP. Towards quantitative biology: integration of biological information to elucidate disease pathways and to guide drug discovery. Biotechnology Annual Review 2005; 11:1-68. [PMID: 16216773 DOI: 10.1016/s1387-2656(05)11001-1] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Indexed: 01/09/2023]
Abstract
Developing a new drug is a tedious and expensive undertaking. The recently developed high-throughput experimental technologies, summarised by the terms genomics, transcriptomics, proteomics and metabolomics, provide for the first time the means to comprehensively monitor the molecular level of disease processes. The "-omics" technologies facilitate the systematic characterisation of a drug target's physiology, thereby helping to reduce the typically high attrition rates in discovery projects, and improving the overall efficiency of pharmaceutical research processes. Currently, the bottleneck for taking full advantage of the new experimental technologies is the rapidly growing volume of automatically produced biological data. A lack of scalable database systems and computational tools for target discovery has been recognised as a major hurdle. In this review, an overview will be given on recent progress in computational biology that has an impact on drug discovery applications. The focus will be on novel in silico methods to reconstruct regulatory networks, signalling cascades, and metabolic pathways, with an emphasis on comparative genomics and microarray-based approaches. Promising methods, such as the mathematical simulation of pathway dynamics, are discussed in the context of applications in discovery projects. The review concludes by exemplifying concrete data-driven studies in pharmaceutical research that demonstrate the value of integrated computational systems for drug target identification and validation, screening assay development, as well as drug candidate efficacy and toxicity evaluations.
|
159
|
Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA. Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol 2004; 14:283-91. [PMID: 15193307 DOI: 10.1016/j.sbi.2004.05.004] [Citation(s) in RCA: 462] [Impact Index Per Article: 23.1] [Indexed: 01/08/2023]
Abstract
The regulatory interactions between transcription factors and their target genes can be conceptualised as a directed graph. At a global level, these regulatory networks display a scale-free topology, indicating the presence of regulatory hubs. At a local level, substructures such as motifs and modules can be discerned in these networks. Despite the general organisational similarity of networks across the phylogenetic spectrum, there are interesting qualitative differences among the network components, such as the transcription factors. Although the DNA-binding domains of the transcription factors encoded by a given organism are drawn from a small set of ancient conserved superfamilies, their relative abundance often shows dramatic variation among different phylogenetic groups. Large portions of these networks appear to have evolved through extensive duplication of transcription factors and targets, often with inheritance of regulatory interactions from the ancestral gene. Interactions are conserved to varying degrees among genomes. Insights from the structure and evolution of these networks can be translated into predictions and used for engineering of the regulatory networks of different organisms.
Affiliation(s)
- M Madan Babu
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK.
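The abstract of Babu et al. above treats regulatory interactions as a directed graph in which some regulators act as hubs. As a minimal sketch of that idea, the snippet below stores regulator-to-target edges and reports regulators whose number of targets reaches a threshold. The edge list, the node names, and the threshold are hypothetical illustrations, not data or criteria from the review.

```python
from collections import defaultdict

# Hypothetical toy edges (regulator -> target); not data from the review.
edges = [
    ("tfA", "gene1"), ("tfA", "gene2"), ("tfA", "tfB"),
    ("tfB", "gene2"), ("tfB", "gene3"),
    ("tfC", "gene3"),
]


def out_degrees(edge_list):
    """Count how many distinct targets each regulator controls."""
    targets = defaultdict(set)
    for regulator, target in edge_list:
        targets[regulator].add(target)
    return {regulator: len(found) for regulator, found in targets.items()}


def hubs(edge_list, min_targets):
    """Return regulators with at least min_targets targets, sorted by name."""
    degrees = out_degrees(edge_list)
    return sorted(reg for reg, degree in degrees.items() if degree >= min_targets)


print(out_degrees(edges))
print(hubs(edges, 3))
```

In a real network the threshold would be chosen from the observed degree distribution rather than fixed in advance.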
|
160
|
Hierarchical structure and modules in the Escherichia coli transcriptional regulatory network revealed by a new top-down approach. BMC Bioinformatics 2004; 5:199. [PMID: 15603590 PMCID: PMC544888 DOI: 10.1186/1471-2105-5-199] [Citation(s) in RCA: 108] [Impact Index Per Article: 5.4] [Received: 07/28/2004] [Accepted: 12/16/2004] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Cellular functions are coordinately carried out by groups of genes forming functional modules. Identifying such modules in the transcriptional regulatory network (TRN) of organisms is important for understanding the structure and function of these fundamental cellular networks and essential for the emerging modular biology. So far, the global connectivity structure of TRN has not been well studied and consequently not applied for the identification of functional modules. Moreover, network motifs such as feed forward loop are recently proposed to be basic building blocks of TRN. However, their relationship to functional modules is not clear. RESULTS In this work we proposed a top-down approach to identify modules in the TRN of E. coli. By studying the global connectivity structure of the regulatory network, we first revealed a five-layer hierarchical structure in which all the regulatory relationships are downward. Based on this regulatory hierarchy, we developed a new method to decompose the regulatory network into functional modules and to identify global regulators governing multiple modules. As a result, 10 global regulators and 39 modules were identified and shown to have well defined functions. We then investigated the distribution and composition of the two basic network motifs (feed forward loop and bi-fan motif) in the hierarchical structure of TRN. We found that most of these network motifs include global regulators, indicating that these motifs are not basic building blocks of modules since modules should not contain global regulators. CONCLUSION The transcriptional regulatory network of E. coli possesses a multi-layer hierarchical modular structure without feedback regulation at transcription level. This hierarchical structure builds the basis for a new and simple decomposition method which is suitable for the identification of functional modules and global regulators in the transcriptional regulatory network of E. coli. 
Analysis of the distribution of feed forward loops and bi-fan motifs in the hierarchical structure suggests that these network motifs are not elementary building blocks of functional modules in the transcriptional regulatory network of E. coli.
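The abstract above reports a layered hierarchy in which all regulatory relationships point downward. The paper's exact procedure is not given in the abstract, so the following is only a sketch of one simple way to obtain such a layering, assuming the network has no cycles (autoregulatory self-loops would have to be removed first): nodes that regulate nothing sit in layer 0, and every other node sits one layer above its highest target. The function name and the toy network are hypothetical.

```python
def assign_layers(targets_of):
    """Assign each node a layer so that every regulatory edge points downward.

    targets_of maps a regulator to the set of nodes it regulates.
    Raises ValueError if the network contains a cycle.
    """
    layers = {}

    def layer(node, visiting=()):
        if node in layers:
            return layers[node]
        if node in visiting:
            raise ValueError(f"cycle detected at {node}")
        children = targets_of.get(node, set())
        if not children:
            value = 0  # regulates nothing: bottom layer
        else:
            # one layer above the highest-placed target
            value = 1 + max(layer(child, visiting + (node,)) for child in children)
        layers[node] = value
        return value

    nodes = set(targets_of) | {t for found in targets_of.values() for t in found}
    for node in nodes:
        layer(node)
    return layers


# Hypothetical toy network: a global regulator above a local one.
toy = {"global_tf": {"local_tf", "geneA"}, "local_tf": {"geneB"}}
print(assign_layers(toy))
```

With this layering, a regulator's layer is always strictly greater than the layer of anything it regulates, which is the "downward only" property the abstract describes.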
|
161
|
Ma HW, Kumar B, Ditges U, Gunzer F, Buer J, Zeng AP. An extended transcriptional regulatory network of Escherichia coli and analysis of its hierarchical structure and network motifs. Nucleic Acids Res 2004; 32:6643-9. [PMID: 15604458 PMCID: PMC545451 DOI: 10.1093/nar/gkh1009] [Citation(s) in RCA: 132] [Impact Index Per Article: 6.6] [Indexed: 11/14/2022] Open
Abstract
Recent studies of genome-wide transcriptional regulatory network (TRN) revealed several intriguing structural and dynamic features of gene expression at a system level. Unfortunately, the network under study is often far from complete. A critical question is thus how much the network is incomplete and to what extent this would affect the results of analysis. Here we compare the Escherichia coli TRN built by Shen-Orr et al. (Nature Genet., 31, 64-68) with two TRNs reconstructed from RegulonDB and Ecocyc respectively and present an extended E.coli TRN by integrating information from these databases and literature. The scale of the extended TRN is about twice as large as the previous ones. The new network preserves the multi-layer hierarchical structure which we recently reported but has more layers. More global regulators are inferred. While the feed forward loop (FFL) is confirmed to be highly representative in the network, the distribution of the different types of FFLs is different from that based on the incomplete network. In contrast to the notion of motif aggregation and formation of homologous motif clusters, we found that most FFLs interact and form a giant motif cluster. Furthermore, we show that only a small portion of the genes is solely regulated by only one FFL. Many genes are regulated by two or more interacting FFLs or other more complicated network motifs together with transcriptional factors not belonging to any network motifs, thereby forming complex regulatory circuits. Overall, the extended TRN represents a more solid basis for structural and functional analysis of genome-wide gene regulation in E.coli.
Affiliation(s)
- Hong-Wu Ma
- Department of Genome Analysis and Department of Mucosal Immunity, GBF-German Research Center for Biotechnology, Mascheroder Weg 1, 38124 Braunschweig, Germany
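The abstract of Ma et al. above discusses feed forward loops (FFLs), the three-node motif in which a regulator X controls Y, and both X and Y control Z. As a minimal sketch of how such motifs can be enumerated in a small directed network, the snippet below lists every (X, Y, Z) triple with edges X→Y, Y→Z, and X→Z. Ignoring self-loops is an assumption of this sketch, and the toy edges are hypothetical rather than taken from the paper.

```python
def feed_forward_loops(edges):
    """Return all (x, y, z) with edges x->y, y->z and x->z, sorted."""
    targets = {}
    for regulator, target in edges:
        if regulator != target:  # assumption: autoregulation is ignored
            targets.setdefault(regulator, set()).add(target)

    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            for z in targets.get(y, ()):
                # z must also be a direct target of x, and differ from x
                if z != x and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


# Hypothetical toy network containing exactly one feed forward loop.
toy_edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("X", "W"), ("W", "W")]
print(feed_forward_loops(toy_edges))
```

This enumeration does not distinguish coherent from incoherent FFLs; that would require a sign (activation or repression) on each edge.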
|
162
|
Pérez-Rueda E, Collado-Vides J, Segovia L. Phylogenetic distribution of DNA-binding transcription factors in bacteria and archaea. Comput Biol Chem 2004; 28:341-50. [PMID: 15556475 DOI: 10.1016/j.compbiolchem.2004.09.004] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.7] [Received: 08/11/2004] [Revised: 09/13/2004] [Accepted: 09/15/2004] [Indexed: 11/21/2022]
Abstract
We have addressed the distribution and abundance of 75 transcription factor (TF) families in complete genomes from 90 different bacterial and archaeal species. We found that the proportion of TFs increases with genome size. The deficit of TFs in some genomes might be compensated by the presence of proteins organizing and compacting DNA, such as histone-like proteins. Nine families are represented in all the bacteria and archaea we analyzed, whereas 17 families are specific to bacteria, providing evidence for regulon specialization at an early stage of evolution between the bacterial and archaeal lineages. Ten of the 17 families identified in bacteria belong exclusively to the proteobacteria, defining a specific signature for this taxonomical group. In bacteria, 10 families are lost mostly in intracellular pathogens and endosymbionts, while 9 families seem to have been horizontally transferred to archaea. The winged helix-turn-helix (HTH) is by far the most abundant structure (motif) in prokaryotes, and might have been the earliest HTH motif to appear as shown by its distribution and abundance in both bacterial and archaeal cellular domains. Horizontal gene transfer and lineage-specific gene losses suggest a progressive elimination of TFs in the course of archaeal and bacterial evolution. This analysis provides a framework for discussing the selective forces directing the evolution of the transcriptional machinery in prokaryotes.
Affiliation(s)
- Ernesto Pérez-Rueda
- Facultad de Ciencias, UAEM, Av. Universidad 1001, CP. 62210, Col. Chamilpa, Cuernavaca, Morelos, México.
|
163
|
Abstract
MOTIVATION Annotation of operons in a bacterial genome is an important step in determining an organism's transcriptional regulatory program. While extensive studies of operon structure have been carried out in a few species such as Escherichia coli, fewer resources exist to inform operon prediction in newly sequenced genomes. In particular, many extant operon finders require a large body of training examples to learn the properties of operons in the target organism. For newly sequenced genomes, such examples are generally not available; moreover, a model of operons trained on one species may not reflect the properties of other, distantly related organisms. We encountered these issues in the course of predicting operons in the genome of Bacteroides thetaiotaomicron (B.theta), a common anaerobe that is a prominent component of the normal adult human intestinal microbial community. RESULTS We describe an operon predictor designed to work without extensive training data. We rely on a small set of a priori assumptions about the properties of the genome being annotated that permit estimation of the probability that two adjacent genes lie in a common operon. Predictions integrate several sources of information, including intergenic distance, common functional annotation and a novel formulation of conserved gene order. We validate our predictor both on the known operons of E.coli and on the genome of B.theta, using expression data to evaluate our predictions in the latter.
Affiliation(s)
- B P Westover
- Department of Computer Science and Engineering, Washington University St. Louis, MO 63130, USA
|
164
|
Wei GH, Liu DP, Liang CC. Charting gene regulatory networks: strategies, challenges and perspectives. Biochem J 2004; 381:1-12. [PMID: 15080794 PMCID: PMC1133755 DOI: 10.1042/bj20040311] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Received: 02/27/2004] [Revised: 04/13/2004] [Accepted: 04/13/2004] [Indexed: 11/17/2022]
Abstract
One of the foremost challenges in the post-genomic era will be to chart the gene regulatory networks of cells, including aspects such as genome annotation, identification of cis-regulatory elements and transcription factors, information on protein-DNA and protein-protein interactions, and data mining and integration. Some of these broad sets of data have already been assembled for building networks of gene regulation. Even though these datasets are still far from comprehensive, and the approach faces many important and difficult challenges, some strategies have begun to make connections between disparate regulatory events and to foster new hypotheses. In this article we review several different genomics and proteomics technologies, and present bioinformatics methods for exploring these data in order to make novel discoveries.
Affiliation(s)
- Gong-Hong Wei
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) and Peking Union Medical College (PUMC), 5 Dong Dan San Tiao, Beijing 100005, P.R. China
- De-Pei Liu
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) and Peking Union Medical College (PUMC), 5 Dong Dan San Tiao, Beijing 100005, P.R. China
- To whom correspondence should be addressed
- Chih-Chuan Liang
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) and Peking Union Medical College (PUMC), 5 Dong Dan San Tiao, Beijing 100005, P.R. China
|
165
|
de Vos WM, Bron PA, Kleerebezem M. Post-genomics of lactic acid bacteria and other food-grade bacteria to discover gut functionality. Curr Opin Biotechnol 2004; 15:86-93. [PMID: 15081044 DOI: 10.1016/j.copbio.2004.02.006] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.1] [Indexed: 11/25/2022]
Abstract
Recent years have seen an explosion in the number of complete or almost complete genomic sequences of lactic acid bacteria and other food-grade bacteria that are used in functional foods to increase the health of the consumer. These have been instrumental in the development of functional, comparative and other post-genomics approaches that provide the possibility to detect, unravel and understand their functionality in the human intestinal tract. In conjunction with other high-throughput approaches, these advances can be exploited in the functional food innovation cycle for developing new or designed probiotic and other bacterial products that impact gut health.
Affiliation(s)
- Willem M de Vos
- Wageningen Center for Food Sciences and Laboratory of Microbiology, Diedenweg 20, PO Box 557, 6700 AN, Wageningen, The Netherlands.
|
167
|
Peter BJ, Arsuaga J, Breier AM, Khodursky AB, Brown PO, Cozzarelli NR. Genomic transcriptional response to loss of chromosomal supercoiling in Escherichia coli. Genome Biol 2004; 5:R87. [PMID: 15535863 PMCID: PMC545778 DOI: 10.1186/gb-2004-5-11-r87] [Citation(s) in RCA: 245] [Impact Index Per Article: 12.3] [Received: 08/11/2004] [Revised: 10/01/2004] [Accepted: 10/11/2004] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND The chromosome of Escherichia coli is maintained in a negatively supercoiled state, and supercoiling levels are affected by growth phase and a variety of environmental stimuli. In turn, supercoiling influences local DNA structure and can affect gene expression. We used microarrays representing nearly the entire genome of Escherichia coli MG1655 to examine the dynamics of chromosome structure. RESULTS We measured the transcriptional response to a loss of supercoiling caused either by genetic impairment of a topoisomerase or addition of specific topoisomerase inhibitors during log-phase growth and identified genes whose changes are statistically significant. Transcription of 7% of the genome (306 genes) was rapidly and reproducibly affected by changes in the level of supercoiling; the expression of 106 genes increased upon chromosome relaxation and the expression of 200 decreased. These changes are most likely to be direct effects, as the kinetics of their induction or repression closely follow the kinetics of DNA relaxation in the cells. Unexpectedly, the genes induced by relaxation have a significantly enriched AT content in both upstream and coding regions. CONCLUSIONS The 306 supercoiling-sensitive genes are functionally diverse and widely dispersed throughout the chromosome. We propose that supercoiling acts as a second messenger that transmits information about the environment to many regulatory networks in the cell.
Affiliation(s)
- Brian J Peter
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3204, USA
- Current address: Neurobiology Division, MRC Laboratory of Molecular Biology, Cambridge CB2 2QH, UK
- Javier Arsuaga
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3204, USA
- Mathematics Department, University of California, Berkeley, CA 94720, USA
- Adam M Breier
- Graduate Group in Biophysics, University of California, Berkeley, CA 94720, USA
- Arkady B Khodursky
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, St. Paul, MN 55108, USA
- Patrick O Brown
- Department of Biochemistry and Howard Hughes Medical Institute, Stanford University, Stanford, CA 94305-5307, USA
- Nicholas R Cozzarelli
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720-3204, USA
|
168
|
Kiryu H, Oshima T, Asai K. Extracting relations between promoter sequences and their strengths from microarray data. Bioinformatics 2004; 21:1062-8. [PMID: 15513998 DOI: 10.1093/bioinformatics/bti094] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: The relations between promoter sequences and their strengths were extensively studied in the 1980s. Although these studies uncovered strong sequence-strength correlations, the cost of their elaborate experimental methods has been too high for them to be applied to a large number of promoters. In contrast, the recent increase in microarray data allows us to compare the expression of thousands of genes with their DNA sequences.
RESULTS: We studied the relations between the promoter sequences and their strengths using Escherichia coli microarray data. We modeled those relations using a simple weight matrix, which was optimized with a novel support vector regression method. It was observed that several non-consensus bases in the '-35' and '-10' regions of promoter sequences act positively on the promoter strength and that certain consensus bases have a minor effect on the strength. We analyzed outliers for which the observed gene expression deviates from the promoter strength predictions, and identified several genes with enhanced expression due to multiple promoters and genes under strong regulation by transcription factors. Our method is applicable to other procaryotes for which both the promoter sequences and the microarray data are available.
Affiliation(s)
- Hisanori Kiryu
- Graduate School of Information Sciences, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
|
169
|
Reed JL, Palsson BØ. Genome-scale in silico models of E. coli have multiple equivalent phenotypic states: assessment of correlated reaction subsets that comprise network states. Genome Res 2004; 14:1797-805. [PMID: 15342562 PMCID: PMC515326 DOI: 10.1101/gr.2546004] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.2] [Indexed: 11/24/2022]
Abstract
The constraint-based analysis of genome-scale metabolic and regulatory networks has been successful in predicting phenotypes and useful for analyzing high-throughput data sets. Within this modeling framework, linear optimization has been used to study genome-scale metabolic models, resulting in the enumeration of single optimal solutions describing the best use of the network to support growth. Here mixed-integer linear programming was used to calculate and study a subset of the alternate optimal solutions for a genome-scale metabolic model of Escherichia coli (iJR904) under a wide variety of environmental conditions. Analysis of the calculated sets of optimal solutions found that: (1) only a small subset of reactions in the network have variable fluxes across optima; (2) sets of reactions that are always used together in optimal solutions, correlated reaction sets, showed moderate agreement with the currently known transcriptional regulatory structure in E. coli and available expression data, and (3) reactions that are used under certain environmental conditions can provide clues about network regulatory needs. In addition, calculation of suboptimal flux distributions, using flux variability analysis, identified reactions which are used under significantly more environmental conditions suboptimally than optimally. Together these results demonstrate the utilization of reactions in genome-scale models under a variety of different growth conditions.
Affiliation(s)
- Jennifer L Reed
- Department of Bioengineering, University of California, San Diego, San Diego, California 92092-0412, USA
|
170
|
Janga SC, Moreno-Hagelsieb G. Conservation of adjacency as evidence of paralogous operons. Nucleic Acids Res 2004; 32:5392-7. [PMID: 15477389 PMCID: PMC524292 DOI: 10.1093/nar/gkh882] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022] Open
Abstract
Most analyses of the conservation of gene order are limited to orthologous genes. However, the organization of genes into operons might also result in the conservation of gene order of paralogous genes. Thus, we sought computational evidence that conservation of gene order of paralogous genes represents another level of conservation of genes in operons. We found that pairs of genes within experimentally characterized operons of Escherichia coli K12 and Bacillus subtilis tend to have more adjacently conserved paralogs than pairs of genes at transcription unit boundaries. The fraction of same-strand gene pairs corresponding to conserved paralogs averages 0.07 with a maximum of 0.22 in Borrelia burgdorferi. The use of evidence from the conservation of adjacency of paralogous genes can improve the prediction of operons in E. coli K12 by approximately 0.27 over predictions using conservation of adjacency of orthologous genes alone.
Affiliation(s)
- Sarath Chandra Janga
- Program of Computational Genomics, CIFN-UNAM, Apdo Postal 565-A, Cuernavaca, Morelos, 62100 Mexico
|
171
|
Warren PB, ten Wolde PR. Statistical Analysis of the Spatial Distribution of Operons in the Transcriptional Regulation Network of Escherichia coli. J Mol Biol 2004; 342:1379-90. [PMID: 15364567 DOI: 10.1016/j.jmb.2004.07.074] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Received: 12/09/2003] [Revised: 07/08/2004] [Accepted: 07/21/2004] [Indexed: 10/26/2022]
Abstract
We have performed a statistical analysis of the spatial distribution of operons along the DNA in the transcriptional regulation network of Escherichia coli. The analysis reveals that pairs of operons that regulate each other and those that are co-regulated tend to lie much closer to one another than would be expected for a random network. Moreover, these pairs of operons tend to be transcribed in diverging directions. This spatial arrangement of operons allows the upstream regulatory domains to overlap and interfere with each other and our analysis also demonstrates the statistical significance of this motif of overlapping operons. Overlapping operons afford additional regulatory control, such as the correlated or anticorrelated expression of operons. We show by a mean-field analysis of a feed-forward loop that overlapping operons can drastically enhance the performance of gene regulatory networks. Our results suggest that regulatory control can provide a selective pressure that drives operons together in the course of evolution.
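Illustrative sketch (not taken from the paper): one simple way to ask whether linked operon pairs lie closer together than expected for a random network is a randomization test on a circular chromosome. The null model used here (reassigning the same coordinates to operons at random while keeping the network wiring) and all names (`circular_distance`, `mean_pair_distance`, `randomization_test`) are assumptions made for illustration and may differ from the authors' statistical procedure.

```python
import random


def circular_distance(a, b, genome_length):
    """Shortest distance between two coordinates on a circular chromosome."""
    d = abs(a - b) % genome_length
    return min(d, genome_length - d)


def mean_pair_distance(positions, pairs, genome_length):
    """Mean circular distance over a list of (operon, operon) pairs."""
    total = sum(
        circular_distance(positions[i], positions[j], genome_length)
        for i, j in pairs
    )
    return total / len(pairs)


def randomization_test(positions, pairs, genome_length, trials=1000, seed=0):
    """Return (observed mean distance, fraction of shuffles at least as close).

    Assumed null model: operon coordinates are shuffled among operons,
    while the regulatory pairs themselves are left unchanged.
    """
    rng = random.Random(seed)
    observed = mean_pair_distance(positions, pairs, genome_length)
    names = list(positions)
    coords = [positions[name] for name in names]
    hits = 0
    for _ in range(trials):
        rng.shuffle(coords)
        shuffled = dict(zip(names, coords))
        if mean_pair_distance(shuffled, pairs, genome_length) <= observed:
            hits += 1
    return observed, hits / trials
```

A small returned fraction would indicate that the linked pairs are closer than most random placements; the operon coordinates and pair list have to be supplied by the user.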
Affiliation(s)
- P B Warren
- FOM Institute for Atomic and Molecular Physics, Kruislaan 407, 1098 SJ Amsterdam, The Netherlands
|
172
|
Nasvall SJ, Chen P, Bjork GR. The modified wobble nucleoside uridine-5-oxyacetic acid in tRNAPro(cmo5UGG) promotes reading of all four proline codons in vivo. RNA (New York, N.Y.) 2004; 10:1662-73. [PMID: 15383682 PMCID: PMC1370651 DOI: 10.1261/rna.7106404] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Indexed: 05/08/2023]
Abstract
In Salmonella enterica serovar Typhimurium five of the eight family codon boxes are decoded by a tRNA having the modified nucleoside uridine-5-oxyacetic acid (cmo5U) as a wobble nucleoside present in position 34 of the tRNA. In the proline family codon box, one (tRNAPro(cmo5UGG)) of the three tRNAs that read the four proline codons has cmo5U34. According to theoretical predictions and several results obtained in vitro, cmo5U34 should base pair with A, G, and U in the third position of the codon but not with C. To analyze the function of cmo5U34 in tRNAPro(cmo5UGG) in vivo, we first identified two genes (cmoA and cmoB) involved in the synthesis of cmo5U34. The null mutation cmoB2 results in tRNA having 5-hydroxyuridine (ho5U34) instead of cmo5U34, whereas the null mutation cmoA1 results in the accumulation of 5-methoxyuridine (mo5U34) and ho5U34 in tRNA. The results suggest that the synthesis of cmo5U34 occurs as follows: U34 -->(?) ho5U -->(CmoB) mo5U -->(CmoA?) cmo5U. We introduced the cmoA1 or the cmoB2 null mutations into a strain that only had tRNAPro(cmo5UGG) and thus lacked the other two proline-specific tRNAs normally present in the cell. From analysis of growth rates of various strains and of the frequency of +1 frameshifting at a CCC-U site we conclude: (1) unexpectedly, tRNAPro(cmo5UGG) is able to read all four proline codons; (2) the presence of ho5U34 instead of cmo5U34 in this tRNA reduces the efficiency with which it reads all four codons; and (3) the fully modified nucleoside is especially important for reading proline codons ending with U or C.
Affiliation(s)
- S Joakim Nasvall
- Department of Molecular Biology, Umeå University, S-90 187 Umeå, Sweden
|
173
|
Kelley BP, Yuan B, Lewitter F, Sharan R, Stockwell BR, Ideker T. PathBLAST: a tool for alignment of protein interaction networks. Nucleic Acids Res 2004; 32:W83-8. [PMID: 15215356 PMCID: PMC441549 DOI: 10.1093/nar/gkh411] [Citation(s) in RCA: 292] [Impact Index Per Article: 14.6] [Received: 02/15/2004] [Revised: 04/01/2004] [Accepted: 04/01/2004] [Indexed: 11/13/2022] Open
Abstract
PathBLAST is a network alignment and search tool for comparing protein interaction networks across species to identify protein pathways and complexes that have been conserved by evolution. The basic method searches for high-scoring alignments between pairs of protein interaction paths, for which proteins of the first path are paired with putative orthologs occurring in the same order in the second path. This technique discriminates between true- and false-positive interactions and allows for functional annotation of protein interaction pathways based on similarity to the network of another, well-characterized species. PathBLAST is now available at http://www.pathblast.org/ as a web-based query. In this implementation, the user specifies a short protein interaction path for query against a target protein-protein interaction network selected from a network database. PathBLAST returns a ranked list of matching paths from the target network along with a graphical view of these paths and the overlap among them. Target protein-protein interaction networks are currently available for Helicobacter pylori, Saccharomyces cerevisiae, Caenorhabditis elegans and Drosophila melanogaster. Just as BLAST enables rapid comparison of protein sequences between genomes, tools such as PathBLAST are enabling comparative genomics at the network level.
Affiliation(s)
- Brian P Kelley
- Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
|
174
|
Rigali S, Schlicht M, Hoskisson P, Nothaft H, Merzbacher M, Joris B, Titgemeyer F. Extending the classification of bacterial transcription factors beyond the helix-turn-helix motif as an alternative approach to discover new cis/trans relationships. Nucleic Acids Res 2004; 32:3418-26. [PMID: 15247334 PMCID: PMC443547 DOI: 10.1093/nar/gkh673] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.5] [Indexed: 11/13/2022] Open
Abstract
Transcription factors (TFs) of bacterial helix-turn-helix superfamilies exhibit different effector-binding domains (EBDs) fused to a DNA-binding domain with a common feature. In a previous study of the GntR superfamily, we demonstrated that classifying members into subfamilies according to the EBD heterogeneity highlighted unsuspected and accurate TF-binding site signatures. In this work, we show how such in silico analysis can provide prediction tools to discover new cis/trans relationships. The TF-binding site consensus of the HutC/GntR subfamily was used to (i) predict target sites within the Streptomyces coelicolor genome, (ii) discover a new HutC/GntR regulon and (iii) discover its specific TF. By scanning the S. coelicolor genome we identified a presumed new HutC regulon that comprises genes of the phosphotransferase system (PTS) specific for the uptake of N-acetylglucosamine (PTS(Nag)). A weight matrix was derived from the compilation of the predicted cis-acting elements upstream of each gene of the presumed regulon. Under the assumption that TFs are often subject to autoregulation, we used this matrix to scan the upstream regions of the 24 HutC-like members of S. coelicolor. The orf SCO5231 (dasR) was selected as the best candidate according to the high score of a 16 bp sequence identified in its upstream region. Our prediction that DasR regulates the PTS(Nag) regulon was confirmed by in vivo and in vitro experiments. In conclusion, our in silico approach made it possible to identify the specific TF of a regulon out of the 673 orfs annotated as 'regulatory proteins' within the genome of S. coelicolor.
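Illustrative sketch (not taken from the paper): the weight-matrix step described in the abstract can be pictured as building a log-odds matrix from aligned binding sites and sliding it along an upstream region to find the best-scoring window. The function names, the pseudocount, and the uniform background frequency of 0.25 are assumptions made for illustration; the authors' actual scoring scheme may differ, and a uniform background is a poor fit for a GC-rich genome.

```python
import math


def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length aligned sites (A/C/G/T)."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for pos in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in "ACGT"}
        )
    return matrix


def best_hit(sequence, matrix):
    """Return (score, offset) of the best-scoring window, or None if too short."""
    width = len(matrix)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(column[base] for column, base in zip(matrix, window))
        if best is None or score > best[0]:
            best = (score, offset)
    return best
```

Ranking candidate regulators then amounts to calling `best_hit` on the upstream region of each candidate gene and sorting by score; the sequences must be supplied by the user and contain only uppercase A, C, G and T.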
Affiliation(s)
- Sébastien Rigali
- Centre d'Ingénierie des Protéines, Université de Liège, Institut de Chimie B6a, B-4000, Liège, Belgium.
|
175
|
Nickels BE, Mukhopadhyay J, Garrity SJ, Ebright RH, Hochschild A. The sigma 70 subunit of RNA polymerase mediates a promoter-proximal pause at the lac promoter. Nat Struct Mol Biol 2004; 11:544-50. [PMID: 15122345 DOI: 10.1038/nsmb757] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.8] [Received: 12/08/2003] [Accepted: 03/15/2004] [Indexed: 12/20/2022]
Abstract
The sigma(70) subunit of RNA polymerase plays an essential role in transcription initiation. In addition, sigma(70) has a critical regulatory role during transcription elongation at the bacteriophage lambda late promoter, lambda P(R'). At this promoter, sigma(70) mediates a pause in early elongation through contact with a DNA sequence element in the initially transcribed region that resembles a promoter -10 element. Here we provide evidence that sigma(70) also mediates a pause in early elongation at the lac promoter (plac). Like that at lambda P(R'), the pause at plac is facilitated by a sequence element in the initially transcribed region that resembles a promoter -10 element. Using biophysical analysis, we demonstrate that the pause-inducing sequence element at plac stabilizes the interaction between sigma(70) and the remainder of the transcription elongation complex. Bioinformatic analysis suggests that promoter-proximal sigma(70)-dependent pauses may play a role in the regulation of many bacterial promoters.
Affiliation(s)
- Bryce E Nickels
- Department of Microbiology and Molecular Genetics, Harvard Medical School, 200 Longwood Avenue, Boston, Massachusetts 02115, USA
|
176
|
Evangelisti AM, Wagner A. Molecular evolution in the yeast transcriptional regulation network. J Exp Zool B Mol Dev Evol 2004; 302:392-411. [PMID: 15287103 DOI: 10.1002/jez.b.20027] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Indexed: 11/09/2022]
Abstract
We analyze the structure of the yeast transcriptional regulation network, as revealed by chromatin immunoprecipitation experiments, and characterize the molecular evolution of both its transcriptional regulators and their target (regulated) genes. We test the hypothesis that highly connected genes are more important to the function of gene networks. Three lines of evidence (the rate of molecular evolution of network genes, the rate at which network genes undergo gene duplication, and the effects of synthetic null mutation in network genes) provide no strong support for this hypothesis. In addition, we ask how network genes diverge in their transcriptional regulation after duplication. Both loss (subfunctionalization) and gain (neofunctionalization) of transcription factor binding play a role in this divergence, which is often rapid. On the one hand, gene duplicates experience a net loss in the number of transcription factors binding to them, indicating the importance of losing transcription factor binding sites after gene duplication. On the other hand, the number of transcription factors that bind to highly diverged duplicates is significantly greater than would be expected if loss of binding played the only role in the divergence of duplicate genes.
|