1
Sepúlveda-Rebolledo P, González-Rosales C, Dopson M, Pérez-Rueda E, Holmes DS, Valdés JH. Comparative genomics sheds light on transcription factor-mediated regulation in the extreme acidophilic Acidithiobacillia representatives. Res Microbiol 2024; 175:104135. [PMID: 37678513 DOI: 10.1016/j.resmic.2023.104135] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/29/2023] [Revised: 08/28/2023] [Accepted: 08/30/2023] [Indexed: 09/09/2023]
Abstract
Extreme acidophiles thrive in acidic environments, confront a multitude of challenges, and demonstrate remarkable adaptability in their metabolism to cope with the ever-changing environmental fluctuations, which encompass variations in temperature, pH levels, and the availability of electron acceptors and donors. The survival and proliferation of members within the Acidithiobacillia class rely on the deployment of transcriptional regulatory systems linked to essential physiological traits. The study of these transcriptional regulatory systems provides valuable insights into critical processes, such as energy metabolism and nutrient assimilation, and how they integrate into major genetic-metabolic circuits. In this study, we examined the transcriptional regulatory repertoires and potential interactions of forty-three Acidithiobacillia complete and draft genomes, encompassing nine species. To investigate the function and diversity of Transcription Factors (TFs) and their DNA Binding Sites (DBSs), we conducted a genome-wide comparative analysis, which allowed us to identify these regulatory elements in representatives of Acidithiobacillia. We classified TFs into gene families and compared their occurrence among all representatives, revealing conservation patterns across the class. The results identified conserved regulators for several pathways, including iron and sulfur oxidation, the main pathways for energy acquisition, providing new evidence for viable regulatory interactions and branch-specific conservation in Acidithiobacillia. The identification of TFs and DBSs not only corroborates existing experimental information for selected species, but also introduces novel candidates for experimental validation. Moreover, these promising candidates have the potential for further extension to new representatives within the class.
Affiliation(s)
- Pedro Sepúlveda-Rebolledo
  - Centro de Genómica y Bioinformática and PhD Program on Integrative Genomics, Facultad de Ciencias, Universidad Mayor, Santiago (8580745), Chile.
- Carolina González-Rosales
  - Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida, Santiago (8580638), Chile; Centre for Ecology and Evolution in Microbial Model Systems, Linnaeus University, SE-391 82 Kalmar, Sweden.
- Mark Dopson
  - Centre for Ecology and Evolution in Microbial Model Systems, Linnaeus University, SE-391 82 Kalmar, Sweden.
- Ernesto Pérez-Rueda
  - Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Unidad Académica del Estado de Yucatán, Mérida, Yucatán, Mexico.
- David S Holmes
  - Center for Bioinformatics and Genome Biology, Fundación Ciencia & Vida, Santiago (8580638), Chile; Facultad de Medicina y Ciencia, Universidad San Sebastián, Santiago (7510156), Chile.
- Jorge H Valdés
  - Center for Bioinformatics and Integrative Biology, Facultad de Ciencias de la Vida, Universidad Andrés Bello, Santiago (8370146), Chile.
2
Regulation of Gene Expression in Shewanella oneidensis MR-1 during Electron Acceptor Limitation and Bacterial Nanowire Formation. Appl Environ Microbiol 2016; 82:5428-43. [PMID: 27342561 DOI: 10.1128/aem.01615-16] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 06/01/2016] [Accepted: 06/22/2016] [Indexed: 12/12/2022] Open
Abstract
When oxygen is limiting as an electron acceptor, the dissimilatory metal-reducing bacterium Shewanella oneidensis MR-1 rapidly forms nanowires, extensions of its outer membrane containing the cytochromes MtrC and OmcA needed for extracellular electron transfer. RNA sequencing (RNA-Seq) analysis was employed to determine differential gene expression over time from triplicate chemostat cultures that were limited for oxygen. We identified 465 genes with decreased expression and 677 genes with increased expression. The coordinated increased expression of heme biosynthesis, cytochrome maturation, and transport pathways indicates that S. oneidensis MR-1 increases cytochrome production, including the transcription of genes encoding MtrA, MtrC, and OmcA, and transports these decaheme cytochromes across the cytoplasmic membrane during electron acceptor limitation and nanowire formation. In contrast, the expression of the mtrA and mtrC homologs mtrF and mtrD either remains unaffected or decreases under these conditions. The ompW gene, encoding a small outer membrane porin, has 40-fold higher expression during oxygen limitation, and it is proposed that OmpW plays a role in cation transport to maintain electrical neutrality during electron transfer. The genes encoding the anaerobic respiration regulator cyclic AMP receptor protein (CRP) and the extracytoplasmic function sigma factor RpoE are among the transcription factor genes with increased expression. RpoE might function by signaling the initial response to oxygen limitation. Our results show that RpoE activates transcription from promoters upstream of mtrC and omcA. The transcriptome and mutant analyses of S. oneidensis MR-1 nanowire production are consistent with independent regulatory mechanisms for extending the outer membrane into tubular structures and for ensuring the electron transfer function of the nanowires.
IMPORTANCE Shewanella oneidensis MR-1 has the capacity to transfer electrons to its external surface using extensions of the outer membrane called bacterial nanowires. These bacterial nanowires link the cell's respiratory chain to external surfaces, including oxidized metals important in bioremediation, and explain why S. oneidensis can be utilized as a component of microbial fuel cells, a form of renewable energy. In this work, we use differential gene expression analysis to focus on which genes function to produce the nanowires and promote extracellular electron transfer during oxygen limitation. Among the genes that are expressed at high levels are those encoding cytochrome proteins necessary for electron transfer. Shewanella coordinates the increased expression of regulators, metabolic pathways, and transport pathways to ensure that cytochromes efficiently transfer electrons along the nanowires.
3
UpCoT: an integrated pipeline tool for clustering upstream DNA sequences of orthologous genes in prokaryotic genomes. 3 Biotech 2016; 6:74. [PMID: 28330144 PMCID: PMC4755962 DOI: 10.1007/s13205-016-0363-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2015] [Accepted: 01/08/2016] [Indexed: 11/30/2022] Open
Abstract
UpCoT is a pipeline tool developed by automating the series of steps involved in prediction of cis-regulatory elements. UpCoT generates orthologs for each gene in target genome using bi-directional best blast hit against the reference genomes, then identifies potential orthologous transcriptional units using intergenic distance. Finally it generates the FASTA files containing upstream sequences of orthologous transcriptional units of each gene in target genome. The inputs of UpCoT are protein sequence files (*.faa), genome sequence files (*.fna) and gene co-ordinate files (*.ptt) for target and reference genomes. The clustered-upstream DNA sequences can be used by motif prediction tool, such as MEME, Bio-prospector, Gibbs motif sampler, MDscan for prediction of conserved DNA elements. We tested the performance of UpCoT by selecting the genome of Synechocystis sp PCC 6803 as the target and 13 different cyanobacterial genomes as reference. The clustered upstream sequences generated by UpCoT of groES, ycf24 and nirA were used for cis-regulatory element prediction. The results were consistent with the experimentally identified cis-regulatory elements. Therefore, UpCoT is a reliable and automated pipeline package for prediction of orthologs, orthologous transcriptional units, and orthologous upstream sequences of a selected prokaryotic genome. UpCoT can be downloaded from http://jssplab.uohyd.ac.in/upcot/.
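The ortholog and transcriptional-unit steps described in this abstract can be illustrated with a minimal Python sketch. It is not UpCoT's code: the best-hit table format, the function names `bidirectional_best_hits` and `group_transcriptional_units`, and the 50 bp intergenic threshold are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical input: best-scoring BLAST hit per query gene,
# stored as {query_gene: (subject_gene, bitscore)}.
BestHits = Dict[str, Tuple[str, float]]


def bidirectional_best_hits(target_vs_ref: BestHits, ref_vs_target: BestHits) -> Dict[str, str]:
    """Return target -> reference gene pairs that are each other's best hit."""
    orthologs = {}
    for target_gene, (ref_gene, _score) in target_vs_ref.items():
        back = ref_vs_target.get(ref_gene)
        if back is not None and back[0] == target_gene:
            orthologs[target_gene] = ref_gene
    return orthologs


# Hypothetical gene record: (name, start, end, strand), sorted by start coordinate.
Gene = Tuple[str, int, int, str]


def group_transcriptional_units(genes: List[Gene], max_gap: int = 50) -> List[List[str]]:
    """Group adjacent same-strand genes separated by at most max_gap bp (assumed threshold)."""
    units: List[List[str]] = []
    prev = None
    for gene in genes:
        name, start, _end, strand = gene
        if prev is not None and strand == prev[3] and start - prev[2] <= max_gap:
            units[-1].append(name)  # extend the current putative unit
        else:
            units.append([name])  # start a new putative unit
        prev = gene
    return units
```

The upstream sequence of the first gene in each unit would then be the natural input for a motif finder such as MEME; the appropriate distance cutoff depends on the genome and should be tuned rather than taken from this sketch.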
4
Fernandez L, Mercader JM, Planas-Fèlix M, Torrents D. Adaptation to environmental factors shapes the organization of regulatory regions in microbial communities. BMC Genomics 2014; 15:877. [PMID: 25294412 PMCID: PMC4287501 DOI: 10.1186/1471-2164-15-877] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/04/2014] [Accepted: 09/24/2014] [Indexed: 11/10/2022] Open
Abstract
Background It has been shown in a number of metagenomic studies that the addition and removal of specific genes have allowed microbiomes to adapt to specific environmental conditions by losing and gaining specific functions. But it is not known whether and how the regulation of gene expression also contributes to adaptation. Results We have here characterized and analyzed the metaregulome of three different environments, as well as its impact on adaptation to particular variable physico-chemical conditions. For this, we have developed a computational protocol to extract regulatory regions and their corresponding transcription factor binding sites directly from metagenomic reads and applied it to three well-known environments: Acid Mine, Whale Fall, and Waseca Farm. Taking the density of regulatory sites in promoters as a measure of the potential and complexity of gene regulation, we found it to be quantitatively the same in all three environments, despite their different physico-chemical conditions and species composition. However, we found that each environment distributes its regulatory potential differently across its functional space. Among the functions with highest regulatory potential in each niche, we found significant enrichment of processes related to sensing and buffering external variable factors specific to each environment, for example, the availability of co-factors in deep sea, of oligosaccharides in soil and the regulation of pH in the acid mine. Conclusions These results highlight the potential impact of gene regulation in the adaptation of bacteria to the different habitats through the distribution of their regulatory potential among specific functions, and point to critical environmental factors that challenge the growth of any microbial community.
Affiliation(s)
- David Torrents
  - Joint IRB-BSC program on Computational Biology, BSC, Jordi Girona, 29, 08034 Barcelona, Spain.
5
Ravcheev DA, Khoroshkin MS, Laikova ON, Tsoy OV, Sernova NV, Petrova SA, Rakhmaninova AB, Novichkov PS, Gelfand MS, Rodionov DA. Comparative genomics and evolution of regulons of the LacI-family transcription factors. Front Microbiol 2014; 5:294. [PMID: 24966856 PMCID: PMC4052901 DOI: 10.3389/fmicb.2014.00294] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 04/06/2014] [Accepted: 05/28/2014] [Indexed: 12/31/2022] Open
Abstract
DNA-binding transcription factors (TFs) are essential components of transcriptional regulatory networks in bacteria. LacI-family TFs (LacI-TFs) are broadly distributed among certain lineages of bacteria. The majority of characterized LacI-TFs sense sugar effectors and regulate carbohydrate utilization genes. The comparative genomics approaches enable in silico identification of TF-binding sites and regulon reconstruction. To study the function and evolution of LacI-TFs, we performed genomics-based reconstruction and comparative analysis of their regulons. For over 1300 LacI-TFs from over 270 bacterial genomes, we predicted their cognate DNA-binding motifs and identified target genes. Using the genome context and metabolic subsystem analyses of reconstructed regulons, we tentatively assigned functional roles and predicted candidate effectors for 78 and 67% of the analyzed LacI-TFs, respectively. Nearly 90% of the studied LacI-TFs are local regulators of sugar utilization pathways, whereas the remaining 125 global regulators control large and diverse sets of metabolic genes. The global LacI-TFs include the previously known regulators CcpA in Firmicutes, FruR in Enterobacteria, and PurR in Gammaproteobacteria, as well as the three novel regulators—GluR, GapR, and PckR—that are predicted to control the central carbohydrate metabolism in three lineages of Alphaproteobacteria. Phylogenetic analysis of regulators combined with the reconstructed regulons provides a model of evolutionary diversification of the LacI protein family. The obtained genomic collection of in silico reconstructed LacI-TF regulons in bacteria is available in the RegPrecise database (http://regprecise.lbl.gov). It provides a framework for future structural and functional classification of the LacI protein family and identification of molecular determinants of the DNA and ligand specificity. 
The inferred regulons can also be used for functional gene annotation and reconstruction of sugar catabolic networks in diverse bacterial lineages.
Affiliation(s)
- Dmitry A Ravcheev
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Matvei S Khoroshkin
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Olga N Laikova
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Olga V Tsoy
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia
- Natalia V Sernova
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Svetlana A Petrova
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Faculty of Bioengineering and Bioinformatics, Moscow State University, Moscow, Russia
- Pavel S Novichkov
  - Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA, USA
- Mikhail S Gelfand
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Dmitry A Rodionov
  - Research Scientific Center for Bioinformatics, A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia; Department of Bioinformatics, Sanford-Burnham Medical Research Institute, La Jolla, CA, USA
6
Leyn SA, Kazanov MD, Sernova NV, Ermakova EO, Novichkov PS, Rodionov DA. Genomic reconstruction of the transcriptional regulatory network in Bacillus subtilis. J Bacteriol 2013; 195:2463-73. [PMID: 23504016 PMCID: PMC3676070 DOI: 10.1128/jb.00140-13] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 02/04/2013] [Accepted: 03/11/2013] [Indexed: 12/26/2022] Open
Abstract
The adaptation of microorganisms to their environment is controlled by complex transcriptional regulatory networks (TRNs), which are still only partially understood even for model species. Genome scale annotation of regulatory features of genes and TRN reconstruction are challenging tasks of microbial genomics. We used the knowledge-driven comparative-genomics approach implemented in the RegPredict Web server to infer TRN in the model Gram-positive bacterium Bacillus subtilis and 10 related Bacillales species. For transcription factor (TF) regulons, we combined the available information from the DBTBS database and the literature with bioinformatics tools, allowing inference of TF binding sites (TFBSs), comparative analysis of the genomic context of predicted TFBSs, functional assignment of target genes, and effector prediction. For RNA regulons, we used known RNA regulatory motifs collected in the Rfam database to scan genomes and analyze the genomic context of new RNA sites. The inferred TRN in B. subtilis comprises regulons for 129 TFs and 24 regulatory RNA families. First, we analyzed 66 TF regulons with previously known TFBSs in B. subtilis and projected them to other Bacillales genomes, resulting in refinement of TFBS motifs and identification of novel regulon members. Second, we inferred motifs and described regulons for 28 experimentally studied TFs with previously unknown TFBSs. Third, we discovered novel motifs and reconstructed regulons for 36 previously uncharacterized TFs. The inferred collection of regulons is available in the RegPrecise database (http://regprecise.lbl.gov/) and can be used in genetic experiments, metabolic modeling, and evolutionary analysis.
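Projecting known TFBSs onto related genomes, as this abstract describes, usually starts from a position weight matrix built from aligned sites. The Python sketch below shows that general idea only; it is not the RegPredict implementation, and the uniform background, the pseudocount of 1, and the function names `build_pwm` and `scan` are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

BASES = "ACGT"


def build_pwm(sites: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Log-odds position weight matrix from equal-length aligned sites (uniform background assumed)."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [s[i].upper() for s in sites]
        pwm.append({b: math.log2(((column.count(b) + pseudocount) / total) / 0.25) for b in BASES})
    return pwm


def scan(sequence: str, pwm: List[Dict[str, float]], threshold: float) -> List[Tuple[int, str, float]]:
    """Return (0-based position, window, score) for windows scoring at or above threshold."""
    width = len(pwm)
    seq = sequence.upper()
    hits = []
    for pos in range(len(seq) - width + 1):
        window = seq[pos:pos + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((pos, window, score))
    return hits
```

A real scan would cover both strands and restrict candidate sites to upstream regions; the score threshold is typically calibrated on the training sites rather than fixed in advance.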
Affiliation(s)
- Semen A. Leyn
  - Sanford-Burnham Medical Research Institute, La Jolla, California, USA
  - A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Marat D. Kazanov
  - A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Natalia V. Sernova
  - A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Ekaterina O. Ermakova
  - A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Dmitry A. Rodionov
  - Sanford-Burnham Medical Research Institute, La Jolla, California, USA
  - A. A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
7
Cipriano MJ, Novichkov PN, Kazakov AE, Rodionov DA, Arkin AP, Gelfand MS, Dubchak I. RegTransBase--a database of regulatory sequences and interactions based on literature: a resource for investigating transcriptional regulation in prokaryotes. BMC Genomics 2013; 14:213. [PMID: 23547897 PMCID: PMC3639892 DOI: 10.1186/1471-2164-14-213] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 11/15/2012] [Accepted: 03/22/2013] [Indexed: 11/10/2022] Open
Abstract
Background Due to the constantly growing number of sequenced microbial genomes, comparative genomics has been playing a major role in the investigation of regulatory interactions in bacteria. Regulon inference mostly remains a field of semi-manual examination, since the absence of a knowledgebase and informatics platform for automated and systematic investigation restricts opportunities for computational prediction. Additionally, confirming computationally inferred regulons by experimental data is critically important. Description RegTransBase is an open-access platform with a user-friendly web interface publicly available at http://regtransbase.lbl.gov. It consists of two databases: a manually collected hierarchical regulatory interactions database based on more than 7000 scientific papers, which can serve as a knowledgebase for verification of predictions, and a large set of expert-curated transcription factor binding sites used in regulon inference by a variety of tools. RegTransBase captures the knowledge from published scientific literature using controlled vocabularies and contains various types of experimental data, such as: the activation or repression of transcription by an identified direct regulator; determination of the transcriptional regulatory function of a protein (or RNA) directly binding to DNA or RNA; mapping of binding sites for a regulatory protein; characterization of regulatory mutations. Analysis of the data collected from literature resulted in the creation of Putative Regulons from Experimental Data that are also available in RegTransBase. Conclusions RegTransBase is a powerful user-friendly platform for the investigation of regulation in prokaryotes. It uses a collection of validated regulatory sequences that can be easily extracted and used to infer regulatory interactions by comparative genomics techniques, thus assisting researchers in the interpretation of transcriptional regulation data.
Affiliation(s)
- Michael J Cipriano
- Department of Microbiology, University of California Davis, Davis, CA 95616, USA
8
Rodionov DA, Novichkov PS, Stavrovskaya ED, Rodionova IA, Li X, Kazanov MD, Ravcheev DA, Gerasimova AV, Kazakov AE, Kovaleva GY, Permina EA, Laikova ON, Overbeek R, Romine MF, Fredrickson JK, Arkin AP, Dubchak I, Osterman AL, Gelfand MS. Comparative genomic reconstruction of transcriptional networks controlling central metabolism in the Shewanella genus. BMC Genomics 2011; 12 Suppl 1:S3. [PMID: 21810205 PMCID: PMC3223726 DOI: 10.1186/1471-2164-12-s1-s3] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Indexed: 11/24/2022] Open
Abstract
Background Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in bacteria is one of the critical tasks of modern genomics. The Shewanella genus comprises metabolically versatile gamma-proteobacteria, whose lifestyles and natural environments are substantially different from Escherichia coli and other model bacterial species. Comparative genomics approaches and computational identification of regulatory sites are useful for the in silico reconstruction of transcriptional regulatory networks in bacteria. Results To explore conservation and variations in the Shewanella transcriptional networks we analyzed the repertoire of transcription factors and performed genomics-based reconstruction and comparative analysis of regulons in 16 Shewanella genomes. The inferred regulatory network includes 82 transcription factors and their DNA binding sites, 8 riboswitches and 6 translational attenuators. Forty-five regulons were newly inferred from the genome context analysis, whereas others were propagated from previously characterized regulons in the Enterobacteria and Pseudomonas spp. Multiple variations in regulatory strategies between the Shewanella spp. and E. coli include regulon contraction and expansion (as in the case of PdhR, HexR, FadR), numerous cases of recruiting non-orthologous regulators to control equivalent pathways (e.g. PsrA for fatty acid degradation) and, conversely, orthologous regulators to control distinct pathways (e.g. TyrR, ArgR, Crp). Conclusions We tentatively defined the first reference collection of ~100 transcriptional regulons in 16 Shewanella genomes. The resulting regulatory network contains ~600 regulated genes per genome that are mostly involved in metabolism of carbohydrates, amino acids, fatty acids, vitamins, metals, and stress responses. Several reconstructed regulons including NagR for N-acetylglucosamine catabolism were experimentally validated in S. oneidensis MR-1. Analysis of correlations in gene expression patterns helps to interpret the reconstructed regulatory network. The inferred regulatory interactions will provide additional regulatory constraints for an integrated model of metabolism and regulation in S. oneidensis MR-1.
Affiliation(s)
- Dmitry A Rodionov
- Sanford-Burnham Medical Research Institute, La Jolla, California, USA.
9
Sahota G, Stormo GD. Novel sequence-based method for identifying transcription factor binding sites in prokaryotic genomes. Bioinformatics 2010; 26:2672-7. [PMID: 20807838 DOI: 10.1093/bioinformatics/btq501] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION Computational techniques for microbial genomic sequence analysis are becoming increasingly important. With next-generation sequencing technology and the human microbiome project underway, current sequencing capacity is significantly greater than the speed at which organisms of interest can be studied experimentally. Most related computational work has been focused on sequence assembly, gene annotation and metabolic network reconstruction. We have developed a method that will primarily use available sequence data in order to determine prokaryotic transcription factor (TF) binding specificities. RESULTS Specificity determining residues (critical residues) were identified from crystal structures of DNA-protein complexes and TFs with the same critical residues were grouped into specificity classes. The putative binding regions for each class were defined as the set of promoters for each TF itself (autoregulatory) and the immediately upstream and downstream operons. MEME was used to find putative motifs within each separate class. Tests on the LacI and TetR TF families, using RegulonDB annotated sites, showed prediction sensitivities of 86% and 80%, respectively. AVAILABILITY http://ural.wustl.edu/~gsahota/HTHmotif/
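Grouping TFs into specificity classes by shared critical residues can be pictured with the short Python sketch below. It is only an illustration of the grouping step, not the authors' method: the input format (pre-aligned DNA-binding domain sequences), the function name `specificity_classes`, and the example positions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def specificity_classes(aligned_tfs: Dict[str, str], critical_positions: Sequence[int]) -> Dict[str, List[str]]:
    """Group TFs whose aligned sequences carry identical residues at the critical positions.

    aligned_tfs maps a TF name to its aligned domain sequence (hypothetical input);
    critical_positions are 0-based alignment columns assumed to determine specificity.
    """
    classes: Dict[str, List[str]] = defaultdict(list)
    for tf_name, aligned_seq in aligned_tfs.items():
        key = "".join(aligned_seq[i] for i in critical_positions)
        classes[key].append(tf_name)
    return dict(classes)
```

Each resulting class would then contribute its members' promoter regions as one input set for motif discovery, which is why the choice of critical positions matters more than the grouping code itself.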
Affiliation(s)
- Gurmukh Sahota
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63108, USA
10
Zhang S, Li S, Pham PT, Su Z. Simultaneous prediction of transcription factor binding sites in a group of prokaryotic genomes. BMC Bioinformatics 2010; 11:397. [PMID: 20653963 PMCID: PMC2920276 DOI: 10.1186/1471-2105-11-397] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/12/2010] [Accepted: 07/23/2010] [Indexed: 11/24/2022] Open
Abstract
Background Our current understanding of transcription factor binding sites (TFBSs) in sequenced prokaryotic genomes is very limited due to the lack of an accurate and efficient computational method for the prediction of TFBSs at a genome scale. In an attempt to change this situation, we have recently developed a comparative genomics based algorithm called GLECLUBS for de novo genome-wide prediction of TFBSs in a target genome. Although GLECLUBS has achieved rather high prediction accuracy of TFBSs in a target genome, it is still not efficient enough to be applied to all the sequenced prokaryotic genomes. Results Here, we designed a new algorithm based on GLECLUBS called extended GLECLUBS (eGLECLUBS) for simultaneous prediction of TFBSs in a group of related prokaryotic genomes. When tested on a group of γ-proteobacterial genomes including E. coli K12, a group of firmicutes genomes including B. subtilis and a group of cyanobacterial genomes using the same parameter settings, eGLECLUBS predicts more than 82% of known TFBSs in extracted inter-operonic sequences in both E. coli K12 and B. subtilis. Because each genome in a group is equally treated, it is highly likely that similar prediction accuracy has been achieved for each genome in the group. Conclusions We have developed a new algorithm for genome-wide de novo prediction of TFBSs in a group of related prokaryotic genomes. The algorithm has achieved the same level of accuracy and robustness as its predecessor GLECLUBS, but can work on dozens of genomes at the same time.
Affiliation(s)
- Shaoqiang Zhang
- Department of Bioinformatics and Genomics, Center for Bioinformatics Research, the University of North Carolina at Charlotte, Charlotte, NC 28223, USA
11
Karpinets TV, Romine MF, Schmoyer DD, Kora GH, Syed MH, Leuze MR, Serres MH, Park BH, Samatova NF, Uberbacher EC. Shewanella knowledgebase: integration of the experimental data and computational predictions suggests a biological role for transcription of intergenic regions. Database: The Journal of Biological Databases and Curation 2010; 2010:baq012. [PMID: 20627862 PMCID: PMC2911847 DOI: 10.1093/database/baq012] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
Shewanellae are facultative γ-proteobacteria whose remarkable respiratory versatility has resulted in interest in their utility for bioremediation of heavy metals and radionuclides and for energy generation in microbial fuel cells. Extensive experimental efforts over the last several years and the availability of 21 sequenced Shewanella genomes made it possible to collect and integrate a wealth of information on the genus into one public resource providing new avenues for making biological discoveries and for developing a system level understanding of the cellular processes. The Shewanella knowledgebase was established in 2005 to provide a framework for integrated genome-based studies on Shewanella ecophysiology. The present version of the knowledgebase provides access to a diverse set of experimental and genomic data along with tools for curation of genome annotations and visualization and integration of genomic data with experimental data. As a demonstration of the utility of this resource, we examined a single microarray data set from Shewanella oneidensis MR-1 for new insights into regulatory processes. The integrated analysis of the data predicted a new type of bacterial transcriptional regulation involving co-transcription of the intergenic region with the downstream gene and suggested a biological role for co-transcription that likely prevents the binding of a regulator of the upstream gene to the regulator binding site located in the intergenic region. Database URL: http://shewanella-knowledgebase.org:8080/Shewanella/ or http://spruce.ornl.gov:8080/Shewanella/
12
Zhang S, Xu M, Li S, Su Z. Genome-wide de novo prediction of cis-regulatory binding sites in prokaryotes. Nucleic Acids Res 2009; 37:e72. [PMID: 19383880 PMCID: PMC2691844 DOI: 10.1093/nar/gkp248] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 01/17/2023] Open
Abstract
Although cis-regulatory binding sites (CRBSs) are at least as important as the coding sequences in a genome, our general understanding of them in most sequenced genomes is very limited due to the lack of efficient and accurate experimental and computational methods for their characterization, which has largely hindered our understanding of many important biological processes. In this article, we describe a novel algorithm for genome-wide de novo prediction of CRBSs with high accuracy. We designed our algorithm to circumvent three identified difficulties for CRBS prediction using comparative genomics principles based on a new method for the selection of reference genomes, a new metric for measuring the similarity of CRBSs, and a new graph clustering procedure. When operon structures are correctly predicted, our algorithm can predict 81% of known individual binding sites belonging to 94% of known cis-regulatory motifs in the Escherichia coli K12 genome, while achieving high prediction specificity. Our algorithm has also achieved similar prediction accuracy in the Bacillus subtilis genome, suggesting that it is very robust, and thus can be applied to any other sequenced prokaryotic genome. When compared with the prior state-of-the-art algorithms, our algorithm outperforms them in both prediction sensitivity and specificity.
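Figures such as "81% of known individual binding sites" are recovery rates of predictions against a curated site set. The Python sketch below shows one way such numbers could be computed; the overlap criterion, the interval representation, and the use of positive predictive value as a stand-in for "specificity" are assumptions for illustration, not the evaluation protocol of the paper.

```python
from typing import List, Tuple

Site = Tuple[int, int]  # (start, end), inclusive coordinates on a single sequence


def overlaps(a: Site, b: Site, min_overlap: int = 1) -> bool:
    """True if intervals a and b share at least min_overlap positions."""
    return min(a[1], b[1]) - max(a[0], b[0]) + 1 >= min_overlap


def sensitivity_and_ppv(known: List[Site], predicted: List[Site], min_overlap: int = 1) -> Tuple[float, float]:
    """Fraction of known sites recovered, and fraction of predictions supported by a known site."""
    recovered = sum(any(overlaps(k, p, min_overlap) for p in predicted) for k in known)
    supported = sum(any(overlaps(p, k, min_overlap) for k in known) for p in predicted)
    sensitivity = recovered / len(known) if known else 0.0
    ppv = supported / len(predicted) if predicted else 0.0
    return sensitivity, ppv
```

Because known site collections are incomplete, an unsupported prediction is not necessarily a false positive, so the second number is best read as a lower bound.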
Affiliation(s)
- Shaoqiang Zhang
- Department of Bioinformatics and Genomics, Bioinformatics Research Center, the University of North Carolina at Charlotte, Charlotte, NC 28223, USA
|
13
|
Xu X, Ji Y, Stormo GD. Discovering cis-regulatory RNAs in Shewanella genomes by Support Vector Machines. PLoS Comput Biol 2009; 5:e1000338. [PMID: 19343219 PMCID: PMC2659441 DOI: 10.1371/journal.pcbi.1000338] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 08/18/2008] [Accepted: 02/24/2009] [Indexed: 12/31/2022]
Abstract
An increasing number of cis-regulatory RNA elements have been found to regulate gene expression post-transcriptionally in various biological processes in bacterial systems. Effective computational tools for large-scale identification of novel regulatory RNAs are strongly desired to facilitate our exploration of gene regulation mechanisms and regulatory networks. We present a new computational program named RSSVM (RNA Sampler+Support Vector Machine), which employs Support Vector Machines (SVMs) for efficient identification of functional RNA motifs from random RNA secondary structures. RSSVM uses a set of distinctive features to represent the common RNA secondary structure and structural alignment predicted by RNA Sampler, a tool for accurate common RNA secondary structure prediction, and is trained with functional RNAs from a variety of bacterial RNA motif/gene families covering a wide range of sequence identities. When tested on a large number of known and random RNA motifs, RSSVM shows a significantly higher sensitivity than other leading RNA identification programs while maintaining the same false positive rate. RSSVM performs particularly well on sets with low sequence identities. The combination of RNA Sampler and RSSVM provides a new, fast, and efficient pipeline for large-scale discovery of regulatory RNA motifs. We applied RSSVM to multiple Shewanella genomes and identified putative regulatory RNA motifs in the 5′ untranslated regions (UTRs) in S. oneidensis, an important bacterial organism with extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. From 1002 sets of 5′-UTRs of orthologous operons, we identified 166 putative regulatory RNA motifs, including 17 of the 19 known RNA motifs from Rfam, an additional 21 RNA motifs that are supported by literature evidence, 72 RNA motifs overlapping predicted transcription terminators or attenuators, and other candidate regulatory RNA motifs. 
Our study provides a list of promising novel regulatory RNA motifs potentially involved in post-transcriptional gene regulation. Combined with the previous cis-regulatory DNA motif study in S. oneidensis, this genome-wide discovery of cis-regulatory RNA motifs may offer more comprehensive views of gene regulation at a different level in this organism. The RSSVM software, predictions, and analysis results on Shewanella genomes are available at http://ural.wustl.edu/resources.html#RSSVM.
RNA is remarkably versatile, acting not only as messengers to transfer genetic information from DNA to protein but also as critical structural components and catalytic enzymes in the cell. More intriguingly, RNA elements in messenger RNAs have been widely found in bacteria to control the expression of their downstream genes. The functions of these RNA elements are intrinsically linked to their secondary structures, which are usually conserved across multiple closely related species during evolution and often shared by genes in the same metabolic pathways. We developed a new computational approach to find putative functional RNA elements by looking for conserved RNA secondary structures that are distinguished from random RNA secondary structures in the orthologous RNA sequences from related species. We applied this approach to multiple Shewanella genomes and predicted putative regulatory RNA elements in Shewanella oneidensis, a bacterium that has extraordinary respiratory and metal reducing abilities and great potential for bioremediation and alternative energy generation. Our findings not only recovered many RNA elements that are known or supported by literature evidence but also included exciting novel RNA elements for further exploration.
Affiliation(s)
- Xing Xu
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Yongmei Ji
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Gary D. Stormo
- Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
|