Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wang H, Benham CJ. Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress. BMC Bioinformatics 2006;7:248. [PMID: 16677393 PMCID: PMC1468432 DOI: 10.1186/1471-2105-7-248] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2006] [Accepted: 05/05/2006] [Indexed: 11/10/2022] Open

Cited by Other Article(s)

Benvenuti JL, Casa PL, Pessi de Abreu F, Martinez GS, de Avila E Silva S. From straight to curved: A historical perspective of DNA shape. Prog Biophys Mol Biol 2024;193:46-54. [PMID: 39260792 DOI: 10.1016/j.pbiomolbio.2024.09.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/13/2024] [Revised: 07/30/2024] [Accepted: 09/04/2024] [Indexed: 09/13/2024]

Coppens L, Wicke L, Lavigne R. SAPPHIRE.CNN: Implementation of dRNA-seq-driven, species-specific promoter prediction using convolutional neural networks. Comput Struct Biotechnol J 2022;20:4969-4974. [PMID: 36147675 PMCID: PMC9478156 DOI: 10.1016/j.csbj.2022.09.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Revised: 09/03/2022] [Accepted: 09/05/2022] [Indexed: 11/22/2022] Open

Zhang M, Jia C, Li F, Li C, Zhu Y, Akutsu T, Webb GI, Zou Q, Coin LJM, Song J. Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction. Brief Bioinform 2022;23:6502561. [PMID: 35021193 PMCID: PMC8921625 DOI: 10.1093/bib/bbab551] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Revised: 11/12/2021] [Accepted: 11/30/2021] [Indexed: 01/13/2023] Open

Affiliation(s)

Geoffrey I Webb: Department of Data Science and Artificial Intelligence, Monash University, Melbourne, VIC 3800, Australia; Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia

Corresponding authors:
Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia
Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China

Orlov MA, Sorokin AA. DNA sequence, physics, and promoter function: Analysis of high-throughput data On T7 promoter variants activity. J Bioinform Comput Biol 2021;18:2040001. [PMID: 32404013 DOI: 10.1142/s0219720020400016] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Abstract

RNA polymerase/promoter recognition represents a basic problem of molecular biology. Decades of effort have been made in the area, and yet certain challenges persist. The choice of suitable model systems is pivotal for this research. The system of T7 bacteriophage RNA polymerase and the native T7 promoter is an exceptional example for the purpose. Moreover, it has been studied the most and has been successfully applied in biotechnology and bioengineering. Both the structural simplicity and the high specificity of this molecular duo are the reason for this. Despite the highly similar sequences of distinct native T7 promoters, the T7 RNA polymerase is capable of binding the respective promoter in a highly specific and adjustable manner. One explanation is that the process relies primarily on DNA physical properties rather than nucleotide sequence. Here, we address the issue by analyzing the massive dataset recently published by Komura and colleagues. That initial study employed Next Generation Sequencing (NGS) to quantify the activity of promoter variants, including ones with multiple substitutions. In our work, a substantial bias in the simultaneous occurrence of single-nucleotide sequence alterations was found: the highest rate of co-occurrence was observed within the specificity loop of the binding region, and the lowest in the initiation region of the promoter. If both the location and the kind of nucleotides involved in a replacement (both initial and resulting) are taken into consideration, N to A substitutions are the most preferred across the whole 19 b.p.-long sequence. At the same time, N to C substitutions are tolerated only at a crucial position in the recognition loop of the binding region, and N to G substitutions are uniformly the least tolerated. Later in this work, the complete set of variants was split into groups with mutations (1) exclusively in the binding region; (2) exclusively in the melting region; (3) in both regions. Among these three groups, the second comprises extremely few variants (46, versus over one thousand and over six thousand in the other two groups). Yet these are all promoters with substantial to high activity. The second group appeared heterogeneous in primary sequence; indeed, upon further subdivision into above-average and below-average activity subgroups, the first was found to comprise promoters with negligible conservation at the -2 position of the melting region, while the second was hardly conserved in this region at all. When attention is restricted to the perfect consensus sequence of the class III T7 promoter with the -2 nucleotide randomized (all four variants are present in one to several copies in the previously published source dataset), the picture becomes even more pronounced. We therefore suggest that mutations at this position do not cause significant changes in promoter activity. At the same time, such modifications dramatically change the DNA physical properties that were calculated in our study (namely electrostatic potential and propensity to bend). One possible suggestion is that the -2 nucleotide might function as a generic switch; if so, the substitution -2A to -2T has important regulatory consequences. Notably, the -2 b.p. is the most evidently different nucleotide between class II and class III promoters of the T7 genome, and it also distinguishes the class III promoter of the T7 genome from the promoters of its related but reproductively isolated bacteriophage T3. In other words, it appears feasible that a mutation at the -2 nucleotide does not impede promoter activity yet alters its physical properties, thus affecting differential RNA polymerase/promoter interaction.
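The substitution counts discussed in this abstract (which bases are replaced, by what, and where) can be illustrated with a small tally. The following is a minimal sketch, not the authors' pipeline: the function names, the reference string, and the variants are all hypothetical toy values chosen only to show the bookkeeping.

```python
from collections import Counter

def tally_substitutions(reference, variants):
    """Count (position, original_base, new_base) substitutions across variants."""
    counts = Counter()
    for variant in variants:
        if len(variant) != len(reference):
            raise ValueError("variant length must match the reference")
        for pos, (ref_base, new_base) in enumerate(zip(reference, variant)):
            if ref_base != new_base:
                counts[(pos, ref_base, new_base)] += 1
    return counts

def tally_by_target(counts):
    """Collapse the tally to 'N to X' totals, keyed by the resulting base X."""
    by_target = Counter()
    for (_pos, _ref_base, new_base), n in counts.items():
        by_target[new_base] += n
    return by_target

# Toy data, purely illustrative (NOT taken from the study's dataset).
REFERENCE = "ACGTACGT"
VARIANTS = ["ACGAACGT", "AAGTACGT", "ACGTACGA", "ACGTCCGT"]

counts = tally_substitutions(REFERENCE, VARIANTS)
by_target = tally_by_target(counts)
print(by_target)
```

Keeping the position in the key lets the same tally answer both questions raised in the abstract: which replacement types dominate overall, and where in the sequence each type occurs.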

Ma M, Welch RD, Garza AG. The σ⁵⁴ system directly regulates bacterial natural product genes. Sci Rep 2021;11:4771. [PMID: 33637792 PMCID: PMC7910581 DOI: 10.1038/s41598-021-84057-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/17/2020] [Accepted: 02/05/2021] [Indexed: 01/31/2023] Open

Mishra A, Dhanda S, Siwach P, Aggarwal S, Jayaram B. A novel method SEProm for prokaryotic promoter prediction based on DNA structure and energetics. Bioinformatics 2020;36:2375-2384. [PMID: 31909789 DOI: 10.1093/bioinformatics/btz941] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2019] [Revised: 11/08/2019] [Accepted: 01/02/2020] [Indexed: 11/13/2022] Open

Chedin F, Benham CJ. Emerging roles for R-loop structures in the management of topological stress. J Biol Chem 2020;295:4684-4695. [PMID: 32107311 DOI: 10.1074/jbc.rev119.006364] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023] Open

Orlov M, Garanina I, Fisunov GY, Sorokin A. Comparative Analysis of Mycoplasma gallisepticum vlhA Promoters. Front Genet 2018;9:569. [PMID: 30519256 PMCID: PMC6258824 DOI: 10.3389/fgene.2018.00569] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2018] [Accepted: 11/06/2018] [Indexed: 12/15/2022] Open

Abstract

Mycoplasma gallisepticum is an intracellular parasite of the class Mollicutes that affects the respiratory tract of poultry. M. gallisepticum features numerous variable lipoprotein hemagglutinin genes (vlhA) that play a role in immune escape. The vlhA promoters have a set of distinct properties in comparison to promoters of the other genes. The vlhA promoters carry a variable GAA repeat region approximately 40 nts upstream of the transcription start site. The promoters have been considered active only in the presence of exactly 12 GAA repeats. The mechanisms of vlhA expression regulation and GAA number variation have not been described. Here we tried to understand these mechanisms using different computational methods. We conducted a comparative analysis among several M. gallisepticum strains. Nucleotide sequence analysis showed the presence of highly conserved regions flanking the repeated trinucleotides that are not linked to GAA number variation. VlhA genes with 12 GAA repeats and their orthologs in 12 M. gallisepticum strains are more conserved than other vlhA genes and have a narrower GAA number distribution. We conducted a comparative analysis of the physicochemical profiles of M. gallisepticum vlhA and sigma-70 promoters. Stress-induced duplex destabilization (SIDD) profiles showed that the sigma-70 group is characterized by the sharp maxima common to prokaryotic promoters, while vlhA promoters are hardly destabilized, with the region between the GAA repeats and the transcription start site having zero opening probability. Electrostatic potential profiles of vlhA promoters indicate the presence of distinct patterns that appear to govern the initial stages of specific DNA-protein recognition. Open state dynamics profiles of vlhA demonstrate a pattern that might facilitate transcription bubble formation. The data obtained could be the basis for experimental identification of the mechanisms of phase variation in M. gallisepticum.

He W, Jia C, Duan Y, Zou Q. 70ProPred: a predictor for discovering sigma70 promoters based on combining multiple features. BMC Syst Biol 2018;12:44. [PMID: 29745856 PMCID: PMC5998878 DOI: 10.1186/s12918-018-0570-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Ryasik A, Orlov M, Zykova E, Ermak T, Sorokin A. Bacterial promoter prediction: Selection of dynamic and static physical properties of DNA for reliable sequence classification. J Bioinform Comput Biol 2018;16:1840003. [DOI: 10.1142/s0219720018400036] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Umarov RK, Solovyev VV. Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural networks. PLoS One 2017;12:e0171410. [PMID: 28158264 PMCID: PMC5291440 DOI: 10.1371/journal.pone.0171410] [Citation(s) in RCA: 130] [Impact Index Per Article: 18.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2016] [Accepted: 01/20/2017] [Indexed: 11/18/2022] Open

Abstract

Accurate computational identification of promoters remains a challenge as these key DNA regulatory regions have variable structures composed of functional motifs that provide gene-specific initiation of transcription. In this paper we utilize Convolutional Neural Networks (CNN) to analyze sequence characteristics of prokaryotic and eukaryotic promoters and build their predictive models. We trained a similar CNN architecture on promoters of five distant organisms: human, mouse, plant (Arabidopsis), and two bacteria (Escherichia coli and Bacillus subtilis). We found that a CNN trained on the sigma70 subclass of Escherichia coli promoters gives an excellent classification of promoter and non-promoter sequences (Sn = 0.90, Sp = 0.96, CC = 0.84). The CNN model for Bacillus subtilis promoter identification achieves Sn = 0.91, Sp = 0.95, and CC = 0.86. For human, mouse and Arabidopsis promoters we employed CNNs for identification of two well-known promoter classes (TATA and non-TATA promoters). The CNN models recognize these complex functional regions well. For human promoters the Sn/Sp/CC accuracy of prediction reached 0.95/0.98/0.90 on TATA and 0.90/0.98/0.89 on non-TATA promoter sequences, respectively. For Arabidopsis we observed Sn/Sp/CC of 0.95/0.97/0.91 (TATA) and 0.94/0.94/0.86 (non-TATA). Thus, the developed CNN models, implemented in the CNNProm program, demonstrated the ability of the deep learning approach to grasp complex promoter sequence characteristics and achieve significantly higher accuracy compared to previously developed promoter prediction programs. We also propose a random substitution procedure to discover positionally conserved promoter functional elements. As the suggested approach does not require knowledge of any specific promoter features, it can be easily extended to identify promoters and other complex functional regions in sequences of many other and especially newly sequenced genomes. The CNNProm program is available to run at the web server http://www.softberry.com.
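The Sn, Sp, and CC figures quoted in this abstract are computed from a confusion matrix. The sketch below assumes the standard definitions: sensitivity as TP/(TP+FN), specificity as TN/(TN+FP), and CC as the Matthews correlation coefficient. The abstract does not spell out its formulas, so these definitions, the function name, and the example counts are assumptions for illustration only.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, Matthews correlation coefficient)."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sn = tp / (tp + fn)  # fraction of true promoters that were recovered
    sp = tn / (tn + fp)  # fraction of non-promoters correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is set to 0 when the denominator vanishes.
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, cc

# Hypothetical counts: 100 promoters and 100 non-promoters.
sn, sp, cc = classification_metrics(tp=90, tn=96, fp=4, fn=10)
print(f"Sn={sn:.2f} Sp={sp:.2f} CC={cc:.2f}")
```

Unlike Sn and Sp, the correlation coefficient depends on the ratio of positive to negative examples, so identical Sn/Sp pairs can yield different CC values on differently balanced test sets.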

Nikolic M, Stankovic T, Djordjevic M. Contribution of bacterial promoter elements to transcription start site detection accuracy. J Bioinform Comput Biol 2016;15:1650038. [PMID: 27908222 DOI: 10.1142/s0219720016500384] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Kumar A, Manivelan V, Bansal M. Structural features of DNA are conserved in the promoter region of orthologous genes across different strains of Helicobacter pylori. FEMS Microbiol Lett 2016;363:fnw207. [DOI: 10.1093/femsle/fnw207] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/25/2016] [Indexed: 12/19/2022] Open

Abbas MM, Mohie-Eldin MM, EL-Manzalawy Y. Assessing the effects of data selection and representation on the development of reliable E. coli sigma 70 promoter region predictors. PLoS One 2015;10:e0119721. [PMID: 25803493 PMCID: PMC4372424 DOI: 10.1371/journal.pone.0119721] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2014] [Accepted: 01/26/2015] [Indexed: 11/27/2022] Open

Abstract

As the number of sequenced bacterial genomes increases, the need for rapid and reliable tools for the annotation of functional elements (e.g., transcriptional regulatory elements) grows. Promoters are the key regulatory elements, which recruit the transcriptional machinery through binding to a variety of regulatory proteins (known as sigma factors). The identification of promoter regions is very challenging because these regions do not adhere to specific sequence patterns or motifs and are difficult to determine experimentally. Machine learning represents a promising and cost-effective approach for computational identification of prokaryotic promoter regions. However, the quality of the predictors depends on several factors, including: i) training data; ii) data representation; iii) classification algorithms; iv) evaluation procedures. In this work, we create several variants of E. coli promoter data sets and utilize them to experimentally examine the effect of these factors on the predictive performance of E. coli σ70 promoter models. Our results suggest that under some combinations of the first three criteria, a prediction model might perform very well in cross-validation experiments while its performance on independent test data is drastically poorer. This emphasizes the importance of evaluating promoter region predictors using independent test data, which corrects for the over-optimistic performance that might be estimated using the cross-validation procedure. Our analysis of the tested models shows that good prediction models often perform well regardless of how the non-promoter data was obtained. On the other hand, poor prediction models seem to be more sensitive to the choice of non-promoter sequences. Interestingly, the best performing sequence-based classifiers outperform the best performing structure-based classifiers in both cross-validation and independent test performance evaluation experiments. Finally, we propose a meta-predictor method combining two top performing sequence-based and structure-based classifiers and compare its performance with some of the state-of-the-art E. coli σ70 promoter prediction methods.

Lloréns-Rico V, Lluch-Senar M, Serrano L. Distinguishing between productive and abortive promoters using a random forest classifier in Mycoplasma pneumoniae. Nucleic Acids Res 2015;43:3442-53. [PMID: 25779052 PMCID: PMC4402517 DOI: 10.1093/nar/gkv170] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2014] [Accepted: 02/22/2015] [Indexed: 12/01/2022] Open

Notari DL, Molin A, Davanzo V, Picolotto D, Ribeiro HG, Silva SDAE. IntergenicDB: a database for intergenic sequences. Bioinformation 2014;10:381-3. [PMID: 25097383 PMCID: PMC4110431 DOI: 10.6026/97320630010381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2014] [Accepted: 05/24/2014] [Indexed: 12/01/2022] Open

Meysman P, Collado-Vides J, Morett E, Viola R, Engelen K, Laukens K. Structural properties of prokaryotic promoter regions correlate with functional features. PLoS One 2014;9:e88717. [PMID: 24516674 PMCID: PMC3918002 DOI: 10.1371/journal.pone.0088717] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2013] [Accepted: 01/10/2014] [Indexed: 12/31/2022] Open

Bansal M, Kumar A, Yella VR. Role of DNA sequence based structural features of promoters in transcription initiation and gene expression. Curr Opin Struct Biol 2014;25:77-85. [PMID: 24503515 DOI: 10.1016/j.sbi.2014.01.007] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2013] [Accepted: 01/07/2014] [Indexed: 11/18/2022]

Meysman P, Marchal K, Engelen K. DNA structural properties in the classification of genomic transcription regulation elements. Bioinform Biol Insights 2012;6:155-68. [PMID: 22837642 PMCID: PMC3399529 DOI: 10.4137/bbi.s9426] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open

Redefining Escherichia coli σ(70) promoter elements: -15 motif as a complement of the -10 motif. J Bacteriol 2011;193:6305-14. [PMID: 21908667 DOI: 10.1128/jb.05947-11] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Rangannan V, Bansal M. PromBase: a web resource for various genomic features and predicted promoters in prokaryotic genomes. BMC Res Notes 2011;4:257. [PMID: 21781326 PMCID: PMC3160392 DOI: 10.1186/1756-0500-4-257] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2011] [Accepted: 07/22/2011] [Indexed: 12/19/2022] Open

Abstract

Background

As more and more genomes are being sequenced, an overview of their genomic features and annotation of their functional elements, which control the expression of each gene or transcription unit of the genome, is a fundamental challenge in genomics and bioinformatics.

Findings

Relative stability of DNA sequence has been used to predict promoter regions in 913 microbial genomic sequences with GC-content ranging from 16.6% to 74.9%. Irrespective of the genome GC-content the relative stability based promoter prediction method has already been proven to be robust in terms of recall and precision. The predicted promoter regions for the 913 microbial genomes have been accumulated in a database called PromBase. Promoter search can be carried out in PromBase either by specifying the gene name or the genomic position. Each predicted promoter region has been assigned to a reliability class (low, medium, high, very high and highest) based on the difference between its average free energy and the downstream region. The recall and precision values for each class are shown graphically in PromBase. In addition, PromBase provides detailed information about base composition, CDS and CG/TA skews for each genome and various DNA sequence dependent structural properties (average free energy, curvature and bendability) in the vicinity of all annotated translation start sites (TLS).

Conclusion

PromBase is a database that contains predicted promoter regions and a detailed analysis of various genomic features for 913 microbial genomes. PromBase can serve as a valuable resource for comparative genomics studies and help experimentalists rapidly access detailed information on various genomic features and putative promoter regions in any given genome. This database is freely accessible to academic and non-academic users via the World Wide Web at http://nucleix.mbu.iisc.ernet.in/prombase/.
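Among the genomic features listed in this abstract are base-composition skews. The sketch below shows one way to compute them over fixed windows. It assumes the common convention (G − C)/(G + C) and (A − T)/(A + T); the sign convention and window sizes used by PromBase are not given in the abstract, so the function names and parameters here are hypothetical.

```python
def skew(seq, plus, minus):
    """Return (plus - minus) / (plus + minus) for two bases, or 0.0 if both are absent."""
    p = seq.count(plus)
    m = seq.count(minus)
    return (p - m) / (p + m) if (p + m) else 0.0

def windowed_skews(seq, window, step):
    """Return a list of (start, gc_skew, at_skew) tuples over sliding windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        results.append((start, skew(chunk, "G", "C"), skew(chunk, "A", "T")))
    return results

# Toy sequence, purely illustrative.
profile = windowed_skews("GGGCAAAT", window=4, step=4)
for start, gc_skew, at_skew in profile:
    print(start, gc_skew, at_skew)
```

Real genome-scale use would typically read sequences from a FASTA file and use windows of hundreds to thousands of bases; the tiny window here only keeps the arithmetic easy to follow.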

Sershen CL, Mell JC, Madden SM, Benham CJ. Superhelical duplex destabilization and the recombination position effect. PLoS One 2011;6:e20798. [PMID: 21695263 PMCID: PMC3111454 DOI: 10.1371/journal.pone.0020798] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2010] [Accepted: 05/12/2011] [Indexed: 11/19/2022] Open

Herbig A, Nieselt K. nocoRNAc: characterization of non-coding RNAs in prokaryotes. BMC Bioinformatics 2011;12:40. [PMID: 21281482 PMCID: PMC3230914 DOI: 10.1186/1471-2105-12-40] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2010] [Accepted: 01/31/2011] [Indexed: 11/10/2022] Open

Abstract

Background

The interest in non-coding RNAs (ncRNAs) constantly rose during the past few years because of the wide spectrum of biological processes in which they are involved. This led to the discovery of numerous ncRNA genes across many species. However, for most organisms the non-coding transcriptome still remains unexplored to a great extent. Various experimental techniques for the identification of ncRNA transcripts are available, but as these methods are costly and time-consuming, there is a need for computational methods that allow the detection of functional RNAs in complete genomes in order to suggest elements for further experiments. Several programs for the genome-wide prediction of functional RNAs have been developed but most of them predict a genomic locus with no indication whether the element is transcribed or not.

Results

We present NOCORNAc, a program for the genome-wide prediction of ncRNA transcripts in bacteria. NOCORNAc incorporates various procedures for the detection of transcriptional features, which are then integrated with functional ncRNA loci to determine the transcript coordinates. We applied RNAz and NOCORNAc to the genome of Streptomyces coelicolor and detected more than 800 putative ncRNA transcripts, most of them located antisense to protein-coding regions. Using a custom-designed microarray, we profiled the expression of about 400 of these elements and found more than 300 to be transcribed; 38 of them are predicted novel ncRNA genes in intergenic regions. The expression patterns of many ncRNAs are as complex as those of the protein-coding genes; in particular, many antisense ncRNAs show a high expression correlation with their protein-coding partner.

Conclusions

We have developed NOCORNAc, a framework that facilitates the automated characterization of functional ncRNAs. NOCORNAc increases the confidence of predicted ncRNA loci, especially if they contain transcribed ncRNAs. NOCORNAc is not restricted to intergenic regions, but it is applicable to the prediction of ncRNA transcripts in whole microbial genomes. The software as well as a user guide and example data is available at http://www.zbit.uni-tuebingen.de/pas/nocornac.htm.

Apostolaki A, Kalosakas G. Targets of DNA-binding proteins in bacterial promoter regions present enhanced probabilities for spontaneous thermal openings. Phys Biol 2011;8:026006. [DOI: 10.1088/1478-3975/8/2/026006] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Identification of TATA and TATA-less promoters in plant genomes by integrating diversity measure, GC-Skew and DNA geometric flexibility. Genomics 2010;97:112-20. [PMID: 21112384 DOI: 10.1016/j.ygeno.2010.11.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2010] [Revised: 11/05/2010] [Accepted: 11/12/2010] [Indexed: 11/20/2022]

Eukaryotic and prokaryotic promoter prediction using hybrid approach. Theory Biosci 2010;130:91-100. [DOI: 10.1007/s12064-010-0114-8] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2010] [Accepted: 10/23/2010] [Indexed: 12/27/2022]

Rangannan V, Bansal M. High-quality annotation of promoter regions for 913 bacterial genomes. Bioinformatics 2010;26:3043-50. [PMID: 20956245 DOI: 10.1093/bioinformatics/btq577] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]

Bland C, Newsome AS, Markovets AA. Promoter prediction in E. coli based on SIDD profiles and Artificial Neural Networks. BMC Bioinformatics 2010;11 Suppl 6:S17. [PMID: 20946600 PMCID: PMC3026364 DOI: 10.1186/1471-2105-11-s6-s17] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Vollbrecht E, Duvick J, Schares JP, Ahern KR, Deewatthanawong P, Xu L, Conrad LJ, Kikuchi K, Kubinec TA, Hall BD, Weeks R, Unger-Wallace E, Muszynski M, Brendel VP, Brutnell TP. Genome-wide distribution of transposed Dissociation elements in maize. Plant Cell 2010;22:1667-85. [PMID: 20581308 PMCID: PMC2910982 DOI: 10.1105/tpc.109.073452] [Citation(s) in RCA: 92] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/12/2009] [Revised: 04/09/2010] [Accepted: 06/09/2010] [Indexed: 05/18/2023]

Dineen DG, Wilm A, Cunningham P, Higgins DG. High DNA melting temperature predicts transcription start site location in human and mouse. Nucleic Acids Res 2010;37:7360-7. [PMID: 19820114 PMCID: PMC2794178 DOI: 10.1093/nar/gkp821] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open

Mallios RR, Ojcius DM, Ardell DH. An iterative strategy combining biophysical criteria and duration hidden Markov models for structural predictions of Chlamydia trachomatis sigma66 promoters. BMC Bioinformatics 2009;10:271. [PMID: 19715597 PMCID: PMC2743672 DOI: 10.1186/1471-2105-10-271] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2009] [Accepted: 08/28/2009] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Promoter identification is a first step in the quest to explain gene regulation in bacteria. It has been demonstrated that the initiation of bacterial transcription depends upon the stability and topology of DNA in the promoter region as well as the binding affinity between the RNA polymerase sigma-factor and promoter. However, promoter prediction algorithms to date have not explicitly used an ensemble of these factors as predictors. In addition, most promoter models have been trained on data from Escherichia coli. Although it has been shown that transcriptional mechanisms are similar among various bacteria, it is quite possible that the differences between Escherichia coli and Chlamydia trachomatis are large enough to recommend an organism-specific modeling effort.

RESULTS

Here we present an iterative stochastic model building procedure that combines such biophysical metrics as DNA stability, curvature, twist and stress-induced DNA duplex destabilization along with duration hidden Markov model parameters to model Chlamydia trachomatis sigma66 promoters from 29 experimentally verified sequences. Initially, iterative duration hidden Markov modeling of the training set sequences provides a scoring algorithm for Chlamydia trachomatis RNA polymerase sigma66/DNA binding. Subsequently, an iterative application of Stepwise Binary Logistic Regression selects multiple promoter predictors and deletes/replaces training set sequences to determine an optimal training set. The resulting model predicts the final training set with a high degree of accuracy and provides insights into the structure of the promoter region. Model based genome-wide predictions are provided so that optimal promoter candidates can be experimentally evaluated, and refined models developed. Co-predictions with three other algorithms are also supplied to enhance reliability.

CONCLUSION

This strategy and resulting model support the conjecture that DNA biophysical properties, along with RNA polymerase sigma-factor/DNA binding, collaboratively contribute to a sequence's ability to promote transcription. This work provides a baseline model that can evolve as new Chlamydia trachomatis sigma66 promoters are identified with assistance from the provided genome-wide predictions. The proposed methodology is ideal for organisms with few identified promoters and relatively small genomes.

Shavkunov KS, Masulis IS, Tutukina MN, Deev AA, Ozoline ON. Gains and unexpected lessons from genome-scale promoter mapping. Nucleic Acids Res 2009;37:4919-31. [PMID: 19528070 PMCID: PMC2731890 DOI: 10.1093/nar/gkp490] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Coculescu BI. Antimicrobial resistance induced by genetic changes. J Med Life 2009;2:114-23. [PMID: 20108530 PMCID: PMC3018982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open

Williams JA, Carnes AE, Hodgson CP. Plasmid DNA vaccine vector design: impact on efficacy, safety and upstream production. Biotechnol Adv 2009;27:353-70. [PMID: 19233255 DOI: 10.1016/j.biotechadv.2009.02.003] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2008] [Revised: 02/02/2009] [Accepted: 02/07/2009] [Indexed: 10/21/2022]

Askary A, Masoudi-Nejad A, Sharafi R, Mizbani A, Parizi SN, Purmasjedi M. N4: A precise and highly sensitive promoter predictor using neural network fed by nearest neighbors. Genes Genet Syst 2009;84:425-30. [DOI: 10.1266/ggs.84.425] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open

Rangannan V, Bansal M. Relative stability of DNA as a generic criterion for promoter prediction: whole genome annotation of microbial genomes with varying nucleotide base composition. MOLECULAR BIOSYSTEMS 2009;5:1758-69. [DOI: 10.1039/b906535k] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Dekhtyar M, Morin A, Sakanyan V. Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes. BMC Bioinformatics 2008;9:233. [PMID: 18471287 PMCID: PMC2412878 DOI: 10.1186/1471-2105-9-233] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2007] [Accepted: 05/09/2008] [Indexed: 11/17/2022] Open

Wang Z, Jin L, Węgrzyn G, Węgrzyn A. Screening of the osmotic pressure-inducible promoter regions from the whole genome of Escherichia coli by using a novel cloning method. Biotechnol Lett 2008;30:707-11. [DOI: 10.1007/s10529-007-9583-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/04/2007] [Revised: 10/22/2007] [Accepted: 10/24/2007] [Indexed: 11/27/2022]

Wang H, Benham CJ. Superhelical destabilization in regulatory regions of stress response genes. PLoS Comput Biol 2008;4:e17. [PMID: 18208321 PMCID: PMC2211533 DOI: 10.1371/journal.pcbi.0040017] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2007] [Accepted: 12/03/2007] [Indexed: 11/18/2022] Open

Abstract

Stress-induced DNA duplex destabilization (SIDD) analysis exploits the known structural and energetic properties of DNA to predict sites that are susceptible to strand separation under negative superhelical stress. When this approach was used to calculate the SIDD profile of the entire Escherichia coli K12 genome, it was found that strongly destabilized sites occur preferentially in intergenic regions that are either known or inferred to contain promoters, but rarely occur in coding regions. Here, we investigate whether the genes grouped in different functional categories have characteristic SIDD properties in their upstream flanks. We report that strong SIDD sites in the E. coli K12 genome are statistically significantly overrepresented in the upstream regions of genes encoding transcriptional regulators. In particular, the upstream regions of genes that directly respond to physiological and environmental stimuli are more destabilized than are those regions of genes that are not involved in these responses. Moreover, if a pathway is controlled by a transcriptional regulator whose gene has a destabilized 5′ flank, then the genes (operons) in that pathway also usually contain strongly destabilized SIDD sites in their 5′ flanks. We observe this statistically significant association of SIDD sites with upstream regions of genes functioning in transcription in 38 of 43 genomes of free-living bacteria, but in only four of 18 genomes of endosymbionts or obligate parasitic bacteria. These results suggest that strong SIDD sites 5′ to participating genes may be involved in transcriptional responses to environmental changes, which are known to transiently alter superhelicity. We propose that these SIDD sites are active and necessary participants in superhelically mediated regulatory mechanisms governing changes in the global pattern of gene expression in prokaryotes in response to physiological or environmental changes.

DNA in vivo experiences regulated amounts of untwisting stress. If sufficiently large, these stresses can destabilize the double helix at specific locations. These sites then become favored locations for strand separations. Gene expression and DNA replication, the two major jobs of DNA, both require the strands of the duplex to be separated. Thus, events that affect the ease of strand separation can regulate the initiation of these processes. Stress-induced DNA duplex destabilization (SIDD) has been implicated in mechanisms regulating several biological processes, including the initiation of gene expression and replication. We have developed computational methods that accurately predict the locations and extents of destabilization within genomic DNA sequences that occur in response to specified stress levels. Here, we report that the easily destabilized sites we find in the Escherichia coli K12 genome are statistically significantly overrepresented in the upstream regions of genes encoding proteins that regulate transcription. In particular, the regions upstream of genes that directly respond to physiological and environmental stimuli are more destabilized than are those regions of genes that are not involved in these responses. These results suggest that strong SIDD sites upstream of participating genes may be involved in transcriptional responses to environmental changes.

Abeel T, Saeys Y, Bonnet E, Rouzé P, Van de Peer Y. Generic eukaryotic core promoter prediction using structural features of DNA. Genome Res 2008;18:310-23. [PMID: 18096745 PMCID: PMC2203629 DOI: 10.1101/gr.6991408] [Citation(s) in RCA: 133] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2007] [Accepted: 11/14/2007] [Indexed: 11/24/2022]

Janga SC, Collado-Vides J. Structure and evolution of gene regulatory networks in microbial genomes. Res Microbiol 2007;158:787-94. [PMID: 17996425 DOI: 10.1016/j.resmic.2007.09.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2007] [Revised: 08/07/2007] [Accepted: 09/17/2007] [Indexed: 12/24/2022]

Liu F, Tøstesen E, Sundet JK, Jenssen TK, Bock C, Jerstad GI, Thilly WG, Hovig E. The human genomic melting map. PLoS Comput Biol 2007;3:e93. [PMID: 17511513 PMCID: PMC1868775 DOI: 10.1371/journal.pcbi.0030093] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2006] [Accepted: 04/11/2007] [Indexed: 11/19/2022] Open