1
Meydan S, Klepacki D, Mankin AS, Vázquez-Laslop N. Identification of Translation Start Sites in Bacterial Genomes. Methods Mol Biol 2021; 2252:27-55. [PMID: 33765270 DOI: 10.1007/978-1-0716-1150-0_2] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
Knowledge of translation start sites is crucial for the annotation of genes in bacterial genomes. However, systematic mapping of start codons in bacterial genes has mainly relied on predictions based on protein conservation and mRNA sequence features which, although useful, are not always accurate. We recently found that the pleuromutilin antibiotic retapamulin (RET) is a specific inhibitor of translation initiation that traps ribosomes at start codons, and we used it in combination with ribosome profiling to map start codons in the Escherichia coli genome. This genome-wide strategy, named Ribo-RET, not only verifies the position of start codons in already annotated genes but also enables identification of previously unannotated open reading frames and reveals the presence of internal start sites within genes. Here, we provide a detailed Ribo-RET protocol for E. coli. Ribo-RET can be adapted for mapping the start codons of protein-coding sequences in a variety of bacterial species.
Affiliation(s)
- Sezen Meydan
  - National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
- Dorota Klepacki
  - Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
- Alexander S Mankin
  - Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA
- Nora Vázquez-Laslop
  - Center for Biomolecular Sciences, University of Illinois at Chicago, Chicago, IL, USA

2
Retapamulin-Assisted Ribosome Profiling Reveals the Alternative Bacterial Proteome. Mol Cell 2019; 74:481-493.e6. [PMID: 30904393 DOI: 10.1016/j.molcel.2019.02.017] [Citation(s) in RCA: 101] [Impact Index Per Article: 20.2] [Received: 11/19/2018] [Revised: 01/25/2019] [Accepted: 02/12/2019] [Indexed: 12/21/2022]
Abstract
The use of alternative translation initiation sites enables production of more than one protein from a single gene, thereby expanding the cellular proteome. Although several such examples have been serendipitously found in bacteria, genome-wide mapping of alternative translation start sites has been unattainable. We found that the antibiotic retapamulin specifically arrests initiating ribosomes at start codons of the genes. Retapamulin-enhanced Ribo-seq analysis (Ribo-RET) not only allowed mapping of conventional initiation sites at the beginning of the genes, but strikingly, it also revealed putative internal start sites in a number of Escherichia coli genes. Experiments demonstrated that the internal start codons can be recognized by the ribosomes and direct translation initiation in vitro and in vivo. Proteins, whose synthesis is initiated at internal in-frame and out-of-frame start sites, can be functionally important and contribute to the "alternative" bacterial proteome. The internal start sites may also play regulatory roles in gene expression.

3
Adalat R, Saleem F, Bashir A, Ahmad M, Zulfiqar S, Shakoori AR. Multiple upstream start codons (AUG) in 5' untranslated region enhance translation efficiency of cry2Ac11 without helper protein. J Cell Biochem 2019; 120:2236-2250. [PMID: 30242865 DOI: 10.1002/jcb.27534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/29/2018] [Accepted: 08/01/2018] [Indexed: 01/24/2023]
Abstract
Cry2Ac11, a 65 kDa insecticidal protein produced by Bacillus thuringiensis, shows toxicity against dipteran and lepidopteran larvae. It is encoded by the cry2Ac11 gene (orf3), which is part of an operon comprising orf1, orf2, and orf3. Orf2, a helper protein, helps in proper folding and prevents aberrant aggregation of newly produced molecules. In this study, we have elucidated the effect of different mutations in the translation initiation region (TIR), particularly the ribosomal binding site and the start codon (RBS-ATG), on cry2Ac11 gene expression without the helper protein. All recombinant constructs were expressed in acrystalliferous B. thuringiensis subsp israelensis 4Q7 under the control of the strong chimeric promoter cyt1AP/STAB. Of all the mutants, mut/RBS2, with two consecutive AUGs after the spacer region in the TIR, exhibited 89- and 2246-fold higher transcript levels compared with 4Q7-operSalI/RBS (cry2Ac11 operon) and 4Q7-w-RBS (cry2Ac11 gene), respectively. The analysis of mut/RBS2 messenger RNA (mRNA) structure in the RBS-AUG region showed the presence of the RBS in the single-stranded part of a moderately stable hairpin loop. The high expression efficiency of the Cry2Ac11 mutant without helper protein is a cumulative and cooperative result of the chimeric promoter cyt1AP/STAB-SD with the optimal context of the RBS-AUG region provided by multiple AUGs and the stabilizer sequence at the 3' ends.
Affiliation(s)
- Rooma Adalat
  - Department of Biotechnology, Lahore College for Women University, Lahore, Pakistan
- Faiza Saleem
  - Department of Biotechnology, Lahore College for Women University, Lahore, Pakistan
- Aftab Bashir
  - Department of Biological Sciences, Forman Christian College, Lahore, Pakistan
- Munir Ahmad
  - School of Biological Sciences, Quaid-i-Azam Campus, University of the Punjab, Lahore, Pakistan
- Soumble Zulfiqar
  - School of Biological Sciences, Quaid-i-Azam Campus, University of the Punjab, Lahore, Pakistan
- Abdul Rauf Shakoori
  - School of Biological Sciences, Quaid-i-Azam Campus, University of the Punjab, Lahore, Pakistan
  - Department of Biochemistry, Faculty of Life Sciences, University of Central Punjab, Lahore, Pakistan

4
Danchin A, Ouzounis C, Tokuyasu T, Zucker JD. No wisdom in the crowd: genome annotation in the era of big data - current status and future prospects. Microb Biotechnol 2018; 11:588-605. [PMID: 29806194 PMCID: PMC6011933 DOI: 10.1111/1751-7915.13284] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 12/15/2022] [Open access]
Abstract
Science and engineering rely on the accumulation and dissemination of knowledge to make discoveries and create new designs. Discovery-driven genome research rests on knowledge passed on via gene annotations. In response to the deluge of sequencing big data, standard annotation practice employs automated procedures that rely on majority rules. We argue this hinders progress through the generation and propagation of errors, leading investigators into blind alleys. More subtly, this inductive process discourages the discovery of novelty, which remains essential in biological research and reflects the nature of biology itself. Annotation systems, rather than being repositories of facts, should be tools that support multiple modes of inference. By combining deduction, induction and abduction, investigators can generate hypotheses when accurate knowledge is extracted from model databases. A key stance is to depart from 'the sequence tells the structure tells the function' fallacy, placing function first. We illustrate our approach with examples of critical or unexpected pathways, using MicroScope to demonstrate how tools can be implemented following the principles we advocate. We end with a challenge to the reader.
Affiliation(s)
- Antoine Danchin
  - Integromics, Institute of Cardiometabolism and Nutrition, Hôpital de la Pitié-Salpêtrière, 47 Boulevard de l'Hôpital, 75013, Paris, France
  - School of Biomedical Sciences, Li KaShing Faculty of Medicine, Hong Kong University, 21 Sassoon Road, Pokfulam, Hong Kong
- Christos Ouzounis
  - Biological Computation and Process Laboratory, Centre for Research and Technology Hellas, Chemical Process and Energy Resources Institute, Thessalonica, 57001, Greece
- Taku Tokuyasu
  - Shenzhen Institutes of Advanced Technology, Institute of Synthetic Biology, Shenzhen University Town, 1068 Xueyuan Avenue, Shenzhen, China
- Jean-Daniel Zucker
  - Integromics, Institute of Cardiometabolism and Nutrition, Hôpital de la Pitié-Salpêtrière, 47 Boulevard de l'Hôpital, 75013, Paris, France

5
Liu Y, Guo J, Hu G, Zhu H. Gene prediction in metagenomic fragments based on the SVM algorithm. BMC Bioinformatics 2013; 14 Suppl 5:S12. [PMID: 23735199 PMCID: PMC3622649 DOI: 10.1186/1471-2105-14-s5-s12] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 11/25/2022] [Open access]
Abstract
Background: Metagenomic sequencing is becoming a powerful technology for exploring microorganisms from various environments, such as the human body, without isolation and cultivation. Accurately identifying genes from metagenomic fragments is one of the most fundamental issues. Results: In this article, we present a novel gene prediction method named MetaGUN for metagenomic fragments, based on a machine learning approach using SVM. It implements a three-stage strategy to predict genes. Firstly, it classifies input fragments into phylogenetic groups by a k-mer based sequence binning method. Then, protein-coding sequences are identified for each group independently with SVM classifiers that integrate entropy density profiles (EDP) of codon usage, translation initiation site (TIS) scores and open reading frame (ORF) length as input patterns. Finally, the TISs are adjusted by employing a modified version of MetaTISA. To identify protein-coding sequences, MetaGUN builds the universal module and the novel module. The former is based on a set of representative species, while the latter is designed to find potential functional DNA sequences with conserved domains. Conclusions: Comparisons on artificial shotgun fragments with multiple current metagenomic gene finders show that MetaGUN predicts better results on both 3' and 5' ends of genes with fragments of various lengths. In particular, it makes the most reliable predictions among these methods. As an application, MetaGUN was used to predict genes for two samples of human gut microbiome. It identifies thousands of additional genes with significant evidence. Further analysis indicates that MetaGUN tends to predict more potential novel genes than other current metagenomic gene finders.
Affiliation(s)
- Yongchu Liu
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing, China
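The first stage named in the abstract above, k-mer based binning, starts from a k-mer composition profile of each fragment. The following is a minimal Python sketch of such a profile only. The function name `kmer_profile`, the default `k`, and the handling of ambiguous bases are illustrative assumptions, not details taken from MetaGUN.

```python
from collections import Counter


def kmer_profile(seq, k=4):
    """Return the relative frequency of each k-mer observed in a DNA fragment.

    Hypothetical helper for illustration: k-mers containing characters
    other than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    kmers = [m for m in kmers if set(m) <= set("ACGT")]
    total = len(kmers)
    if total == 0:
        return {}
    counts = Counter(kmers)
    return {kmer: n / total for kmer, n in counts.items()}


if __name__ == "__main__":
    # Dinucleotide profile of a short toy fragment.
    print(kmer_profile("ACGTACGT", k=2))
```

Profiles of this kind could then be compared between fragments or against reference genomes to assign fragments to groups; how MetaGUN does this is not specified in the abstract.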

6
Gao J, Wang J. Re-annotation of two hyperthermophilic archaea Pyrococcus abyssi GE5 and Pyrococcus furiosus DSM 3638. Curr Microbiol 2011; 64:118-29. [PMID: 22057919 DOI: 10.1007/s00284-011-0035-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/29/2011] [Accepted: 10/04/2011] [Indexed: 10/15/2022]
Abstract
Pyrococcus abyssi GE5 (P. aby) and Pyrococcus furiosus DSM 3638 (P. fur) are two model hyperthermophilic archaea. However, their annotations in public databases are unsatisfactory. In this article, the two genomes were re-annotated according to the following steps. (i) All "hypothetical genes" in the original annotation were re-identified based on the Z-curve method, and some of them were recognized as non-coding open reading frames (ORFs). Evidence showed that the recognized non-coding ORFs were highly unlikely to encode proteins. (ii) The translation initiation sites (TISs) of all the annotated genes were re-located, and more than 10% of the TISs were shifted to 5'-upstream or 3'-downstream regions. (iii) The functions of the refined "hypothetical genes" were predicted using sequence alignment tools; more than 200 originally annotated "hypothetical genes" in either of the two hyperthermophiles were assigned functions. A large number of these functions have reference support or experimentally characterized homologues. All the refined information will serve as a valuable resource for research on P. aby and P. fur, which may be helpful in the exploration of thermal adaptation mechanisms. The complete re-annotation files of P. aby and P. fur are available at http://211.69.128.148/download/ .
Affiliation(s)
- Junxiang Gao
- School of Science, Huazhong Agricultural University, Wuhan 430070, People's Republic of China.

7
Identifying translation initiation sites in prokaryotes using support vector machine. J Theor Biol 2010; 262:644-9. [DOI: 10.1016/j.jtbi.2009.10.023] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/24/2009] [Revised: 10/12/2009] [Accepted: 10/12/2009] [Indexed: 11/17/2022]

8
Luo C, Hu GQ, Zhu H. Genome reannotation of Escherichia coli CFT073 with new insights into virulence. BMC Genomics 2009; 10:552. [PMID: 19930606 PMCID: PMC2785843 DOI: 10.1186/1471-2164-10-552] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 03/25/2009] [Accepted: 11/22/2009] [Indexed: 11/30/2022] [Open access]
Abstract
Background: The genome of uropathogenic Escherichia coli (UPEC) strain CFT073, a human pathogen, was sequenced and published in 2002, which was significant for research on the genomics of pathogenic bacteria. However, the current RefSeq annotation of this pathogen is now outdated to some degree, because some essential genes associated with its virulence are missing or misannotated. We carried out a systematic reannotation by combining automated annotation tools with manual efforts to provide a comprehensive understanding of virulence for the CFT073 genome. Results: The reannotation excluded 608 coding sequences from the RefSeq annotation. Meanwhile, a total of 299 coding sequences were newly added, about one third of which are found in genomic island (GI) regions and more than one fifth in virulence-related pathogenicity islands (PAIs). Furthermore, the translational initiation sites (TISs) of 341 genes were relocated, which resulted in a high-quality gene start annotation. In addition, 94 pseudogenes annotated in RefSeq were thoroughly inspected and updated. The number of miscellaneous genes (sRNAs) has been updated from 6 in RefSeq to 46 in the reannotation. Based on the adjustments in the reannotation, subsequent analyses were conducted through both general and case studies on new virulence factors or new virulence-associated genes that are crucial during the urinary tract infection (UTI) process, including invasion, colonization, nutrient uptake and population density control. Furthermore, miscellaneous RNAs collected in the reannotation are believed to contribute to the virulence of strain CFT073. The reannotation, including the nucleotide data, the original RefSeq annotation, and all reannotated results, is freely available via http://mech.ctb.pku.edu.cn/CFT073/. Conclusion: The reannotation presents a more comprehensive picture of the mechanisms of uropathogenicity of UPEC strain CFT073. The new genes change the view of its uropathogenicity in many respects, particularly the new genes in GI regions and the new virulence-associated factors. The reannotation thus functions as an important source of new information about genomic structure and organization, and gene function. Moreover, we expect that the detailed analysis will facilitate the exploration of novel virulence mechanisms and help guide experimental design.
Affiliation(s)
- Chengwei Luo
- State Key Laboratory for Turbulence and Complex Systems, and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, China

9
Hu GQ, Guo JT, Liu YC, Zhu H. MetaTISA: Metagenomic Translation Initiation Site Annotator for improving gene start prediction. Bioinformatics 2009; 25:1843-5. [PMID: 19389734 DOI: 10.1093/bioinformatics/btp272] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]
Abstract
Summary: We proposed a tool named MetaTISA with the aim of improving the TIS prediction of current gene-finders for metagenomes. The method employs a two-step strategy to predict translation initiation sites (TISs) by first clustering metagenomic fragments into phylogenetic groups and then predicting TISs independently for each group in an unsupervised manner. As evaluated on experimentally verified TISs, MetaTISA greatly improves the accuracy of TIS prediction of current gene-finders. Availability: The C++ source code is freely available under the GNU GPL license via http://mech.ctb.pku.edu.cn/MetaTISA/.
Affiliation(s)
- Gang-Qing Hu
- State Key Laboratory for Turbulence and Complex Systems, Department of Biomedical Engineering and Center for Theoretical Biology, Peking University, Beijing 100871, China

10
Smollett KL, Fivian-Hughes AS, Smith JE, Chang A, Rao T, Davis EO. Experimental determination of translational start sites resolves uncertainties in genomic open reading frame predictions - application to Mycobacterium tuberculosis. Microbiology 2009; 155:186-197. [PMID: 19118359 PMCID: PMC2897130 DOI: 10.1099/mic.0.022889-0] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Indexed: 11/26/2022]
Abstract
Correct identification of translational start sites is important for understanding protein function and transcriptional regulation. The annotated translational start sites contained in genome databases are often predicted using bioinformatics and are rarely verified experimentally, and so are not all accurate. Therefore, we devised a simple approach for determining translational start sites using a combination of epitope tagging and frameshift mutagenesis. This assay was used to determine the start sites of three Mycobacterium tuberculosis proteins: LexA, SigC and Rv1955. We were able to show that proteins may begin before or after the predicted site. We also found that a small, non-annotated open reading frame upstream of Rv1955 was expressed as a protein, which we have designated Rv1954A. This approach is readily applicable to any bacterial species for which plasmid transformation can be achieved.
Affiliation(s)
- Katherine L Smollett
  - Division of Mycobacterial Research, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Amanda S Fivian-Hughes
  - Division of Mycobacterial Research, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Joanne E Smith
  - Division of Mycobacterial Research, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Anchi Chang
  - Division of Mycobacterial Research, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Tara Rao
  - Division of Mycobacterial Research, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Elaine O Davis
  - Division of Mycobacterial Research, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK

11
Warren AS, Setubal JC. The Genome Reverse Compiler: an explorative annotation tool. BMC Bioinformatics 2009; 10:35. [PMID: 19173744 PMCID: PMC2640359 DOI: 10.1186/1471-2105-10-35] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 07/25/2008] [Accepted: 01/27/2009] [Indexed: 11/19/2022] [Open access]
Abstract
Background: As sequencing costs have decreased, whole genome sequencing has become a viable and integral part of biological laboratory research. However, the tools with which genes can be found and functionally characterized have not been readily adapted to be part of the everyday biological sciences toolkit. Most annotation pipelines remain as a service provided by large institutions or come as an unwieldy conglomerate of independent components, each requiring their own setup and maintenance. Results: To address this issue we have created the Genome Reverse Compiler, an easy-to-use, open-source, automated annotation tool. The GRC is independent of third party software installs and only requires a Linux operating system. This stands in contrast to most annotation packages, which typically require installation of relational databases, sequence similarity software, and a number of other programming language modules. We provide details on the methodology used by GRC and evaluate its performance on several groups of prokaryotes using GRC's built in comparison module. Conclusion: Traditionally, to perform whole genome annotation a user would either set up a pipeline or take advantage of an online service. With GRC the user need only provide the genome he or she wants to annotate and the function resource files to use. The result is high usability and a very minimal learning curve for the intended audience of life science researchers and bioinformaticians. We believe that the GRC fills a valuable niche in allowing users to perform explorative, whole-genome annotation.
Affiliation(s)
- Andrew S Warren
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA, USA.

12
Hu GQ, Zheng X, Zhu HQ, She ZS. Prediction of translation initiation site for microbial genomes with TriTISA. Bioinformatics 2008; 25:123-5. [PMID: 19015130 DOI: 10.1093/bioinformatics/btn576] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022]
Abstract
We report a new and simple method, TriTISA, for accurate prediction of the translation initiation sites (TISs) of microbial genomes. TriTISA classifies all candidate TISs into three categories based on evolutionary properties, and characterizes them in terms of Markov models. Then, it employs a Bayesian methodology for the selection of the true TIS with a non-supervised, iterative procedure. Assessment on experimentally verified TIS data shows that TriTISA is overall better than other state-of-the-art methods for microbial genome TIS prediction. In particular, TriTISA is shown to have a robust accuracy independent of the quality of the initial annotation. Availability: The C++ source code is freely available under the GNU GPL license via http://mech.ctb.pku.edu.cn/protisa/TriTISA.
Affiliation(s)
- Gang-Qing Hu
- State Key Lab for Turbulence and Complex Systems, Department of Biomedical Engineering, College of Engineering and Center for Theoretical Biology, Peking University, Beijing 100871, China
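TIS predictors such as the one described in the abstract above select one site from a set of candidate TISs for each gene. As an illustration of what such a candidate set can look like, the Python sketch below lists the in-frame start codons found upstream of a given stop codon, stopping at the previous in-frame stop. The function name `candidate_tis`, the start-codon set (ATG, GTG, TTG) and the forward-strand-only input are assumptions made for this example; this is not TriTISA's code, and the Markov-model and Bayesian selection steps are not shown.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_tis(seq, stop_index):
    """List 0-based positions of in-frame start codons for one ORF.

    seq is a forward-strand DNA string and stop_index is the position of
    the first base of the ORF's stop codon. The scan walks upstream one
    codon at a time and ends at the previous in-frame stop codon or at
    the start of the sequence.
    """
    seq = seq.upper()
    if seq[stop_index:stop_index + 3] not in STOP_CODONS:
        raise ValueError("stop_index does not point at a stop codon")
    candidates = []
    pos = stop_index - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        if codon in START_CODONS:
            candidates.append(pos)
        pos -= 3
    return sorted(candidates)


if __name__ == "__main__":
    toy = "TAA" + "ATG" + "AAA" + "GTG" + "CCC" + "TTG" + "GGG" + "TAA"
    print(candidate_tis(toy, stop_index=21))
```

A predictor would then score each listed position, for example by the sequence context around it, and keep the most probable one.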

13
Phylogenetic and evolutionary relationships of RubisCO and the RubisCO-like proteins and the functional lessons provided by diverse molecular forms. Philos Trans R Soc Lond B Biol Sci 2008; 363:2629-40. [PMID: 18487131 PMCID: PMC2606765 DOI: 10.1098/rstb.2008.0023] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.2] [Indexed: 01/16/2023] [Open access]
Abstract
Ribulose 1,5-bisphosphate (RuBP) carboxylase/oxygenase (RubisCO) catalyses the key reaction by which inorganic carbon may be assimilated into organic carbon. Phylogenetic analyses indicate that there are three classes of bona fide RubisCO proteins, forms I, II and III, which all catalyse the same reactions. In addition, there exists another form of RubisCO, form IV, which does not catalyse RuBP carboxylation or oxygenation. Form IV is actually a homologue of RubisCO and is called the RubisCO-like protein (RLP). Both RubisCO and RLP appear to have evolved from an ancestor protein in a methanogenic archaeon, and comprehensive analyses indicate that the different forms (I, II, III and IV) contain various subgroups, with individual sequences derived from representatives of all three kingdoms of life. The diversity of RubisCO molecules, many of which function in distinct milieus, has provided convenient model systems to study the ways in which the active site of this protein has evolved to accommodate necessary molecular adaptations. Such studies have proven useful to help provide a framework for understanding the molecular basis for many important aspects of RubisCO catalysis, including the elucidation of factors or functional groups that impinge on RubisCO carbon dioxide/oxygen substrate discrimination.

14
Hu GQ, Zheng X, Yang YF, Ortet P, She ZS, Zhu H. ProTISA: a comprehensive resource for translation initiation site annotation in prokaryotic genomes. Nucleic Acids Res 2007; 36:D114-9. [PMID: 17942412 PMCID: PMC2238952 DOI: 10.1093/nar/gkm799] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022] [Open access]
Abstract
Correct annotation of translation initiation sites (TISs) is essential for both experimental and bioinformatics studies of the prokaryotic translation initiation mechanism, as well as for understanding gene regulation and gene structure. Here we describe a comprehensive database, ProTISA, which collects TISs confirmed through a variety of available evidence for prokaryotic genomes, including Swiss-Prot experimental records, literature, conserved domain hits and sequence alignment between orthologous genes. Moreover, by combining the predictions from our recently developed TIS post-processor, ProTISA provides a refined annotation for the public database RefSeq. Furthermore, the database annotates the potential regulatory signals associated with translation initiation in the region upstream of the TIS. As of July 2007, ProTISA includes 440 microbial genomes with more than 390 000 confirmed TISs. The database is available at http://mech.ctb.pku.edu.cn/protisa
Affiliation(s)
- Gang-Qing Hu
- State Key Lab for Turbulence and Complex System and Department of Biomedical Engineering, Peking University, Beijing 100871, China