1
Winter S, Jahn K, Wehner S, Kuchenbecker L, Marz M, Stoye J, Böcker S. Finding approximate gene clusters with Gecko 3. Nucleic Acids Res 2016; 44:9600-9610. [PMID: 27679480 PMCID: PMC5175365 DOI: 10.1093/nar/gkw843] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 12/10/2015] [Revised: 09/06/2016] [Accepted: 09/12/2016] [Indexed: 12/15/2022]
Abstract
Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min.
Affiliation(s)
- Sascha Winter
  - Chair for Bioinformatics, Institute for Computer Science, Friedrich-Schiller-University Jena, Jena, Germany
- Katharina Jahn
  - Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
  - Computational Biology Group, Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Stefanie Wehner
  - RNA Bioinformatics and High Throughput Analysis, Institute for Computer Science, Friedrich-Schiller-University Jena, Jena, Germany
  - Institute of Aquaculture, School of Natural Sciences, University of Stirling, Stirling, FK9LA, Scotland, UK
- Leon Kuchenbecker
  - Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
  - Berlin-Brandenburg Center for Regenerative Therapies, Charité University Medicine Berlin, Berlin, Germany
- Manja Marz
  - RNA Bioinformatics and High Throughput Analysis, Institute for Computer Science, Friedrich-Schiller-University Jena, Jena, Germany
  - Leibniz Institute for Age Research-Fritz Lipmann Institute (FLI), Jena, Germany
- Jens Stoye
  - Genome Informatics, Faculty of Technology and Center for Biotechnology (CeBiTec), Bielefeld University, Bielefeld, Germany
- Sebastian Böcker
  - Chair for Bioinformatics, Institute for Computer Science, Friedrich-Schiller-University Jena, Jena, Germany
2
Stoye J, Wittler R. A unified approach for reconstructing ancient gene clusters. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2009; 6:387-400. [PMID: 19644167 DOI: 10.1109/tcbb.2008.135] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 05/28/2023]
Abstract
The order of genes in genomes provides extensive information. In comparative genomics, differences or similarities of gene orders are determined to predict functional relations of genes or phylogenetic relations of genomes. For this purpose, various combinatorial models can be used to identify gene clusters, that is, groups of genes that are colocated in a set of genomes. We introduce a unified approach to model gene clusters and define the problem of labeling the inner nodes of a given phylogenetic tree with sets of gene clusters. Our optimization criterion in this context combines two properties: parsimony, i.e., the number of gains and losses of gene clusters has to be minimal, and consistency, i.e., for each ancestral node, there must exist at least one potential gene order that contains all the reconstructed clusters. We present and evaluate an exact algorithm to solve this problem. Despite its exponential worst-case time complexity, our method is suitable even for large-scale data. We show the effectiveness and efficiency on both simulated and real data.
Affiliation(s)
- Jens Stoye
  - Genome Informatics Group, Faculty of Technology, Bielefeld University, 33594 Bielefeld, Germany.
3
Mutsuda M, Sugiura M. Translation initiation of cyanobacterial rbcS mRNAs requires the 38-kDa ribosomal protein S1 but not the Shine-Dalgarno sequence: development of a cyanobacterial in vitro translation system. J Biol Chem 2006; 281:38314-21. [PMID: 17046824 DOI: 10.1074/jbc.m604647200] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022]
Abstract
Little is known about the biochemical mechanism of translation in cyanobacteria though substantial studies have been made on photosynthesis, nitrogen fixation, circadian rhythm, and genome structure. To analyze the mechanism of cyanobacterial translation, we have developed an in vitro translation system from Synechococcus cells using a psbAI-lacZ fusion mRNA as a model template. This in vitro system supports accurate translation from the authentic initiation site of a variety of Synechococcus mRNAs. In Synechococcus cells, rbcL and rbcS encoding the large and small subunits, respectively, of ribulose-1,5-bisphosphate carboxylase/oxygenase are co-transcribed as a dicistronic mRNA, and the downstream rbcS mRNA possesses two possible initiation codons separated by three nucleotides. Using this in vitro system and mutated mRNAs, we demonstrated that translation starts exclusively from the upstream AUG codon. Although there are Shine-Dalgarno-like sequences in positions similar to those of the functional Shine-Dalgarno elements in Escherichia coli, mutation analysis indicated that these sequences are not required for translation. Assays with deletions within the 5'-untranslated region showed that a pyrimidine-rich sequence in the -46 to -15 region is necessary for efficient translation. Synechococcus cells contain two ribosomal protein S1 homologues of 38 and 33 kDa in size. UV cross-linking and immunoprecipitation experiments suggested that the 38-kDa S1 is involved in efficient translation via associating with the pyrimidine-rich sequence. The present in vitro translation system will be a powerful tool to analyze the basic mechanism of translation in cyanobacteria.
4
Teeling H, Lombardot T, Bauer M, Ludwig W, Glöckner FO. Evaluation of the phylogenetic position of the planctomycete 'Rhodopirellula baltica' SH 1 by means of concatenated ribosomal protein sequences, DNA-directed RNA polymerase subunit sequences and whole genome trees. Int J Syst Evol Microbiol 2004; 54:791-801. [PMID: 15143026 DOI: 10.1099/ijs.0.02913-0] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]
Abstract
In recent years, the planctomycetes have been recognized as a phylum of environmentally important bacteria with habitats ranging from soil and freshwater to marine ecosystems. The planctomycetes form an independent phylum within the bacterial domain, whose exact phylogenetic position remains controversial. With the completion of sequencing of the genome of 'Rhodopirellula baltica' SH 1, it is now possible to re-evaluate the phylogeny of the planctomycetes based on multiple genes and genome trees in addition to single genes like the 16S rRNA or the elongation factor Tu. Here, evidence is presented based on the concatenated amino acid sequences of ribosomal proteins and DNA-directed RNA polymerase subunits from 'Rhodopirellula baltica' SH 1 and more than 90 other publicly available genomes that support a relationship of the Planctomycetes and the Chlamydiae. Affiliation of 'Rhodopirellula baltica' SH 1 and the Chlamydiae was reasonably stable regarding site selection since, during stepwise filtering of less-conserved sites from the alignments, it was only broken when rigorous filtering was applied. In a few cases, 'Rhodopirellula baltica' SH 1 shifted to a deep branching position adjacent to the Thermotoga/Aquifex clade. These findings are in agreement with recent publications, but the deep branching position was dependent on site selection and treeing algorithm and thus not stable. A genome tree calculated from normalized BLASTP scores did not confirm a close relationship of 'Rhodopirellula baltica' SH 1 and the Chlamydiae, but also indicated that the Planctomycetes do not emerge at the very root of the Bacteria. Therefore, these analyses rather contradict a deep branching position of the Planctomycetes within the bacterial domain and reaffirm their earlier proposed relatedness to the Chlamydiae.
Affiliation(s)
- Hanno Teeling
  - Max-Planck-Institute for Marine Microbiology, Celsiusstrasse 1, D-28359 Bremen, Germany
- Thierry Lombardot
  - Max-Planck-Institute for Marine Microbiology, Celsiusstrasse 1, D-28359 Bremen, Germany
- Margarete Bauer
  - Max-Planck-Institute for Marine Microbiology, Celsiusstrasse 1, D-28359 Bremen, Germany
- Wolfgang Ludwig
  - Department of Microbiology, Technical University Munich, D-85350 Freising, Germany
- Frank Oliver Glöckner
  - Max-Planck-Institute for Marine Microbiology, Celsiusstrasse 1, D-28359 Bremen, Germany
5
6
Affiliation(s)
- B Stoebe
  - Botanisches Institut, Heinrich-Heine-Universität Düsseldorf, Universitätsstr. 1, D-40225 Düsseldorf, Germany.
7
Mooney BP, Miernyk JA, Randall DD. Cloning and characterization of the dihydrolipoamide S-acetyltransferase subunit of the plastid pyruvate dehydrogenase complex (E2) from Arabidopsis. Plant Physiology 1999; 120:443-52. [PMID: 10364395 PMCID: PMC59282 DOI: 10.1104/pp.120.2.443] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.3] [Received: 11/20/1998] [Accepted: 03/03/1999] [Indexed: 05/17/2023]
Abstract
An Arabidopsis cDNA encoding the dihydrolipoamide S-acetyltransferase subunit of the plastid pyruvate dehydrogenase complex (E2) was isolated from a lambdaPRL2 library. The cDNA is 1709 bp in length, with a continuous open reading frame of 1440 bp encoding a protein of 480 amino acids with a calculated molecular mass of 50,079 D. Southern analysis suggests that a single gene encodes plastid E2. The amino acid sequence has characteristic features of an acetyltransferase, namely, distinct lipoyl, subunit-binding, and catalytic domains, although it is unusual in having only a single lipoyl domain. The in vitro synthesized plastid E2 precursor protein has a relative molecular weight of 67,000 on sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Upon incubation of the precursor with pea (Pisum sativum) chloroplasts, it was imported and processed to a mature-sized relative molecular weight of 60,000. The imported protein was located in the chloroplast stroma, associated with the endogenous pyruvate dehydrogenase. Catalytically active recombinant plastid E2 was purified as a glutathione S-transferase fusion protein. Analysis of plastid E2 mRNA by reverse transcriptase-polymerase chain reaction showed highest expression in flowers, followed by leaves, siliques, and roots. The results of immunoblot analysis indicate that protein expression was similar in roots and flowers, lower in leaves, and lower still in siliques. This is the first report, to our knowledge, describing a plastid E2.
Affiliation(s)
- B P Mooney
  - Biochemistry Department, University of Missouri, Columbia, Missouri 65211, USA
8
Hirose T, Ideue T, Wakasugi T, Sugiura M. The chloroplast infA gene with a functional UUG initiation codon. FEBS Lett 1999; 445:169-72. [PMID: 10069394 DOI: 10.1016/s0014-5793(99)00123-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/19/2022]
Abstract
All chloroplast genes reported so far possess ATG start codons and sometimes GTGs as an exception. Sequence alignments suggested that the chloroplast infA gene encoding initiation factor 1 in the green alga Chlorella vulgaris has TTG as a putative initiation codon. This gene was shown to be transcribed by RT-PCR analysis. The infA mRNA was translated accurately from the UUG codon in a tobacco chloroplast in vitro translation system. Mutation of the UUG codon to AUG increased translation efficiency approximately 300-fold. These results indicate that the UUG is functional for accurate translation initiation of Chlorella infA mRNA but it is an inefficient initiation codon.
Affiliation(s)
- T Hirose
  - Center for Gene Research, Nagoya University, Japan
9
Abstract
The entire sequence (120-190 kb) of chloroplast genomes has been determined from a dozen plant species. The genome contains from 87 to 183 known genes, of which half encode components involved in translation. These include a complete set of rRNAs and about 30 tRNAs, which are likely to be sufficient to support translation in chloroplasts. RNA editing (mostly C to U base changes) occurs in some chloroplast transcripts, creating start and stop codons and changing codons to retain conserved amino acids. Many components that constitute the chloroplast translational machinery are similar to those of Escherichia coli, whereas only one third of the chloroplast mRNAs contain Shine-Dalgarno-like sequences at the correct positions. Analyses conducted in vivo and in vitro have revealed the existence of multiple mechanisms for translational initiation in chloroplasts.
Affiliation(s)
- M Sugiura
  - Center for Gene Research, Nagoya University, Japan.
10
Abstract
Bacterial genome sizes, which range from 500 to 10,000 kbp, are within the current scope of operation of large-scale nucleotide sequence determination facilities. To date, 8 complete bacterial genomes have been sequenced, and at least 40 more will be completed in the near future. Such projects give wonderfully detailed information concerning the structure of the organism's genes and the overall organization of the sequenced genomes. It will be very important to put this incredible wealth of detail into a larger biological picture: How does this information apply to the genomes of related genera, related species, or even other individuals from the same species? Recent advances in pulsed-field gel electrophoretic technology have facilitated the construction of complete and accurate physical maps of bacterial chromosomes, and the many maps constructed in the past decade have revealed unexpected and substantial differences in genome size and organization even among closely related bacteria. This review focuses on this recently appreciated plasticity in structure of bacterial genomes, and diversity in genome size, replicon geometry, and chromosome number are discussed at inter- and intraspecies levels.
Affiliation(s)
- S Casjens
  - Department of Oncological Sciences, University of Utah, Salt Lake City 84132, USA.
11
Abstract
The amazing diversity of extant photosynthetic eukaryotes is largely a result of the presence of formerly free-living photosynthesizing organisms that have been sequestered by eukaryotic hosts and established as plastids in a process known as endosymbiosis. The evolutionary history of these endosymbiotic events was traditionally investigated by studying ultrastructural features and pigment characteristics but in recent years has been approached using molecular sequence data and gene trees. Two important developments, more detailed studies of members of the Cyanobacteria (from which plastids ultimately derive) and the availability of complete plastid genome sequences from a wide variety of plant and algal lineages, have allowed a more accurate reconstruction of plastid evolution.
Affiliation(s)
- S E Douglas
  - Canadian Institute for Advanced Research, Program in Evolutionary Biology, National Research Council, Institute for Marine Biosciences, 1411 Oxford Street, Halifax, Nova Scotia, Canada.