1
Acevedo-Garcia J, Walden K, Leissing F, Baumgarten K, Drwiega K, Kwaaitaal M, Reinstädler A, Freh M, Dong X, James GV, Baus LC, Mascher M, Stein N, Schneeberger K, Brocke-Ahmadinejad N, Kollmar M, Schulze-Lefert P, Panstruga R. Barley Ror1 encodes a class XI myosin required for mlo-based broad-spectrum resistance to the fungal powdery mildew pathogen. Plant J 2022; 112:84-103. [PMID: 35916711 DOI: 10.1111/tpj.15930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2022] [Revised: 06/17/2022] [Accepted: 07/22/2022] [Indexed: 06/15/2023]
Abstract
Loss-of-function alleles of plant MLO genes confer broad-spectrum resistance to powdery mildews in many eudicot and monocot species. Although barley (Hordeum vulgare) mlo mutants have been used in agriculture for more than 40 years, understanding of the molecular principles underlying this type of disease resistance remains fragmentary. Forward genetic screens in barley have revealed mutations in two Required for mlo resistance (Ror) genes that partially impair immunity conferred by mlo mutants. While Ror2 encodes a soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE), the identity of Ror1, located at the pericentromeric region of barley chromosome 1H, remained elusive. We report the identification of Ror1 based on combined barley genomic sequence information and transcriptomic data from ror1 mutant plants. Ror1 encodes the barley class XI myosin Myo11A (HORVU.MOREX.r3.1HG0046420). Single amino acid substitutions of this myosin, deduced from non-functional ror1 mutant alleles, map to the nucleotide-binding region and the interface between the relay-helix and the converter domain of the motor protein. Ror1 myosin accumulates transiently in the course of powdery mildew infection. Functional fluorophore-labeled Ror1 variants associate with mobile intracellular compartments that partially colocalize with peroxisomes. Single-cell expression of the Ror1 tail region causes a dominant-negative effect that phenocopies ror1 loss-of-function mutants. We define a myosin motor for the establishment of mlo-mediated resistance, suggesting that motor protein-driven intracellular transport processes are critical for extracellular immunity, possibly through the targeted transfer of antifungal and/or cell wall cargoes to pathogen contact sites.
Affiliation(s)
- Johanna Acevedo-Garcia
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Department of Plant-Microbe Interactions, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Kim Walden
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Franz Leissing
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Kira Baumgarten
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Katarzyna Drwiega
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Mark Kwaaitaal
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Anja Reinstädler
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Matthias Freh
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Xue Dong
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Geo Velikkakam James
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Lisa C Baus
- Faculty of Biology, LMU Munich, 82152, Planegg-Martinsried, Germany
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstr. 3, 06466, Seeland, Germany
- Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstr. 3, 06466, Seeland, Germany
- Center of integrated Breeding Research (CiBreed), Department of Crop Sciences, Georg-August-University Göttingen, Von Siebold Str. 8, 37075, Göttingen, Germany
- Korbinian Schneeberger
- Department of Plant Developmental Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Faculty of Biology, LMU Munich, 82152, Planegg-Martinsried, Germany
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Nahal Brocke-Ahmadinejad
- INRES Crop Bioinformatics, University of Bonn, Katzenburgweg 2, 53115, Bonn, Germany
- Institute of Biochemistry and Molecular Biology, University of Bonn, Nussallee 11, D-53115, Bonn, Germany
- Martin Kollmar
- Department of NMR-based Structural Biology, Group Systems Biology of Motor Proteins, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Paul Schulze-Lefert
- Department of Plant-Microbe Interactions, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
- Ralph Panstruga
- Unit of Plant Molecular Cell Biology, Institute for Biology I, RWTH Aachen University, Worringerweg 1, 52056, Aachen, Germany
- Department of Plant-Microbe Interactions, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829, Cologne, Germany
2
Mühlhausen S, Schmitt HD, Plessmann U, Mienkus P, Sternisek P, Perl T, Weig M, Urlaub H, Bader O, Kollmar M. Proteogenomics analysis of CUG codon translation in the human pathogen Candida albicans. BMC Biol 2021; 19:258. [PMID: 34863173 PMCID: PMC8645108 DOI: 10.1186/s12915-021-01197-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2021] [Accepted: 11/18/2021] [Indexed: 11/25/2022]
Abstract
Background: Yeasts of the CTG-clade lineage, which includes the human-infecting Candida albicans, Candida parapsilosis and Candida tropicalis species, are characterized by an altered genetic code. Instead of translating CUG codons as leucine, as happens in most eukaryotes, these yeasts, whose ancestors are thought to have lost the relevant leucine-tRNA gene, translate CUG codons as serine using a serine-tRNA with a mutated anticodon, $\mathrm{tRNA}_{\mathrm{CAG}}^{\mathrm{Ser}}$. Previously reported experiments have suggested that 3–5% of the CTG-clade CUG codons are mistranslated as leucine due to mischarging of the $\mathrm{tRNA}_{\mathrm{CAG}}^{\mathrm{Ser}}$. The mistranslation was suggested to result in variable surface proteins explaining fast host adaptation and pathogenicity.
Results: In this study, we reassess this potential mistranslation by high-resolution mass spectrometry-based proteogenomics of multiple CTG-clade yeasts, including various C. albicans strains, isolated from colonized and from infected human body sites, and C. albicans grown in yeast and hyphal forms. Our data do not support a bias towards CUG codon mistranslation as leucine. Instead, our data suggest that (i) CUG codons are mistranslated at a frequency corresponding to the normal extent of ribosomal mistranslation with no preference for specific amino acids, (ii) CUG codons are as unambiguous (or ambiguous) as the related CUU leucine and UCC serine codons, (iii) tRNA anticodon loop variation across the CTG-clade yeasts does not result in any difference of the mistranslation level, and (iv) CUG codon unambiguity is independent of C. albicans’ strain pathogenicity or growth form.
Conclusions: Our findings imply that C. albicans does not decode CUG ambiguously. This suggests that the proposed misleucylation of the $\mathrm{tRNA}_{\mathrm{CAG}}^{\mathrm{Ser}}$ might be as prevalent as every other misacylation or mistranslation event and, if at all, be just one of many reasons causing phenotypic diversity.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-021-01197-9.
Affiliation(s)
- Stefanie Mühlhausen
- Theoretical Computer Science and Algorithmic Methods Group, Institute of Computer Science, University of Göttingen, Goldschmidtstr. 7, 37077, Göttingen, Germany
- Hans Dieter Schmitt
- Department of Neurobiology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Uwe Plessmann
- Bioanalytical Mass Spectrometry, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Peter Mienkus
- Department of Neurobiology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Pia Sternisek
- Institute for Medical Microbiology, University Medical Center Göttingen, Kreuzbergring 57, 37075, Göttingen, Germany
- Thorsten Perl
- Intermediate Care, University Medical Center Göttingen, Robert Koch Strasse 40, 37075, Göttingen, Germany
- Michael Weig
- Institute for Medical Microbiology, University Medical Center Göttingen, Kreuzbergring 57, 37075, Göttingen, Germany
- Henning Urlaub
- Bioanalytical Mass Spectrometry, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Bioanalytics Group, Department of Clinical Chemistry, University Medical Center Göttingen, Robert Koch Strasse 40, 37075, Göttingen, Germany
- Oliver Bader
- Institute for Medical Microbiology, University Medical Center Göttingen, Kreuzbergring 57, 37075, Göttingen, Germany
- Martin Kollmar
- Theoretical Computer Science and Algorithmic Methods Group, Institute of Computer Science, University of Göttingen, Goldschmidtstr. 7, 37077, Göttingen, Germany
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
3
Simm D, Hatje K, Waack S, Kollmar M. Critical assessment of coiled-coil predictions based on protein structure data. Sci Rep 2021; 11:12439. [PMID: 34127723 PMCID: PMC8203680 DOI: 10.1038/s41598-021-91886-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2021] [Accepted: 05/28/2021] [Indexed: 02/05/2023]
Abstract
Coiled-coil regions were among the first protein motifs described structurally and theoretically. The simplicity of the motif promises that coiled-coil regions can be detected with reasonable accuracy and precision in any protein sequence. Here, we re-evaluated the most commonly used coiled-coil prediction tools with respect to the most comprehensive reference data set available, the entire Protein Data Bank, down to each amino acid and its secondary structure. Apart from the 30-fold difference between the minimum and maximum number of coiled coils predicted, the tools vary strongly in where they predict coiled-coil regions. Accordingly, there is a high number of false predictions and of missed true coiled-coil regions. The evaluation of the binary classification metrics in comparison with naïve coin-flip models and the calculation of the Matthews correlation coefficient, the most reliable performance metric for imbalanced data sets, suggest that the tested tools' performance is close to random. This implies that the tools' predictions have only limited informative value. Coiled-coil predictions are often used to interpret biochemical data and are part of in-silico functional genome annotation. Our results indicate that these predictions should be treated very cautiously and need to be supported and validated by experimental evidence.
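To illustrate why the Matthews correlation coefficient (MCC) is preferred over accuracy on imbalanced data, the following minimal sketch computes both from confusion-matrix counts. The counts and function names are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Return the MCC for the given confusion-matrix counts (0.0 if undefined)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, fp, tn, fn):
    """Return the fraction of correctly classified residues."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical per-residue counts: coiled-coil residues are rare (150 of 10000),
# and the imaginary predictor over-predicts them heavily.
counts = dict(tp=50, fp=450, tn=9400, fn=100)

print("accuracy:", round(accuracy(**counts), 3))
print("MCC:     ", round(matthews_corrcoef(**counts), 3))
```

With these made-up counts the accuracy looks high because true negatives dominate, while the MCC stays low, which is the kind of discrepancy the abstract alludes to.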
Affiliation(s)
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany
- Klas Hatje
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Present Address: Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
- Stephan Waack
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany
4
Hatje K, Mühlhausen S, Simm D, Kollmar M. The Protein-Coding Human Genome: Annotating High-Hanging Fruits. Bioessays 2019; 41:e1900066. [PMID: 31544971 DOI: 10.1002/bies.201900066] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/15/2019] [Revised: 08/07/2019] [Indexed: 12/19/2022]
Abstract
The major transcript variants of human protein-coding genes are annotated to a certain degree of accuracy by combining manual curation, transcript data, and proteomics evidence. However, there is considerable disagreement on the annotation of about 2000 genes (they can be protein-coding, noncoding, or pseudogenes) and on the annotation of most of the predicted alternative transcripts. Pure transcriptome mapping approaches seem to be limited in discriminating functional expression from noise. These limitations have partially been overcome by dedicated algorithms to detect alternatively spliced micro-exons and wobble splice variants. Recently, knowledge about splice mechanisms and protein structure has been incorporated into an algorithm to predict neighboring homologous exons, often spliced in a mutually exclusive manner. Predicted exons are evaluated by transcript data, structural compatibility, and evolutionary conservation, revealing hundreds of novel coding exons and splice mechanism re-assignments. The emerging human pan-genome necessitates distinctive annotations incorporating differences between individuals and between populations.
Affiliation(s)
- Klas Hatje
- Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstr. 124, 4070, Basel, Switzerland
- Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Goldschmidtstr. 7, 37077, Göttingen, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077, Göttingen, Germany
5
Mühlhausen S, Schmitt HD, Pan KT, Plessmann U, Urlaub H, Hurst LD, Kollmar M. Endogenous Stochastic Decoding of the CUG Codon by Competing Ser- and Leu-tRNAs in Ascoidea asiatica. Curr Biol 2018; 28:2046-2057.e5. [PMID: 29910077 PMCID: PMC6041473 DOI: 10.1016/j.cub.2018.04.085] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/04/2018] [Revised: 04/22/2018] [Accepted: 04/24/2018] [Indexed: 12/24/2022]
Abstract
Although the “universal” genetic code is now known not to be universal, and stop codons can have multiple meanings, one regularity remains, namely that for a given sense codon there is a unique translation. Examining CUG usage in yeasts that have transferred CUG away from leucine, we here report the first example of dual coding: Ascoidea asiatica stochastically encodes CUG as both serine and leucine in approximately equal proportions. This is deleterious, as evidenced by CUG codons being rare, never at conserved serine or leucine residues, and predominantly in lowly expressed genes. Related yeasts solve the problem by loss of function of one of the two tRNAs. This dual coding is consistent with the tRNA-loss-driven codon reassignment hypothesis, and provides a unique example of a proteome that cannot be deterministically predicted.
- Ascoidea asiatica stochastically encodes CUG as leucine and serine
- It is the only known example of a proteome with non-deterministic features
- Stochastic encoding is caused by competing $\mathrm{tRNA}_{\mathrm{CAG}}^{\mathrm{Leu}}$ and $\mathrm{tRNA}_{\mathrm{CAG}}^{\mathrm{Ser}}$
- A. asiatica copes with stochastic encoding by avoiding CUG at key positions
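The idea of a proteome that cannot be deterministically predicted can be sketched in a few lines. The example below is a toy model, not the authors' analysis: it translates an mRNA with the standard genetic code but decodes each CUG codon as serine or leucine at random. The function name `translate`, the parameter `p_ser`, and the example sequence are assumptions made for illustration.

```python
import itertools
import random

BASES = "UCAG"
# Standard genetic code (NCBI translation table 1); codons are ordered
# UUU, UUC, UUA, UUG, UCU, ... with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, p_ser=0.5, rng=None):
    """Translate an mRNA, decoding each CUG as Ser with probability p_ser, else Leu."""
    rng = rng or random.Random()
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "CUG":
            aa = "S" if rng.random() < p_ser else "L"
        else:
            aa = STANDARD_CODE[codon]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical transcript with three CUG codons: AUG CUG GCU CUG AAA CUG UAA
mrna = "AUGCUGGCUCUGAAACUGUAA"
rng = random.Random(0)
variants = {translate(mrna, rng=rng) for _ in range(200)}
print(sorted(variants))
```

A transcript with n CUG codons can yield up to 2^n distinct protein sequences under this model, which is one way to read the abstract's observation that CUG is avoided at conserved positions.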
Affiliation(s)
- Stefanie Mühlhausen
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Hans Dieter Schmitt
- Department of Neurobiology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Kuan-Ting Pan
- Bioanalytical Mass Spectrometry, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Uwe Plessmann
- Bioanalytical Mass Spectrometry, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Henning Urlaub
- Bioanalytical Mass Spectrometry, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Bioanalytics Group, Department of Clinical Chemistry, University Medical Center Göttingen, Robert Koch Strasse 40, 37075 Göttingen, Germany
- Laurence D Hurst
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
6
Simm D, Kollmar M. Waggawagga-CLI: A command-line tool for predicting stable single α-helices (SAH-domains), and the SAH-domain distribution across eukaryotes. PLoS One 2018; 13:e0191924. [PMID: 29444145 PMCID: PMC5812594 DOI: 10.1371/journal.pone.0191924] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/27/2017] [Accepted: 01/12/2018] [Indexed: 12/15/2022]
Abstract
Stable single-alpha helices (SAH-domains) function as rigid connectors and constant force springs between structural domains, and can provide contact surfaces for protein-protein and protein-RNA interactions. SAH-domains mainly consist of charged amino acids and are monomeric and stable in polar solutions, characteristics which distinguish them from coiled-coil domains and intrinsically disordered regions. Although the number of reported SAH-domains is steadily increasing, genome-wide analyses of SAH-domains in eukaryotic genomes are still missing. Here, we present Waggawagga-CLI, a command-line tool for predicting and analysing SAH-domains in protein sequence datasets. Using Waggawagga-CLI we predicted SAH-domains in 24 datasets from eukaryotes across the tree of life. SAH-domains were predicted in 0.5 to 3.5% of the protein-coding content per species. SAH-domains are particularly present in longer proteins, supporting their function as structural building blocks in multi-domain proteins. In human, SAH-domains are mainly used as alternative building blocks that are not present in all transcripts of a gene. Gene ontology analysis showed that yeast proteins with SAH-domains are particularly enriched in macromolecular complex subunit organization, cellular component biogenesis and RNA metabolic processes, and that they have a strong nuclear and ribonucleoprotein complex localization and function in ribosome and nucleic acid binding. Human proteins with SAH-domains have roles in all types of RNA processing and cytoskeleton organization, and are predicted to function in RNA binding, protein binding involved in cell and cell-cell adhesion, and cytoskeletal protein binding. Waggawagga-CLI allows the user to adjust the stabilizing and destabilizing contribution of amino acid interactions in i,i+3 and i,i+4 spacings, and provides extensive flexibility for user-designed analyses.
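The notion of adjustable stabilizing and destabilizing contributions at i,i+3 and i,i+4 spacings can be sketched as follows. This is a deliberately simplified toy score, not the scoring scheme implemented in Waggawagga-CLI: it assumes that oppositely charged residue pairs stabilize and like-charged pairs destabilize, and the names `sah_score`, `w3`, `w4` and `penalty` are hypothetical.

```python
ACIDIC = set("DE")
BASIC = set("KR")


def charge(residue):
    """Return -1 for acidic, +1 for basic, and 0 for all other residues."""
    if residue in ACIDIC:
        return -1
    if residue in BASIC:
        return 1
    return 0


def sah_score(seq, w3=1.0, w4=1.0, penalty=1.0):
    """Toy per-residue score from charge pairs at i,i+3 and i,i+4 spacings.

    Opposite charges add the spacing weight; like charges subtract
    penalty * weight. All weights are user-adjustable assumptions.
    """
    score = 0.0
    for spacing, weight in ((3, w3), (4, w4)):
        for i in range(len(seq) - spacing):
            product = charge(seq[i]) * charge(seq[i + spacing])
            if product < 0:
                score += weight
            elif product > 0:
                score -= penalty * weight
    return score / max(len(seq), 1)


# Hypothetical sequences: an E/K-rich repeat, a poly-Ala stretch, a poly-Glu stretch
for name, seq in (("EK repeat", "EEEEKKKK" * 3), ("poly-A", "A" * 24), ("poly-E", "E" * 24)):
    print(name, round(sah_score(seq), 3))
```

Changing `w3`, `w4`, or `penalty` shifts how strongly each spacing contributes, which mirrors in spirit the adjustability the abstract describes.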
Affiliation(s)
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
7
Kollmar M, Simm D. Identifying Sequenced Eukaryotic Genomes and Transcriptomes with diArk. Methods Mol Biol 2018; 1757:1-19. [PMID: 29761453 DOI: 10.1007/978-1-4939-7737-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The diArk Eukaryotic Genome Database is a manually curated and updated repository of available eukaryotic genome and transcriptome assemblies. diArk is a key resource for researchers interested in comparative eukaryotic genomics, and the entry point for browsing sequenced eukaryotes in general and for finding the species most closely related to one's own organism of interest in particular. The exponentially increasing number of sequenced species demands sophisticated search and data presentation tools. In this chapter we describe how to navigate the diArk database, keeping a first-time user in mind.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University, Göttingen, Germany
8
Hatje K, Rahman RU, Vidal RO, Simm D, Hammesfahr B, Bansal V, Rajput A, Mickael ME, Sun T, Bonn S, Kollmar M. The landscape of human mutually exclusive splicing. Mol Syst Biol 2017; 13:959. [PMID: 29242366 PMCID: PMC5740500 DOI: 10.15252/msb.20177728] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 11/10/2022]
Abstract
Mutually exclusive splicing of exons is a mechanism of functional gene and protein diversification with pivotal roles in organismal development and diseases such as Timothy syndrome, cardiomyopathy and cancer in humans. In order to obtain a first genome-wide estimate of the extent and biological role of mutually exclusive splicing in humans, we predicted and subsequently validated mutually exclusive exons (MXEs) using 515 publicly available RNA‐Seq datasets. Here, we provide evidence for the expression of over 855 MXEs, 42% of which represent novel exons, increasing the annotated human mutually exclusive exome more than fivefold. The data provide strong evidence for the existence of large and multi‐cluster MXEs in higher vertebrates and offer new insights into MXE evolution. More than 82% of the MXE clusters are conserved in mammals, and five clusters have homologous clusters in Drosophila. Finally, MXEs are significantly enriched in pathogenic mutations and their spatio‐temporal expression might predict human disease pathology.
Affiliation(s)
- Klas Hatje
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Raza-Ur Rahman
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Center for Molecular Neurobiology, Institute of Medical Systems Biology, University Clinic Hamburg-Eppendorf, Hamburg, Germany
- Ramon O Vidal
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University, Göttingen, Germany
- Björn Hammesfahr
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Vikas Bansal
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Center for Molecular Neurobiology, Institute of Medical Systems Biology, University Clinic Hamburg-Eppendorf, Hamburg, Germany
- Ashish Rajput
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Center for Molecular Neurobiology, Institute of Medical Systems Biology, University Clinic Hamburg-Eppendorf, Hamburg, Germany
- Michel Edwar Mickael
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Center for Molecular Neurobiology, Institute of Medical Systems Biology, University Clinic Hamburg-Eppendorf, Hamburg, Germany
- Ting Sun
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Center for Molecular Neurobiology, Institute of Medical Systems Biology, University Clinic Hamburg-Eppendorf, Hamburg, Germany
- Stefan Bonn
- Group of Computational Systems Biology, German Center for Neurodegenerative Diseases, Göttingen, Germany
- Center for Molecular Neurobiology, Institute of Medical Systems Biology, University Clinic Hamburg-Eppendorf, Hamburg, Germany
- German Center for Neurodegenerative Diseases, Tübingen, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
9
Kollmar M, Mühlhausen S. Myosin repertoire expansion coincides with eukaryotic diversification in the Mesoproterozoic era. BMC Evol Biol 2017; 17:211. [PMID: 28870165 PMCID: PMC5583752 DOI: 10.1186/s12862-017-1056-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 05/03/2017] [Accepted: 08/25/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The last eukaryotic common ancestor already had an amazingly complex cell possessing genomic and cellular features such as spliceosomal introns, mitochondria, cilia-dependent motility, and a cytoskeleton together with several intracellular transport systems. In contrast to the microtubule-based dyneins and kinesins, the actin-filament associated myosins are considerably divergent in extant eukaryotes and a unifying picture of their evolution has not yet emerged.
RESULTS: Here, we manually assembled and annotated 7852 myosins from 929 eukaryotes providing an unprecedented dense sequence and taxonomic sampling. For classification we complemented phylogenetic analyses with gene structure comparisons resulting in 79 distinct myosin classes. The intron pattern analysis and the taxonomic distribution of the classes suggest two myosins in the last eukaryotic common ancestor, a class-1 prototype and another myosin, which is most likely the ancestor of all other myosin classes. The sparse distribution of class-2 and class-4 myosins outside their major lineages contradicts their presence in the last eukaryotic common ancestor but instead strongly suggests early eukaryote-eukaryote horizontal gene transfer.
CONCLUSIONS: By correlating the evolution of myosin diversity with the history of Earth we found that myosin innovation occurred in independent major "burst" events in the major eukaryotic lineages. Most myosin inventions happened in the Mesoproterozoic era. In the late Neoproterozoic era, a process of extensive independent myosin loss began simultaneously with further eukaryotic diversification. Since the Cambrian explosion, myosin repertoire expansion is driven by lineage- and species-specific gene and genome duplications leading to subfunctionalization and fine-tuning of myosin functions.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany.
| | - Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany.,Department of Biology and Biochemistry, The Milner Centre for Evolution, University of Bath, Bath, UK
|
10
|
Pearce SL, Clarke DF, East PD, Elfekih S, Gordon KHJ, Jermiin LS, McGaughran A, Oakeshott JG, Papanicolaou A, Perera OP, Rane RV, Richards S, Tay WT, Walsh TK, Anderson A, Anderson CJ, Asgari S, Board PG, Bretschneider A, Campbell PM, Chertemps T, Christeller JT, Coppin CW, Downes SJ, Duan G, Farnsworth CA, Good RT, Han LB, Han YC, Hatje K, Horne I, Huang YP, Hughes DST, Jacquin-Joly E, James W, Jhangiani S, Kollmar M, Kuwar SS, Li S, Liu NY, Maibeche MT, Miller JR, Montagne N, Perry T, Qu J, Song SV, Sutton GG, Vogel H, Walenz BP, Xu W, Zhang HJ, Zou Z, Batterham P, Edwards OR, Feyereisen R, Gibbs RA, Heckel DG, McGrath A, Robin C, Scherer SE, Worley KC, Wu YD. Erratum to: Genomic innovations, transcriptional plasticity and gene loss underlying the evolution and divergence of two highly polyphagous and invasive Helicoverpa pest species. BMC Biol 2017; 15:69. [PMID: 28810920 PMCID: PMC5557573 DOI: 10.1186/s12915-017-0413-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/01/2017] [Accepted: 08/07/2017] [Indexed: 11/10/2022]
Affiliation(s)
- S L Pearce
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - D F Clarke
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - P D East
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - S Elfekih
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - K H J Gordon
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia.
| | - L S Jermiin
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - A McGaughran
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - J G Oakeshott
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia.
| | - A Papanicolaou
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; Hawkesbury Institute for the Environment, Western Sydney University, Penrith, NSW, Australia
| | - O P Perera
- Southern Insect Management Research Unit, USDA-ARS, Stoneville, MS, USA
| | - R V Rane
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - S Richards
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
| | - W T Tay
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - T K Walsh
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - A Anderson
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - C J Anderson
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; Biological and Environmental Sciences, University of Stirling, Stirling, UK
| | - S Asgari
- School of Biological Sciences, University of Queensland, Brisbane St Lucia, QLD, Australia
| | - P G Board
- John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
| | | | - P M Campbell
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - T Chertemps
- Sorbonnes Universités, UPMC Université Paris 06, Institute of Ecology and Environmental Sciences of Paris, Paris, France; National Institute for Agricultural Research (INRA), Institute of Ecology and Environmental Sciences of Paris, Versailles, France
| | | | - C W Coppin
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | | | - G Duan
- Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - C A Farnsworth
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - R T Good
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - L B Han
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Y C Han
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; College of Plant Protection, Nanjing Agricultural University, Nanjing, Jiangsu, China
| | - K Hatje
- Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - I Horne
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - Y P Huang
- Institute of Plant Physiology and Ecology, Shanghai Institutes of Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - D S T Hughes
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - E Jacquin-Joly
- National Institute for Agricultural Research (INRA), Institute of Ecology and Environmental Sciences of Paris, Versailles, France
| | - W James
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - S Jhangiani
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - M Kollmar
- Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - S S Kuwar
- Max Planck Institute of Chemical Ecology, Jena, Germany
| | - S Li
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - N-Y Liu
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; Key Laboratory of Forest Disaster Warning and Control of Yunnan Province, Southwest Forestry University, Kunming, 650224, China
| | - M T Maibeche
- Sorbonnes Universités, UPMC Université Paris 06, Institute of Ecology and Environmental Sciences of Paris, Paris, France; National Institute for Agricultural Research (INRA), Institute of Ecology and Environmental Sciences of Paris, Versailles, France
| | - J R Miller
- J. Craig Venter Institute, Rockville, MD, USA
| | - N Montagne
- Sorbonnes Universités, UPMC Université Paris 06, Institute of Ecology and Environmental Sciences of Paris, Paris, France
| | - T Perry
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - J Qu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - S V Song
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - G G Sutton
- J. Craig Venter Institute, Rockville, MD, USA
| | - H Vogel
- Max Planck Institute of Chemical Ecology, Jena, Germany
| | - B P Walenz
- J. Craig Venter Institute, Rockville, MD, USA
| | - W Xu
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; School of Veterinary and Life Sciences, Murdoch University, Perth, WA, Australia
| | - H-J Zhang
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia; Chongqing Key Laboratory of Biochemistry and Molecular Pharmacology, Chongqing Medical University, Chongqing, 400016, China
| | - Z Zou
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - P Batterham
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | | | - R Feyereisen
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej, Denmark
| | - R A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - D G Heckel
- Max Planck Institute of Chemical Ecology, Jena, Germany
| | - A McGrath
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - C Robin
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - S E Scherer
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - K C Worley
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Y D Wu
- College of Plant Protection, Nanjing Agricultural University, Nanjing, Jiangsu, China
|
11
|
Pearce SL, Clarke DF, East PD, Elfekih S, Gordon KHJ, Jermiin LS, McGaughran A, Oakeshott JG, Papanicolaou A, Perera OP, Rane RV, Richards S, Tay WT, Walsh TK, Anderson A, Anderson CJ, Asgari S, Board PG, Bretschneider A, Campbell PM, Chertemps T, Christeller JT, Coppin CW, Downes SJ, Duan G, Farnsworth CA, Good RT, Han LB, Han YC, Hatje K, Horne I, Huang YP, Hughes DST, Jacquin-Joly E, James W, Jhangiani S, Kollmar M, Kuwar SS, Li S, Liu NY, Maibeche MT, Miller JR, Montagne N, Perry T, Qu J, Song SV, Sutton GG, Vogel H, Walenz BP, Xu W, Zhang HJ, Zou Z, Batterham P, Edwards OR, Feyereisen R, Gibbs RA, Heckel DG, McGrath A, Robin C, Scherer SE, Worley KC, Wu YD. Genomic innovations, transcriptional plasticity and gene loss underlying the evolution and divergence of two highly polyphagous and invasive Helicoverpa pest species. BMC Biol 2017; 15:63. [PMID: 28756777 PMCID: PMC5535293 DOI: 10.1186/s12915-017-0402-6] [Citation(s) in RCA: 178] [Impact Index Per Article: 25.4] [Received: 04/26/2017] [Accepted: 07/04/2017] [Indexed: 12/30/2022]
Abstract
BACKGROUND Helicoverpa armigera and Helicoverpa zea are major caterpillar pests of Old and New World agriculture, respectively. Both, particularly H. armigera, are extremely polyphagous, and H. armigera has developed resistance to many insecticides. Here we use comparative genomics, transcriptomics and resequencing to elucidate the genetic basis for their properties as pests. RESULTS We find that, prior to their divergence about 1.5 Mya, the H. armigera/H. zea lineage had accumulated up to more than 100 more members of specific detoxification and digestion gene families and more than 100 extra gustatory receptor genes, compared to other lepidopterans with narrower host ranges. The two genomes remain very similar in gene content and order, but H. armigera is more polymorphic overall, and H. zea has lost several detoxification genes, as well as about 50 gustatory receptor genes. It also lacks certain genes and alleles conferring insecticide resistance found in H. armigera. Non-synonymous sites in the expanded gene families above are rapidly diverging, both between paralogues and between orthologues in the two species. Whole genome transcriptomic analyses of H. armigera larvae show widely divergent responses to different host plants, including responses among many of the duplicated detoxification and digestion genes. CONCLUSIONS The extreme polyphagy of the two heliothines is associated with extensive amplification and neofunctionalisation of genes involved in host finding and use, coupled with versatile transcriptional responses on different hosts. H. armigera's invasion of the Americas in recent years means that hybridisation could generate populations that are both locally adapted and insecticide resistant.
Affiliation(s)
- S L Pearce
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - D F Clarke
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - P D East
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - S Elfekih
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - K H J Gordon
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia.
| | - L S Jermiin
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - A McGaughran
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - J G Oakeshott
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia.
| | - A Papanicolaou
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- Hawkesbury Institute for the Environment, Western Sydney University, Penrith, NSW, Australia
| | - O P Perera
- Southern Insect Management Research Unit, USDA-ARS, Stoneville, MS, USA
| | - R V Rane
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - S Richards
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
| | - W T Tay
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - T K Walsh
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - A Anderson
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - C J Anderson
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- Biological and Environmental Sciences, University of Stirling, Stirling, UK
| | - S Asgari
- School of Biological Sciences, University of Queensland, Brisbane St Lucia, QLD, Australia
| | - P G Board
- John Curtin School of Medical Research, Australian National University, Canberra, ACT, Australia
| | | | - P M Campbell
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - T Chertemps
- Sorbonnes Universités, UPMC Université Paris 06, Institute of Ecology and Environmental Sciences of Paris, Paris, France
- National Institute for Agricultural Research (INRA), Institute of Ecology and Environmental Sciences of Paris, Versailles, France
| | | | - C W Coppin
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | | | - G Duan
- Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - C A Farnsworth
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - R T Good
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - L B Han
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Y C Han
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- College of Plant Protection, Nanjing Agricultural University, Nanjing, Jiangsu, China
| | - K Hatje
- Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - I Horne
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - Y P Huang
- Institute of Plant Physiology and Ecology, Shanghai Institutes of Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - D S T Hughes
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - E Jacquin-Joly
- National Institute for Agricultural Research (INRA), Institute of Ecology and Environmental Sciences of Paris, Versailles, France
| | - W James
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - S Jhangiani
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - M Kollmar
- Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - S S Kuwar
- Max Planck Institute of Chemical Ecology, Jena, Germany
| | - S Li
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - N-Y Liu
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- Key Laboratory of Forest Disaster Warning and Control of Yunnan Province, Southwest Forestry University, Kunming, 650224, China
| | - M T Maibeche
- Sorbonnes Universités, UPMC Université Paris 06, Institute of Ecology and Environmental Sciences of Paris, Paris, France
- National Institute for Agricultural Research (INRA), Institute of Ecology and Environmental Sciences of Paris, Versailles, France
| | - J R Miller
- J. Craig Venter Institute, Rockville, MD, USA
| | - N Montagne
- Sorbonnes Universités, UPMC Université Paris 06, Institute of Ecology and Environmental Sciences of Paris, Paris, France
| | - T Perry
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - J Qu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - S V Song
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - G G Sutton
- J. Craig Venter Institute, Rockville, MD, USA
| | - H Vogel
- Max Planck Institute of Chemical Ecology, Jena, Germany
| | - B P Walenz
- J. Craig Venter Institute, Rockville, MD, USA
| | - W Xu
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- School of Veterinary and Life Sciences, Murdoch University, Perth, WA, Australia
| | - H-J Zhang
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
- Chongqing Key Laboratory of Biochemistry and Molecular Pharmacology, Chongqing Medical University, Chongqing, 400016, China
| | - Z Zou
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - P Batterham
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | | | - R Feyereisen
- Department of Plant and Environmental Sciences, University of Copenhagen, Thorvaldsensvej, Denmark
| | - R A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - D G Heckel
- Max Planck Institute of Chemical Ecology, Jena, Germany
| | - A McGrath
- CSIRO Black Mountain, GPO Box 1700, Canberra, ACT, 2600, Australia
| | - C Robin
- School of Biological Sciences, University of Melbourne, Parkville, Vic, Australia
| | - S E Scherer
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - K C Worley
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Y D Wu
- College of Plant Protection, Nanjing Agricultural University, Nanjing, Jiangsu, China
|
12
|
Abstract
[This corrects the article DOI: 10.1371/journal.pone.0174639.]
|
13
|
Simm D, Hatje K, Kollmar M. Distribution and evolution of stable single α-helices (SAH domains) in myosin motor proteins. PLoS One 2017; 12:e0174639. [PMID: 28369123 PMCID: PMC5378345 DOI: 10.1371/journal.pone.0174639] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/30/2016] [Accepted: 03/13/2017] [Indexed: 11/19/2022]
Abstract
Stable single-alpha helices (SAHs) are versatile structural elements in many prokaryotic and eukaryotic proteins, acting as semi-flexible linkers and constant force springs. In this way, SAH-domains function as part of the lever of many different myosins. Canonical myosin levers consist of one or several IQ-motifs to which light chains such as calmodulin bind. SAH-domains provide flexibility in length and stiffness to the myosin levers, and may be particularly suited for myosins working in crowded cellular environments. Although the function of the SAH-domains in human class-6 and class-10 myosins has been well characterised, the distribution of the SAH-domain in all myosin subfamilies and across the eukaryotic tree of life remained elusive. Here, we analysed the largest available myosin sequence dataset, consisting of 7919 manually annotated myosin sequences from 938 species representing all major eukaryotic branches, using the SAH-prediction algorithm of Waggawagga, a recently developed tool for the identification of SAH-domains. With this approach we identified SAH-domains in more than one third of the supposed 79 myosin subfamilies. Depending on the myosin class, the presence of SAH-domains can range from a few to almost all class members, indicating complex patterns of independent and taxon-specific SAH-domain gain and loss.
Affiliation(s)
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany
| | - Klas Hatje
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
| | - Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
|
14
|
Abstract
The canonical genetic code ubiquitously translates nucleotide into peptide sequence, with several alterations known in viruses, bacteria, mitochondria, plastids, and single-celled eukaryotes. A new hypothesis to explain genetic code changes, termed tRNA loss driven codon reassignment, was proposed recently when the polyphyly of the yeast codon reassignment events was uncovered. According to this hypothesis, the driving forces for genetic code changes are tRNA or translation termination factor loss-of-function mutations or loss-of-gene events. The free codon can subsequently be captured by all tRNAs that have an appropriately mutated anticodon and are efficiently charged. Thus, codon capture most likely happens by near-cognate tRNAs and tRNAs whose anticodons are not part of the recognition sites of the respective aminoacyl-tRNA-synthetases. This hypothesis comprehensively explains the CTG codon translation as alanine in Pachysolen yeast together with the long-known translation of the same codon as serine in Candida albicans and related species, and can also be applied to most other known reassignments.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
| | - Stefanie Mühlhausen
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
|
15
|
Abstract
mRNA decoding by tRNAs and tRNA charging by aminoacyl-tRNA synthetases are biochemically separated processes that nevertheless in general involve the same nucleotides. The combination of charging and decoding determines the genetic code. Codon reassignment happens when a differently charged tRNA replaces a former cognate tRNA. The recent discovery of the polyphyly of the yeast CUG sense codon reassignment challenged previous mechanistic considerations and led to the proposal of the so-called tRNA loss driven codon reassignment hypothesis. Accordingly, codon capture is caused by loss of a tRNA or by mutations in the translation termination factor, subsequent reduction of the codon frequency through reduced translation fidelity and final appearance of a new cognate tRNA. Critical for codon capture are sequence and structure of the new tRNA, which must be compatible with recognition regions of aminoacyl-tRNA synthetases. The proposed hypothesis applies to all reported nuclear and organellar codon reassignments.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
| | - Stefanie Mühlhausen
- Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
|
16
|
Abstract
The flagellum is a key innovation linked to eukaryogenesis. It provides motility by regulated cycles of bending and bend propagation, which are thought to be controlled by a complex arrangement of seven distinct dyneins in repeated patterns of outer- (OAD) and inner-arm dynein (IAD) complexes. Electron tomography showed high similarity of this axonemal repeat pattern across ciliates, algae, and animals, but the diversity of dynein sequences across the eukaryotes has not yet been comprehensively resolved and correlated with structural data. To shed light on the evolution of the axoneme I performed an exhaustive analysis of dyneins using the available sequenced genome data. Evidence from motor domain phylogeny allowed expanding the current set of nine dynein subtypes by eight additional isoforms with, however, restricted taxonomic distributions. I confirmed the presence of the nine dyneins in all eukaryotic super-groups, indicating that their origin predates the last eukaryotic common ancestor. The comparison of the N-terminal tail domains revealed a most likely axonemal dynein origin of the new classes, a group of chimeric dyneins in plants/algae and Stramenopiles, and the unique domain architecture and origin of the outermost OADs present in green algae and ciliates but not animals. The correlation of sequence and structural data suggests that the single-headed class-8 and class-9 dyneins localize to the distal end of the axonemal repeat and that the class-7 dyneins fill the region up to the proximal heterodimeric IAD. Tracing dynein gene duplications across the eukaryotes indicated ongoing diversification and fine-tuning of flagellar functions in extant taxa and species.
Affiliation(s)
- Martin Kollmar
- Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Goettingen, Germany
|
17
|
Pylypenko O, Welz T, Tittel J, Kollmar M, Chardon F, Malherbe G, Weiss S, Michel CIL, Samol-Wolf A, Grasskamp AT, Hume A, Goud B, Baron B, England P, Titus MA, Schwille P, Weidemann T, Houdusse A, Kerkhoff E. Coordinated recruitment of Spir actin nucleators and myosin V motors to Rab11 vesicle membranes. eLife 2016; 5. [PMID: 27623148 PMCID: PMC5021521 DOI: 10.7554/elife.17523] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 05/05/2016] [Accepted: 08/18/2016] [Indexed: 12/22/2022]
Abstract
There is growing evidence for a coupling of actin assembly and myosin motor activity in cells. However, mechanisms for recruitment of actin nucleators and motors on specific membrane compartments remain unclear. Here we report how Spir actin nucleators and myosin V motors coordinate their specific membrane recruitment. The myosin V globular tail domain (MyoV-GTD) interacts directly with an evolutionarily conserved Spir sequence motif. We determined crystal structures of MyoVa-GTD bound either to the Spir-2 motif or to Rab11 and show that a Spir-2:MyoVa:Rab11 complex can form. The ternary complex architecture explains how Rab11 vesicles support coordinated F-actin nucleation and myosin force generation for vesicle transport and tethering. New insights are also provided into how myosin activation can be coupled with the generation of actin tracks. Since MyoV binds several Rab GTPases, synchronized nucleator and motor targeting could provide a common mechanism to control force generation and motility in different cellular processes.
Affiliation(s)
- Olena Pylypenko
- Institut Curie, PSL Research University, CNRS, UMR 144, F-75005, Paris, France
| | - Tobias Welz
- University Hospital Regensburg, Regensburg, Germany
| | - Janine Tittel
- Max Planck Institute of Biochemistry, Martinsried, Germany
| | - Martin Kollmar
- Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
| | - Florian Chardon
- Institut Curie, PSL Research University, CNRS, UMR 144, F-75005, Paris, France
| | - Gilles Malherbe
- Institut Curie, PSL Research University, CNRS, UMR 144, F-75005, Paris, France
| | - Sabine Weiss
- University Hospital Regensburg, Regensburg, Germany
| | | | | | | | - Alistair Hume
- University of Nottingham, Nottingham, United Kingdom
| | - Bruno Goud
- Institut Curie, PSL Research University, CNRS, UMR 144, F-75005, Paris, France
| | - Bruno Baron
- Institut Pasteur, Biophysics of Macromolecules and their Interactions, Paris, France; CNRS, UMR 3528, Paris, France
| | - Patrick England
- Institut Pasteur, Biophysics of Macromolecules and their Interactions, Paris, France; CNRS, UMR 3528, Paris, France
| | - Margaret A Titus
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, United States
| | - Petra Schwille
- Max Planck Institute of Biochemistry, Martinsried, Germany
| | | | - Anne Houdusse
- Institut Curie, PSL Research University, CNRS, UMR 144, F-75005, Paris, France
|
18
|
Mühlhausen S, Findeisen P, Plessmann U, Urlaub H, Kollmar M. A novel nuclear genetic code alteration in yeasts and the evolution of codon reassignment in eukaryotes. Genome Res 2016; 26:945-55. [PMID: 27197221 PMCID: PMC4937558 DOI: 10.1101/gr.200931.115] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 10/15/2015] [Accepted: 04/28/2016] [Indexed: 01/12/2023]
Abstract
The genetic code is the cellular translation table for the conversion of nucleotide sequences into amino acid sequences. Changes to the meaning of sense codons would introduce errors into almost every translated message and are expected to be highly detrimental. However, reassignment of single or multiple codons in mitochondria and nuclear genomes, although extremely rare, demonstrates that the code can evolve. Several models for the mechanism of alteration of nuclear genetic codes have been proposed (including “codon capture,” “genome streamlining,” and “ambiguous intermediate” theories), but with little resolution. Here, we report a novel sense codon reassignment in Pachysolen tannophilus, a yeast related to the Pichiaceae. By generating proteomics data and using tRNA sequence comparisons, we show that Pachysolen translates CUG codons as alanine and not as the more usual leucine. The Pachysolen tRNA(CAG) is an anticodon-mutated tRNA(Ala) containing all major alanine tRNA recognition sites. The polyphyly of the CUG-decoding tRNAs in yeasts is best explained by a tRNA loss driven codon reassignment mechanism. Loss of the CUG-tRNA in the ancient yeast is followed by gradual decrease of respective codons and subsequent codon capture by tRNAs whose anticodon is not part of the aminoacyl-tRNA synthetase recognition region. Our hypothesis applies to all nuclear genetic code alterations and provides several testable predictions. We anticipate more codon reassignments to be uncovered in existing and upcoming genome projects.
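As a rough illustration of what a sense codon reassignment means in practice, the sketch below translates the same toy mRNA under three readings of CUG: leucine (the usual assignment), alanine (as reported here for Pachysolen) and serine (the long-known Candida albicans case). The codon table is a deliberately tiny excerpt, and the function name, dictionary keys and example sequence are hypothetical choices made for this sketch, not part of the published analysis.

```python
# Minimal excerpt of the standard genetic code: only the codons used below.
STANDARD_EXCERPT = {"AUG": "M", "CUG": "L", "GCU": "A", "UCU": "S", "UAA": "*"}

# Hypothetical labels for the meaning of CUG in each decoding scheme.
CUG_MEANING = {
    "standard": "L",    # leucine, the usual assignment
    "pachysolen": "A",  # alanine, as reported for Pachysolen tannophilus
    "candida": "S",     # serine, as known for Candida albicans
}


def translate(mrna: str, code: str = "standard") -> str:
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    table = dict(STANDARD_EXCERPT)
    table["CUG"] = CUG_MEANING[code]  # apply the reassignment
    protein = []
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_mrna = "AUGCUGGCUCUGUAA"  # AUG CUG GCU CUG UAA (made-up sequence)
for scheme in CUG_MEANING:
    print(scheme, translate(toy_mrna, scheme))
```

Every CUG in a message changes meaning at once, which is why such reassignments are expected to be detrimental unless the codon has first become rare.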
Affiliation(s)
- Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| | - Peggy Findeisen
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| | - Uwe Plessmann
- Bioanalytical Mass Spectrometry, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| | - Henning Urlaub
- Bioanalytical Mass Spectrometry, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany; Bioanalytics Group, Department of Clinical Chemistry, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany
|
19
|
Mühlhausen S, Hellkamp M, Kollmar M. GenePainter v. 2.0 resolves the taxonomic distribution of intron positions. Bioinformatics 2014; 31:1302-4. [PMID: 25434742 DOI: 10.1093/bioinformatics/btu798] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/20/2014] [Accepted: 11/25/2014] [Indexed: 11/12/2022]
Abstract
Conserved intron positions in eukaryotic genes can be used to reconstruct phylogenetic trees, to resolve ambiguous subfamily relationships in protein families and to infer the history of gene families. This version of GenePainter facilitates working with large datasets through options to select specific subsets for analysis and visualization, and through providing exhaustive statistics. GenePainter's application in phylogenetic analyses is considerably extended by the newly implemented integration of the exon-intron pattern conservation with phylogenetic trees. AVAILABILITY AND IMPLEMENTATION The software along with detailed documentation is available at http://www.motorprotein.de/genepainter and as Supplementary Material. CONTACT mako@nmr.mpibpc.mpg.de SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Marcel Hellkamp
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
|
20
|
Abstract
Eukaryotic genomes are the basis for understanding the complexity of life from populations to the molecular level. Recent technological innovations have revolutionized the speed of data generation, enabling the sequencing of eukaryotic genomes and transcriptomes within days. The database diArk (http://www.diark.org) has been developed with the aim of providing access to all available assembled genomes and transcriptomes. As of September 2014, diArk contains about 2600 eukaryotes with 6000 genome and transcriptome assemblies, of which 22% are not available via NCBI/ENA/DDBJ. Several indicators for the quality of the assemblies are provided to facilitate their comparison for selecting the most appropriate dataset for further studies. diArk has a user-friendly web interface with extensive options for filtering and browsing the sequenced eukaryotes. In this new version of the database we have also integrated species for which transcriptome assemblies are available, and we provide more analyses of assemblies.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, 37085, Germany
- Lotte Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, 37085, Germany
- Björn Hammesfahr
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, 37085, Germany
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, 37085, Germany
|
21
|
Simm D, Hatje K, Kollmar M. Waggawagga: comparative visualization of coiled-coil predictions and detection of stable single α-helices (SAH domains). Bioinformatics 2014; 31:767-9. [PMID: 25338722 DOI: 10.1093/bioinformatics/btu700] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 11/14/2022]
Abstract
Waggawagga is a web-based tool for the comparative visualization of coiled-coil predictions and the detection of stable single α-helices (SAH domains). Overview schemes show the predicted coiled-coil regions found in the query sequence and provide sliders, which can be used to select segments for detailed helical wheel and helical net views. A window-based score has been developed to predict SAH domains. Export to several bitmap and vector graphics formats is supported. AVAILABILITY AND IMPLEMENTATION http://waggawagga.motorprotein.de
Affiliation(s)
- Dominic Simm
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Klas Hatje
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
- Martin Kollmar
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
|
22
|
Findeisen P, Mühlhausen S, Dempewolf S, Hertzog J, Zietlow A, Carlomagno T, Kollmar M. Six subgroups and extensive recent duplications characterize the evolution of the eukaryotic tubulin protein family. Genome Biol Evol 2014; 6:2274-88. [PMID: 25169981 PMCID: PMC4202323 DOI: 10.1093/gbe/evu187] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Indexed: 01/04/2023]
Abstract
Tubulins belong to the most abundant proteins in eukaryotes providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog–paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other.
Affiliation(s)
- Peggy Findeisen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Silke Dempewolf
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Jonny Hertzog
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Alexander Zietlow
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Teresa Carlomagno
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
|
23
|
Abstract
The universal genetic code defines the translation of nucleotide triplets, called codons, into amino acids. In many Saccharomycetes a unique alteration of this code affects the translation of the CUG codon, which is normally translated as leucine. Most of the species encoding CUG alternatively as serine belong to the Candida genus and were grouped into a so-called CTG clade. However, the “Candida genus” is not a monophyletic group and several Candida species are known to use the standard CUG translation. The codon identity could have been changed in a single branch, the ancestor of the Candida, or in several branches independently, leading to a polyphyletic alternative yeast codon usage (AYCU). In order to resolve the monophyly or polyphyly of the AYCU, we performed a phylogenomics analysis of 26 motor and cytoskeletal proteins from 60 sequenced yeast species. By investigating the CUG codon positions with respect to sequence conservation at the respective alignment positions, we were able to unambiguously assign the standard code or AYCU. Quantitative analysis of the highly conserved leucine and serine alignment positions showed that 61.1% and 17% of the CUG codons coding for leucine and serine, respectively, are at highly conserved positions, whereas only 0.6% and 2.3% of the CUG codons, respectively, are at positions conserved in the respective other amino acid. Plotting the codon usage onto the phylogenetic tree revealed the polyphyly of the AYCU with Pachysolen tannophilus and the CTG clade branching independently within a time span of 30–100 Ma.
|
24
|
Abstract
Background Many eukaryotes have been shown to use alternative schemes to the universal genetic code. While most Saccharomycetes, including Saccharomyces cerevisiae, use the standard genetic code translating the CUG codon as leucine, some yeasts, including many but not all of the “Candida”, translate the same codon as serine. It has been proposed that the change in codon identity was accomplished by an almost complete loss of the original CUG codons, making the CUG positions within the extant species highly discriminative for the one or other translation scheme. Results In order to improve the prediction of genes in yeast species by providing the correct CUG decoding scheme we implemented a web server, called Bagheera, that allows determining the most probable CUG codon translation for a given transcriptome or genome assembly based on extensive reference data. As reference data we use 2071 manually assembled and annotated sequences from 38 cytoskeletal and motor proteins belonging to 79 yeast species. The web service includes a pipeline, which starts with predicting and aligning homologous genes to the reference data. CUG codon positions within the predicted genes are analysed with respect to amino acid similarity and CUG codon conservation in related species. In addition, the tRNACAG gene is predicted in genomic data and compared to known leu-tRNACAG and ser-tRNACAG genes. Bagheera can also be used to evaluate any mRNA and protein sequence data with the codon usage of the respective species. The usage of the system has been demonstrated by analysing six genomes not included in the reference data. Conclusions Gene prediction and consecutive comparison with reference data from other Saccharomycetes are sufficient to predict the most probable decoding scheme for CUG codons. This approach has been implemented into Bagheera (http://www.motorprotein.de/bagheera). 
Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-411) contains supplementary material, which is available to authorized users.
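The idea of checking CUG codon positions against conserved alignment columns can be illustrated with a small sketch. This is not Bagheera's pipeline; the function names, the 0.8 conservation threshold, the candidate amino acids, and the toy alignment columns are all assumptions made for illustration.

```python
from collections import Counter

def conserved_residue(column, threshold=0.8):
    """Return the residue dominating an alignment column, or None.

    `threshold` (an assumed value) is the minimum fraction of non-gap
    residues that must agree for the column to count as conserved.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return None
    residue, count = Counter(residues).most_common(1)[0]
    return residue if count / len(residues) >= threshold else None

def vote_cug_decoding(reference_columns, cug_columns, candidates=("L", "S", "A")):
    """Count query CUG positions that fall on columns conserved for each candidate."""
    votes = Counter({aa: 0 for aa in candidates})
    for idx in cug_columns:
        aa = conserved_residue(reference_columns[idx])
        if aa in votes:
            votes[aa] += 1
    return votes

# Toy reference alignment, one string per column (hypothetical data).
reference_columns = ["LLLLL", "SSSSS", "LLLIL", "AGTSA", "SSSS-"]
# Alignment columns at which the query gene carries a CUG codon (hypothetical).
votes = vote_cug_decoding(reference_columns, cug_columns=[1, 3, 4])
best = max(votes, key=votes.get)
```

In this toy case two of the three CUG positions sit on serine-conserved columns and one on an unconserved column, so the vote favours serine. A real analysis would also weigh amino acid similarity and the predicted tRNA gene, as the abstract describes.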
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany.
|
25
|
Horwege S, Lindner S, Boden M, Hatje K, Kollmar M, Leimeister CA, Morgenstern B. Spaced words and kmacs: fast alignment-free sequence comparison based on inexact word matches. Nucleic Acids Res 2014; 42:W7-11. [PMID: 24829447 PMCID: PMC4086093 DOI: 10.1093/nar/gku398] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 11/13/2022]
Abstract
In this article, we present a user-friendly web interface for two alignment-free sequence-comparison methods that we recently developed. Most alignment-free methods rely on exact word matches to estimate pairwise similarities or distances between the input sequences. By contrast, our new algorithms are based on inexact word matches. The first of these approaches uses the relative frequencies of so-called spaced words in the input sequences, i.e. words containing ‘don't care’ or ‘wildcard’ symbols at certain pre-defined positions. Various distance measures can then be defined on sequences based on their different spaced-word composition. Our second approach defines the distance between two sequences by estimating for each position in the first sequence the length of the longest substring at this position that also occurs in the second sequence with up to k mismatches. Both approaches take a set of deoxyribonucleic acid (DNA) or protein sequences as input and return a matrix of pairwise distance values that can be used as a starting point for clustering algorithms or distance-based phylogeny reconstruction. The two alignment-free programmes are accessible through a web interface at ‘Göttingen Bioinformatics Compute Server (GOBICS)’: http://spaced.gobics.de and http://kmacs.gobics.de, and the source codes can be downloaded.
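The first approach, spaced-word frequencies, can be sketched in a few lines. This is an illustrative reimplementation and not the authors' code: the pattern `1101`, the choice of Euclidean distance, and all function names are assumptions (the abstract only says that various distance measures can be defined).

```python
from collections import Counter
from math import sqrt

def spaced_word_profile(seq, pattern):
    """Relative frequencies of spaced words in `seq`.

    `pattern` is a string of '1' (match position) and '0' (wildcard);
    only characters at match positions contribute to a word.
    """
    keep = [i for i, c in enumerate(pattern) if c == "1"]
    counts = Counter(
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - len(pattern) + 1)
    )
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}

def euclidean_distance(p, q):
    """Euclidean distance between two sparse frequency profiles."""
    words = set(p) | set(q)
    return sqrt(sum((p.get(w, 0.0) - q.get(w, 0.0)) ** 2 for w in words))

def distance_matrix(seqs, pattern="1101"):
    """Pairwise distance matrix, usable as input for clustering or tree building."""
    profiles = [spaced_word_profile(s, pattern) for s in seqs]
    return [[euclidean_distance(a, b) for b in profiles] for a in profiles]

# Toy input (hypothetical sequences).
matrix = distance_matrix(["ACGTACGT", "ACGTACGA", "TTTTTTTT"])
```

With this toy input the first two sequences, which differ in one position, end up closer to each other than either is to the poly-T sequence.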
Affiliation(s)
- Sebastian Horwege
- University of Göttingen, Institute of Microbiology and Genetics, Department of Bioinformatics, Goldschmidtstraße 1, 37073 Göttingen, Germany
- Sebastian Lindner
- University of Göttingen, Institute of Microbiology and Genetics, Department of Bioinformatics, Goldschmidtstraße 1, 37073 Göttingen, Germany
- Marcus Boden
- University of Göttingen, Institute of Microbiology and Genetics, Department of Bioinformatics, Goldschmidtstraße 1, 37073 Göttingen, Germany
- Klas Hatje
- Max-Planck-Institute for Biophysical Chemistry, Department of NMR-based Structural Biology, Group Systems Biology of Motor Proteins, Am Fassberg 11, 37077 Göttingen, Germany
- Martin Kollmar
- Max-Planck-Institute for Biophysical Chemistry, Department of NMR-based Structural Biology, Group Systems Biology of Motor Proteins, Am Fassberg 11, 37077 Göttingen, Germany
- Chris-André Leimeister
- University of Göttingen, Institute of Microbiology and Genetics, Department of Bioinformatics, Goldschmidtstraße 1, 37073 Göttingen, Germany
- Burkhard Morgenstern
- University of Göttingen, Institute of Microbiology and Genetics, Department of Bioinformatics, Goldschmidtstraße 1, 37073 Göttingen, Germany; Université d'Évry Val d'Essonne, Laboratoire Statistique et Génome, UMR CNRS 8071, USC INRA, 23 Boulevard de France, 91037 Évry, France
|
26
|
Mühlhausen S, Kollmar M. Retracted: Molecular Phylogeny of Sequenced Saccharomycetes Reveals Polyphyly of the Alternative Yeast Codon Usage. Genome Biol Evol 2014; 6:evu093. [PMID: 24787622 PMCID: PMC4041000 DOI: 10.1093/gbe/evu093] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
|
27
|
Kollmar M, Hatje K. Shared gene structures and clusters of mutually exclusive spliced exons within the metazoan muscle myosin heavy chain genes. PLoS One 2014; 9:e88111. [PMID: 24498429 PMCID: PMC3912159 DOI: 10.1371/journal.pone.0088111] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 11/02/2013] [Accepted: 01/07/2014] [Indexed: 11/25/2022]
Abstract
Multicellular animals possess two to three different types of muscle tissues. Striated muscles have considerable ultrastructural similarity and contain a core set of proteins including the muscle myosin heavy chain (Mhc) protein. The ATPase activity of this myosin motor protein largely dictates muscle performance at the molecular level. Two different solutions to adjusting myosin properties to different muscle subtypes have been identified so far: Vertebrates and nematodes contain many independent differentially expressed Mhc genes while arthropods have single Mhc genes with clusters of mutually exclusive spliced exons (MXEs). The availability of hundreds of metazoan genomes now allowed us to study whether the ancient bilateria already contained MXEs, how MXE complexity subsequently evolved, and whether additional scenarios to control contractile properties in different muscles could be proposed. By reconstructing the Mhc genes from 116 metazoans we showed that all intron positions within the motor domain coding regions are conserved in all bilateria analysed. The last common ancestor of the bilateria already contained a cluster of MXEs coding for part of the loop-2 actin-binding sequence. Subsequently the protostomes and later the arthropods gained many further clusters while MXEs got completely lost independently in several branches (vertebrates and nematodes) and species (for example the annelid Helobdella robusta and the salmon louse Lepeophtheirus salmonis). Several bilateria have been found to encode multiple Mhc genes that might all or in part contain clusters of MXEs. Notable examples are a cluster of six tandemly arrayed Mhc genes, of which two contain MXEs, in the owl limpet Lottia gigantea and four Mhc genes with three encoding MXEs in the predatory mite Metaseiulus occidentalis.
Our analysis showed that similar solutions to provide different myosin isoforms (multiple genes or clusters of MXEs or both) have independently been developed several times within bilaterian evolution.
Affiliation(s)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
- Klas Hatje
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
|
28
|
Mühlhausen S, Kollmar M. Whole genome duplication events in plant evolution reconstructed and predicted using myosin motor proteins. BMC Evol Biol 2013; 13:202. [PMID: 24053117 PMCID: PMC3850447 DOI: 10.1186/1471-2148-13-202] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 04/14/2013] [Accepted: 09/16/2013] [Indexed: 01/22/2023]
Abstract
Background The evolution of land plants is characterized by whole genome duplications (WGD), which drove species diversification and evolutionary novelties. Detecting these events is especially difficult if they date back to the origin of the plant kingdom. Established methods for reconstructing WGDs include intra- and inter-genome comparisons, KS age distribution analyses, and phylogenetic tree constructions. Results By analysing 67 completely sequenced plant genomes 775 myosins were identified and manually assembled. Phylogenetic trees of the myosin motor domains revealed orthologous and paralogous relationships and were consistent with recent species trees. Based on the myosin inventories and the phylogenetic trees, we have identified duplications of the entire myosin motor protein family at timings consistent with 23 WGDs, that had been reported before. We also predict 6 WGDs based on further protein family duplications. Notably, the myosin data support the two recently reported WGDs in the common ancestor of all extant angiosperms. We predict single WGDs in the Manihot esculenta and Nicotiana benthamiana lineages, two WGDs for Linum usitatissimum and Phoenix dactylifera, and a triplication or two WGDs for Gossypium raimondii. Our data show another myosin duplication in the ancestor of the angiosperms that could be either the result of a single gene duplication or a remnant of a WGD. Conclusions We have shown that the myosin inventories in angiosperms retain evidence of numerous WGDs that happened throughout plant evolution. In contrast to other protein families, many myosins are still present in extant species. They are closely related and have similar domain architectures, and their phylogenetic grouping follows the genome duplications. Because of its broad taxonomic sampling the dataset provides the basis for reliable future identification of further whole genome duplications.
Affiliation(s)
- Stefanie Mühlhausen
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany.
|
29
|
Mazur A, Hammesfahr B, Griesinger C, Lee D, Kollmar M. ShereKhan--calculating exchange parameters in relaxation dispersion data from CPMG experiments. Bioinformatics 2013; 29:1819-20. [PMID: 23698862 DOI: 10.1093/bioinformatics/btt286] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]
Abstract
SUMMARY Dynamics governing the function of biomolecules are usually described as exchange processes and can be monitored at atomic resolution with nuclear magnetic resonance (NMR) relaxation dispersion data. Here, we present a new tool for the analysis of CPMG relaxation dispersion profiles (ShereKhan). The web interface to ShereKhan provides a user-friendly environment for the analysis. AVAILABILITY A stable version of ShereKhan, the web application and documentation are available at http://sherekhan.bionmr.org. CONTACT dole@nmr.mpibpc.mpg.de or mako@nmr.mpibpc.mpg.de.
Affiliation(s)
- Adam Mazur
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany
|
30
|
Abstract
Accurate exon–intron structures are essential prerequisites in genomics, proteomics and for many protein family and single gene studies. We originally developed Scipio and the corresponding web service WebScipio for the reconstruction of gene structures based on protein sequences and available genome assemblies. WebScipio also allows predicting mutually exclusive spliced exons and tandemly arrayed gene duplicates. The obtained gene structures are illustrated in graphical schemes and can be analysed down to the nucleotide level. The set of eukaryotic genomes available at the WebScipio server is updated on a daily basis. The current version of the web server provides access to ∼3400 genome assembly files of >1100 sequenced eukaryotic species. Here, we have also extended the functionality by adding a module with which expressed sequence tag (EST) and cDNA data can be mapped to the reconstructed gene structure for the identification of all types of alternative splice variants. WebScipio has a user-friendly web interface, and we believe that the improved web server will provide better service to biologists interested in the gene structure corresponding to their protein of interest, including all types of alternative splice forms and tandem gene duplicates. WebScipio is freely available at http://www.webscipio.org.
Affiliation(s)
- Klas Hatje
- Group Systems Biology of Motor Proteins, Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen 37077, Germany
|
31
|
Abstract
MOTIVATION When analyzing solid-state nuclear magnetic resonance (NMR) spectra of proteins, assignment of resonances to nuclei and derivation of restraints for 3D structure calculations are challenging and time-consuming processes. Simulated spectra that have been calculated based on, for example, chemical shift predictions and structural models can be of considerable help. Existing solutions are typically limited in the type of experiment they can consider and difficult to adapt to different settings. RESULTS Here, we present Peakr, a software to simulate solid-state NMR spectra of proteins. It can generate simulated spectra based on numerous common types of internuclear correlations relevant for assignment and structure elucidation, can compare simulated and experimental spectra and produces lists and visualizations useful for analyzing measured spectra. Compared with other solutions, it is fast, versatile and user friendly. AVAILABILITY AND IMPLEMENTATION Peakr is maintained under the GPL license and can be accessed at http://www.peakr.org. The source code can be obtained on request from the authors.
Affiliation(s)
- Robert Schneider
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany.
|
32
|
Hammesfahr B, Odronitz F, Mühlhausen S, Waack S, Kollmar M. GenePainter: a fast tool for aligning gene structures of eukaryotic protein families, visualizing the alignments and mapping gene structures onto protein structures. BMC Bioinformatics 2013; 14:77. [PMID: 23496949 PMCID: PMC3605371 DOI: 10.1186/1471-2105-14-77] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 09/10/2012] [Accepted: 02/24/2013] [Indexed: 11/10/2022]
Abstract
Background All sequenced eukaryotic genomes have been shown to possess at least a few introns. This includes those unicellular organisms, which were previously suspected to be intron-less. Therefore, gene splicing must have been present at least in the last common ancestor of the eukaryotes. To explain the evolution of introns, basically two mutually exclusive concepts have been developed. The introns-early hypothesis says that already the very first protein-coding genes contained introns while the introns-late concept asserts that eukaryotic genes gained introns only after the emergence of the eukaryotic lineage. A very important aspect in this respect is the conservation of intron positions within homologous genes of different taxa. Results GenePainter is a standalone application for mapping gene structure information onto protein multiple sequence alignments. Based on the multiple sequence alignments the gene structures are aligned down to single nucleotides. GenePainter accounts for variable lengths in exons and introns, respects split codons at intron junctions and is able to handle sequencing and assembly errors, which are possible reasons for frame-shifts in exons and gaps in genome assemblies. Thus, even gene structures of considerably divergent proteins can properly be compared, as it is needed in phylogenetic analyses. Conserved intron positions can also be mapped to user-provided protein structures. For their visualization GenePainter provides scripts for the molecular graphics system PyMol. Conclusions GenePainter is a tool to analyse gene structure conservation providing various visualization options. A stable version of GenePainter for all operating systems as well as documentation and example data are available at http://www.motorprotein.de/genepainter.html.
Affiliation(s)
- Björn Hammesfahr
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, Göttingen, 37077, Germany
|
33
|
Kollmar M. Setting the Stage for an Interactive Map of Cytoskeletal Networks and Intracellular Transport Pathways. Biophys J 2013. [DOI: 10.1016/j.bpj.2012.11.3587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
34
|
Kollmar M, Lbik D, Enge S. Evolution of the eukaryotic ARP2/3 activators of the WASP family: WASP, WAVE, WASH, and WHAMM, and the proposed new family members WAWH and WAML. BMC Res Notes 2012; 5:88. [PMID: 22316129 PMCID: PMC3298513 DOI: 10.1186/1756-0500-5-88] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 11/03/2011] [Accepted: 02/08/2012] [Indexed: 12/14/2022]
Abstract
Background WASP family proteins stimulate the actin-nucleating activity of the ARP2/3 complex. They include members of the well-known WASP and WAVE/Scar proteins, and the recently identified WASH and WHAMM proteins. WASP family proteins contain family specific N-terminal domains followed by proline-rich regions and C-terminal VCA domains that harbour the ARP2/3-activating regions. Results To reveal the evolution of ARP2/3 activation by WASP family proteins we performed a "holistic" analysis by manually assembling and annotating all homologs in most of the eukaryotic genomes available. We have identified two new families: the WAML proteins (WASP and MIM like), which combine the membrane-deforming and actin bundling functions of the IMD domains with the ARP2/3-activating VCA regions, and the WAWH proteins (WASP without WH1 domain), which have been identified in amoebae, Apusozoa, and the anole lizard. Surprisingly, with one exception we did not identify any alternative splice forms for WASP family proteins, which is in strong contrast to other actin-binding proteins like Ena/VASP, MIM, or NHS proteins that share domains with WASP proteins. Conclusions Our analysis showed that the last common ancestor of the eukaryotes must have contained a homolog of WASP, WAVE, and WASH. Specific families have subsequently been lost in many taxa like the WASPs in plants, algae, Stramenopiles, and Euglenozoa, and the WASH proteins in fungi. The WHAMM proteins are metazoa specific and have most probably been invented by the Eumetazoa. The diversity of WASP family proteins has strongly been increased by many species- and taxon-specific gene duplications and multimerisations. All data are freely accessible via http://www.cymobase.org.
Affiliation(s)
- Martin Kollmar
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany.
|
35
|
Hatje K, Kollmar M. A phylogenetic analysis of the brassicales clade based on an alignment-free sequence comparison method. Front Plant Sci 2012; 3:192. [PMID: 22952468 PMCID: PMC3429886 DOI: 10.3389/fpls.2012.00192] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 05/25/2012] [Accepted: 08/06/2012] [Indexed: 05/06/2023]
Abstract
Phylogenetic analyses reveal the evolutionary derivation of species. A phylogenetic tree can be inferred from multiple sequence alignments of proteins or genes. The alignment of whole genome sequences of higher eukaryotes is a computationally intensive and ambitious task, as is the computation of phylogenetic trees based on these alignments. To overcome these limitations, we here used an alignment-free method to compare genomes of the Brassicales clade. For each nucleotide sequence a Chaos Game Representation (CGR) can be computed, which represents each nucleotide of the sequence as a point in a square defined by the four nucleotides as vertices. Each CGR is therefore a unique fingerprint of the underlying sequence. If the CGRs are divided by grid lines each grid square denotes the occurrence of oligonucleotides of a specific length in the sequence (Frequency Chaos Game Representation, FCGR). Here, we used distance measures between FCGRs to infer phylogenetic trees of Brassicales species. Three types of data were analyzed because of their different characteristics: (A) Whole genome assemblies as far as available for species belonging to the Malvidae taxon. (B) EST data of species of the Brassicales clade. (C) Mitochondrial genomes of the Rosids branch, a supergroup of the Malvidae. The trees reconstructed based on the Euclidean distance method are in general agreement with single gene trees. The Fitch-Margoliash and Neighbor joining algorithms resulted in similar to identical trees. Here, for the first time we have applied the bootstrap re-sampling concept to trees based on FCGRs to determine the support of the branchings. FCGRs have the advantage that they are fast to calculate, and can be used as additional information to alignment based data and morphological characteristics to improve the phylogenetic classification of species in ambiguous cases.
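The CGR and FCGR constructions described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the corner assignment, the oligonucleotide length `k=3`, and the function names are assumptions. Because each FCGR grid square corresponds to one oligonucleotide, the sketch stores the frequencies as a flat vector; the Euclidean distance does not depend on the 2D arrangement.

```python
from itertools import product
from math import sqrt

# Assumed corner convention for the unit square.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(seq):
    """Chaos Game Representation: move halfway towards each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points

def fcgr(seq, k=3):
    """Relative frequencies of all 4**k oligonucleotides (one per FCGR grid square)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]

def euclidean_distance(u, v):
    """Euclidean distance between two FCGR vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Toy sequences (hypothetical data).
d_near = euclidean_distance(fcgr("ACGTACGTAC"), fcgr("ACGTACGTAG"))
d_far = euclidean_distance(fcgr("ACGTACGTAC"), fcgr("TTTTTTTTTT"))
```

A pairwise matrix of such distances could then be passed to a distance-based tree method such as Neighbor joining, which is the kind of input the abstract describes.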
Affiliation(s)
- Klas Hatje
- Abteilung NMR-Basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Göttingen, Germany
- Martin Kollmar
- Abteilung NMR-Basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Göttingen, Germany
- *Correspondence: Martin Kollmar, Abteilung NMR-Basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany. e-mail:
|
36
|
Hammesfahr B, Odronitz F, Hellkamp M, Kollmar M. diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data. BMC Res Notes 2011; 4:338. [PMID: 21906294 PMCID: PMC3180467 DOI: 10.1186/1756-0500-4-338] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/10/2011] [Accepted: 09/09/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Nowadays, the sequencing of even the largest mammalian genomes has become a matter of days with current next-generation sequencing methods. It comes as no surprise that dozens of genome assemblies are now released per month. Since the number of next-generation sequencing machines increases worldwide and new major sequencing plans are announced, a further increase in the speed of releasing genome assemblies is expected. Thus it becomes increasingly important to get an overview as well as detailed information about available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known to evaluate the various genome assemblies before performing subsequent analyses. RESULTS diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. Many meta-data like sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics are provided. To make the data intuitively accessible, the web interface makes extensive use of modern web techniques. A number of search modules and result views facilitate finding and judging the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data. CONCLUSIONS diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It is different in that only those projects are stored for which genome assembly data or considerable amounts of cDNA data are available. Projects in planning stage or in the process of being sequenced are not included. The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest. 
diArk 2.0 is available at http://www.diark.org.
Affiliation(s)
- Björn Hammesfahr
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
- Florian Odronitz
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
- Marcel Hellkamp
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
- Martin Kollmar
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
|
37
|
Hatje K, Keller O, Hammesfahr B, Pillmann H, Waack S, Kollmar M. Cross-species protein sequence and gene structure prediction with fine-tuned Webscipio 2.0 and Scipio. BMC Res Notes 2011; 4:265. [PMID: 21798037 PMCID: PMC3162530 DOI: 10.1186/1756-0500-4-265] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 06/03/2011] [Accepted: 07/28/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Obtaining transcripts of homologs of closely related organisms and retrieving the reconstructed exon-intron patterns of the genes is a very important process during the analysis of the evolution of a protein family and the comparative analysis of the exon-intron structure of a certain gene from different species. Due to the ever-increasing speed of genome sequencing, the gap to genome annotation is growing. Thus, tools for the correct prediction and reconstruction of genes in related organisms become more and more important. The tool Scipio, which can also be used via the graphical interface WebScipio, performs significant hit processing of the output of the Blat program to account for sequencing errors, missing sequence, and fragmented genome assemblies. However, Scipio has so far been limited to high sequence similarity and unable to reconstruct short exons. RESULTS Scipio and WebScipio have fundamentally been extended to better reconstruct very short exons and intron splice sites and to be better suited for cross-species gene structure predictions. The Needleman-Wunsch algorithm has been implemented for the search for short parts of the query sequence that were not recognized by Blat. Those regions might either be short exons, divergent sequence at intron splice sites, or very divergent exons. We have shown the benefit and use of new parameters with several protein examples from completely different protein families in searches against species from several kingdoms of the eukaryotes. The performance of the new Scipio version has been tested in comparison with several similar tools. CONCLUSIONS With the new version of Scipio very short exons, terminal and internal, of even just one amino acid can correctly be reconstructed. Scipio is also able to correctly predict almost all genes in cross-species searches even if the ancestors of the species separated more than 100 Myr ago and if the protein sequence identity is below 80%. 
For our test cases Scipio outperforms all other software tested. WebScipio has been restructured and provides easy access to the genome assemblies of about 640 eukaryotic species. Scipio and WebScipio are freely accessible at http://www.webscipio.org.
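The abstract states that the Needleman-Wunsch algorithm is used to place short query parts that Blat did not recognise. A minimal Python sketch of the textbook form of that algorithm is given here for orientation. The function name `needleman_wunsch`, the linear gap penalty, and the scoring values (match +1, mismatch -1, gap -1) are assumptions for illustration; the scoring scheme and search strategy actually used in Scipio are not given in this abstract.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b); gaps are written as '-'.
    The scoring values are illustrative defaults, not Scipio's.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, `needleman_wunsch("ACGT", "AGT")` aligns the shorter string with a single gap opposite the unmatched `C`.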
Affiliation(s)
- Klas Hatje
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany.
|
38
|
Pillmann H, Hatje K, Odronitz F, Hammesfahr B, Kollmar M. Predicting mutually exclusive spliced exons based on exon length, splice site and reading frame conservation, and exon sequence homology. BMC Bioinformatics 2011; 12:270. [PMID: 21718515 PMCID: PMC3228551 DOI: 10.1186/1471-2105-12-270] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 03/23/2011] [Accepted: 06/30/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Alternative splicing of pre-mature RNA is an important process eukaryotes utilize to increase their repertoire of different protein products. Several types of different alternative splice forms exist including exon skipping, differential splicing of exons at their 3'- or 5'-end, intron retention, and mutually exclusive splicing. The latter term is used for clusters of internal exons that are spliced in a mutually exclusive manner. RESULTS We have implemented an extension to the WebScipio software to search for mutually exclusive exons. Here, the search is based on the precondition that mutually exclusive exons encode regions of the same structural part of the protein product. This precondition provides restrictions to the search for candidate exons concerning their length, splice site conservation and reading frame preservation, and overall homology. Mutually exclusive exons that are not homologous and not of about the same length will not be found. Using the new algorithm, mutually exclusive exons in several example genes, a dynein heavy chain, a muscle myosin heavy chain, and Dscam were correctly identified. In addition, the algorithm was applied to the whole Drosophila melanogaster X chromosome and the results were compared to the Flybase annotation and an ab initio prediction. Clusters of mutually exclusive exons might be subsequent to each other and might encode dozens of exons. CONCLUSIONS This is the first implementation of an automatic search for mutually exclusive exons in eukaryotes. Exons are predicted and reconstructed in the same run providing the complete gene structure for the protein query of interest. WebScipio offers high quality gene structure figures with the clusters of mutually exclusive exons colour-coded, and several analysis tools for further manual inspection. 
The genome scale analysis of all genes of the Drosophila melanogaster X chromosome showed that WebScipio is able to find all but two of the 28 annotated mutually exclusive spliced exons and predicts 39 new candidate exons. Thus, WebScipio should be able to identify mutually exclusive spliced exons in any query sequence from any species with a very high probability. WebScipio is freely available to academics at http://www.webscipio.org.
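The four restrictions named in this abstract (similar length, conserved splice sites, preserved reading frame, sequence homology) can be read as a candidate filter. The Python sketch below only illustrates that reading and is not the WebScipio algorithm: the `Exon` fields, the thresholds `max_len_diff` and `min_identity`, the restriction to canonical AG/GT splice sites, and the ungapped identity measure are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Exon:
    dna: str       # exon nucleotide sequence, 5'->3'
    protein: str   # peptide encoded by the exon
    acceptor: str  # last two intron bases upstream of the exon
    donor: str     # first two intron bases downstream of the exon


def identity(p, q):
    """Fraction of identical residues in an ungapped, left-anchored comparison."""
    longest = max(len(p), len(q))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(p, q)) / longest


def is_mxe_candidate(query, cand, max_len_diff=15, min_identity=0.15):
    """Check a candidate exon against the four restrictions (thresholds are hypothetical)."""
    # Swapping the exons must not shift the downstream reading frame.
    same_frame = len(query.dna) % 3 == len(cand.dna) % 3
    # Mutually exclusive exons are expected to be of about the same length.
    similar_length = abs(len(query.dna) - len(cand.dna)) <= max_len_diff
    # Assumed here: only canonical AG acceptor and GT donor sites are accepted.
    canonical_sites = cand.acceptor == "AG" and cand.donor == "GT"
    # Both exons should encode the same structural part of the protein.
    homologous = identity(query.protein, cand.protein) >= min_identity
    return same_frame and similar_length and canonical_sites and homologous
```

As the abstract notes, a filter of this kind cannot find mutually exclusive exons that are not homologous or that differ strongly in length.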
Affiliation(s)
- Holger Pillmann
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
|
39
|
Abstract
MOTIVATION As improved DNA sequencing techniques have increased enormously the speed of producing new eukaryotic genome assemblies, the further development of automated gene prediction methods continues to be essential. While the classification of proteins into families is a task heavily relying on correct gene predictions, it can at the same time provide a source of additional information for the prediction, complementary to those presently used. RESULTS We extended the gene prediction software AUGUSTUS by a method that employs block profiles generated from multiple sequence alignments as a protein signature to improve the accuracy of the prediction. Equipped with profiles modelling human dynein heavy chain (DHC) proteins and other families, AUGUSTUS was run on the genomic sequences known to contain members of these families. Compared with AUGUSTUS' ab initio version, the rate of genes predicted with high accuracy showed a dramatic increase. AVAILABILITY The AUGUSTUS project web page is located at http://augustus.gobics.de, with the executable program as well as the source code available for download.
Affiliation(s)
- Oliver Keller
- Institute of Computer Science, University of Göttingen, Goldschmidtstrasse 7, Greifswald, Germany.
|
40
|
Hammesfahr B, Odronitz F, Kollmar M. Cymobase - the Reference Database for Cytoskeletal and Motor Proteins. Biophys J 2010. [DOI: 10.1016/j.bpj.2009.12.3036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
41
|
Kollmar M. News from the Myosin Tree: 1000 New Sequences, 100 New Species, 1 New Class. Biophys J 2010. [DOI: 10.1016/j.bpj.2009.12.1236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
42
|
Odronitz F, Becker S, Kollmar M. Reconstructing the phylogeny of 21 completely sequenced arthropod species based on their motor proteins. BMC Genomics 2009; 10:173. [PMID: 19383156 PMCID: PMC2674883 DOI: 10.1186/1471-2164-10-173] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 03/06/2008] [Accepted: 04/21/2009] [Indexed: 01/11/2023]
Abstract
BACKGROUND Motor proteins have extensively been studied in the past and consist of large superfamilies. They are involved in diverse processes like cell division, cellular transport, neuronal transport processes, or muscle contraction, to name a few. Vertebrates contain up to 60 myosins and about the same number of kinesins that are spread over more than a dozen distinct classes. RESULTS Here, we present the comparative genomic analysis of the motor protein repertoire of 21 completely sequenced arthropod species using the owl limpet Lottia gigantea as outgroup. Arthropods contain up to 17 myosins grouped into 13 classes. The myosins are in almost all cases clear paralogs, and thus the evolution of the arthropod myosin inventory is mainly determined by gene losses. Arthropod species contain up to 29 kinesins spread over 13 classes. In contrast to the myosins, the evolution of the arthropod kinesin inventory is not only determined by gene losses but also by many subtaxon-specific and species-specific gene duplications. All arthropods contain each of the subunits of the cytoplasmic dynein/dynactin complex. Except for the dynein light chains and the p150 dynactin subunit they contain single gene copies of the other subunits. Especially the roadblock light chain repertoire is very species-specific. CONCLUSION All 21 completely sequenced arthropods, including the twelve sequenced Drosophila species, contain a species-specific set of motor proteins. The phylogenetic analysis of all genes as well as the protein repertoire placed Daphnia pulex closest to the root of the Arthropoda. The louse Pediculus humanus corporis is the closest relative to Daphnia followed by the group of the honeybee Apis mellifera and the jewel wasp Nasonia vitripennis. After this group the rust-red flour beetle Tribolium castaneum and the silkworm Bombyx mori diverged very closely from the lineage leading to the Drosophila species.
Affiliation(s)
- Florian Odronitz
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Goettingen, Germany
- Sebastian Becker
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Goettingen, Germany
- Martin Kollmar
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Goettingen, Germany
|
43
|
Odronitz F, Pillmann H, Keller O, Waack S, Kollmar M. WebScipio: an online tool for the determination of gene structures using protein sequences. BMC Genomics 2008; 9:422. [PMID: 18801164 PMCID: PMC2644328 DOI: 10.1186/1471-2164-9-422] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Received: 04/01/2008] [Accepted: 09/18/2008] [Indexed: 11/13/2022]
Abstract
Background Obtaining the gene structure for a given protein encoding gene is an important step in many analyses. Software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results WebScipio, a web interface to the Scipio software, allows a user to obtain the coding sequence structure corresponding to a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human readable formats like a schematic representation, and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at .
Affiliation(s)
- Florian Odronitz
- Max-Planck-Institut für Biophysikalische Chemie, Abteilung NMR-basierte Strukturbiologie, Am Fassberg 11, 37077 Göttingen, Germany.
|
44
|
Keller O, Odronitz F, Stanke M, Kollmar M, Waack S. Scipio: using protein sequences to determine the precise exon/intron structures of genes and their orthologs in closely related species. BMC Bioinformatics 2008; 9:278. [PMID: 18554390 PMCID: PMC2442105 DOI: 10.1186/1471-2105-9-278] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Received: 02/08/2008] [Accepted: 06/13/2008] [Indexed: 11/10/2022]
Abstract
Background For many types of analyses, data about gene structure and locations of non-coding regions of genes are required. Although a vast amount of genomic sequence data is available, precise annotation of genes is lagging behind. Finding the corresponding gene of a given protein sequence by means of conventional tools is error-prone, and cannot be completed without manual inspection, which is time-consuming and requires considerable experience. Results Scipio is a tool based on the alignment program BLAT to determine the precise gene structure given a protein sequence and a genome sequence. It identifies intron-exon borders and splice sites and is able to cope with sequencing errors and genes spanning several contigs in genomes that have not yet been assembled to supercontigs or chromosomes. Instead of producing a set of hits with varying confidence, Scipio gives the user a coherent summary of locations on the genome that code for the query protein. The output contains information about discrepancies that may result from sequencing errors. Scipio has also successfully been used to find homologous genes in closely related species. Scipio was tested with 979 protein queries against 16 arthropod genomes (intra species search). For cross-species annotation, Scipio was used to annotate 40 genes from Homo sapiens in the primates Pongo pygmaeus abelii and Callithrix jacchus. The prediction quality of Scipio was tested in a comparative study against that of BLAT and the well established program Exonerate. Conclusion Scipio is able to precisely map a protein query onto a genome. Even in cases when there are many sequencing errors, or when incomplete genome assemblies lead to hits that stretch across multiple target sequences, it very often provides the user with the correct determination of intron-exon borders and splice sites, showing an improved prediction accuracy compared to BLAT and Exonerate. 
Apart from being able to find genes in the genome that encode the query protein, Scipio can also be used to annotate genes in closely related species.
Affiliation(s)
- Oliver Keller
- Universität Göttingen, Institut für Informatik, Lotzestr. 16-18, 37083 Göttingen, Germany.
|
45
|
Odronitz F, Kollmar M. Drawing the tree of eukaryotic life based on the analysis of 2,269 manually annotated myosins from 328 species. Genome Biol 2008; 8:R196. [PMID: 17877792 PMCID: PMC2375034 DOI: 10.1186/gb-2007-8-9-r196] [Citation(s) in RCA: 273] [Impact Index Per Article: 17.1] [Received: 03/06/2007] [Revised: 09/17/2007] [Accepted: 09/18/2007] [Indexed: 01/03/2023]
Abstract
The tree of eukaryotic life was reconstructed based on the analysis of 2,269 myosin motor domains from 328 organisms, confirming some accepted relationships of major taxa and resolving disputed and preliminary classifications. Background The evolutionary history of organisms is expressed in phylogenetic trees. The most widely used phylogenetic trees describing the evolution of all organisms have been constructed based on single-gene phylogenies that, however, often produce conflicting results. Incongruence between phylogenetic trees can result from the violation of the orthology assumption and stochastic and systematic errors. Results Here, we have reconstructed the tree of eukaryotic life based on the analysis of 2,269 myosin motor domains from 328 organisms. All sequences were manually annotated and verified, and were grouped into 35 myosin classes, of which 16 have not been proposed previously. The resultant phylogenetic tree confirms some accepted relationships of major taxa and resolves disputed and preliminary classifications. We place the Viridiplantae after the separation of Euglenozoa, Alveolata, and Stramenopiles, we suggest a monophyletic origin of Entamoebidae, Acanthamoebidae, and Dictyosteliida, and provide evidence for the asynchronous evolution of the Mammalia and Fungi. Conclusion Our analysis of the myosins allowed combining phylogenetic information derived from class-specific trees with the information of myosin class evolution and distribution. This approach is expected to result in superior accuracy compared to single-gene or phylogenomic analyses because the orthology problem is resolved and a strong determinant not depending on any technical uncertainties is incorporated, the class distribution. Combining our analysis of the myosins with high quality analyses of other protein families, for example, that of the kinesins, could help in resolving still questionable dependencies at the origin of eukaryotic life.
Affiliation(s)
- Florian Odronitz
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg, 37077 Goettingen, Germany
- Martin Kollmar
- Department of NMR-based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Am Fassberg, 37077 Goettingen, Germany
|
46
|
Odronitz F, Kollmar M. Comparative genomic analysis of the arthropod muscle myosin heavy chain genes allows ancestral gene reconstruction and reveals a new type of 'partially' processed pseudogene. BMC Mol Biol 2008; 9:21. [PMID: 18254963 PMCID: PMC2257972 DOI: 10.1186/1471-2199-9-21] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 10/24/2007] [Accepted: 02/06/2008] [Indexed: 01/25/2023]
Abstract
BACKGROUND Alternative splicing of mutually exclusive exons is an important mechanism for increasing protein diversity in eukaryotes. The insect Mhc (myosin heavy chain) gene produces all different muscle myosins as a result of alternative splicing in contrast to most other organisms of the Metazoa lineage, that have a family of muscle genes with each gene coding for a protein specialized for a functional niche. RESULTS The muscle myosin heavy chain genes of 22 species of the Arthropoda ranging from the waterflea to wasp and Drosophila have been annotated. The analysis of the gene structures allowed the reconstruction of an ancient muscle myosin heavy chain gene and showed that during evolution of the arthropods introns have mainly been lost in these genes although intron gain might have happened in a few cases. Surprisingly, the genome of Aedes aegypti contains another and that of Culex pipiens quinquefasciatus two further muscle myosin heavy chain genes, called Mhc3 and Mhc4, that contain only one variant of the corresponding alternative exons of the Mhc1 gene. Mhc3 transcription in Aedes aegypti is documented by EST data. Mhc3 and Mhc4 inserted in the Aedes and Culex genomes either by gene duplication followed by the loss of all but one variant of the alternative exons, or by incorporation of a transcript of which all other variants have been spliced out retaining the exon-intron structure. The second and more likely possibility represents a new type of a 'partially' processed pseudogene. CONCLUSION Based on the comparative genomic analysis of the alternatively spliced arthropod muscle myosin heavy chain genes we propose that the splicing process operates sequentially on the transcript. The process consists of the splicing of the mutually exclusive exons until one exon out of the cluster remains while retaining surrounding intronic sequence. In a second step splicing of introns takes place. 
A related mechanism could be responsible for the splicing of other genes containing mutually exclusive exons.
Affiliation(s)
- Florian Odronitz
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
- Martin Kollmar
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany
|
47
|
Abstract
BACKGROUND Dictyostelium discoideum is one of the most famous model organisms for studying motile processes like cell movement, organelle transport, cytokinesis, and endocytosis. Members of the myosin superfamily, which move on actin filaments and power many of these tasks, are tripartite proteins consisting of a conserved catalytic domain followed by the neck region consisting of a different number of so-called IQ motifs for binding of light chains. The tails contain functional motifs that are responsible for the accomplishment of the different tasks in the cell. Unicellular organisms like yeasts contain three to five myosins while vertebrates express over 40 different myosin genes. Recently, the question has been raised how many myosins a simple multicellular organism like Dictyostelium would need to accomplish all the different motility-related tasks. RESULTS The analysis of the Dictyostelium genome revealed thirteen myosins of which three have not been described before. The phylogenetic analysis of the motor domains of the new myosins placed Myo1F in the class-I myosins and Myo5A in the class-V myosins. The third new myosin, an orphan myosin, has been named MyoG. It contains an N-terminal extension of over 400 residues, and a tail consisting of four IQ motifs and two MyTH4/FERM (myosin tail homology 4/band 4.1, ezrin, radixin, and moesin) tandem domains that are separated by a long region containing an SH3 (src homology 3) domain. In contrast to previous analyses, an extensive comparison with 126 class-VII, class-X, class-XV, and class-XXII myosins now showed that MyoI does not group into any of these classes and should not be used as a model for class-VII myosins. The search for calmodulin related proteins revealed two further potential myosin light chains. One is a close homolog of the two EF-hand motifs containing MlcB, and the other, CBP14, phylogenetically groups to the ELC/RLC/calmodulin (essential light chain/regulatory light chain) branch of the tree. 
CONCLUSION Dictyostelium contains thirteen myosins together with 6-8 MLCs (myosin light chain) to assist in a variety of actin-based processes in the cell. Although they are homologous to myosins of higher eukaryotes, the myosins of Dictyostelium should be considered with care as models for specific functions of vertebrate myosins.
Affiliation(s)
- Martin Kollmar
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Goettingen, Germany.
|
48
|
Kollmar M. Use of the myosin motor domain as large-affinity tag for the expression and purification of proteins in Dictyostelium discoideum. Int J Biol Macromol 2006; 39:37-44. [PMID: 16516959 DOI: 10.1016/j.ijbiomac.2006.01.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 11/30/2005] [Revised: 01/17/2006] [Accepted: 01/18/2006] [Indexed: 11/25/2022]
Abstract
The cellular slime mold Dictyostelium discoideum is increasingly being used for the overexpression of proteins. Dictyostelium is amenable to classical and molecular genetic approaches and can easily be grown in large quantities. It contains a variety of chaperones and folding enzymes, and is able to perform all kinds of post-translational protein modifications. Here, new expression vectors are presented that have been designed for the production of proteins in large quantities for biochemical and structural studies. The expression cassettes of the most successful vectors are based on a tandem affinity purification tag consisting of an octahistidine tag followed by the myosin motor domain tag. The myosin motor domain not only strongly enhances the production of fused proteins but is also used for a fast affinity purification step through its ATP-dependent binding to actin. The applicability of the new system has been demonstrated for the expression and purification of subunits of the dynein-dynactin motor protein complex from different species.
Affiliation(s)
- Martin Kollmar
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Fassberg 11, D-37077 Göttingen, Germany.
|
49
|
Abstract
Background Kinesins constitute a large superfamily of motor proteins in eukaryotic cells. They perform diverse tasks such as vesicle and organelle transport and chromosomal segregation in a microtubule- and ATP-dependent manner. In recent years, the genomes of a number of eukaryotic organisms have been completely sequenced. Subsequent studies revealed and classified the full set of members of the kinesin superfamily expressed by these organisms. For Dictyostelium discoideum, only five kinesin superfamily proteins (Kif's) have already been reported. Results Here, we report the identification of thirteen kinesin genes exploiting the information from the raw shotgun reads of the Dictyostelium discoideum genome project. A phylogenetic tree of 390 kinesin motor domain sequences was built, grouping the Dictyostelium kinesins into nine subfamilies. According to known cellular functions or strong homologies to kinesins of other organisms, four of the Dictyostelium kinesins are involved in organelle transport, six are implicated in cell division processes, two are predicted to perform multiple functions, and one kinesin may be the founder of a new subclass. Conclusion This analysis of the Dictyostelium genome led to the identification of eight new kinesin motor proteins. According to an exhaustive phylogenetic comparison, Dictyostelium contains the same subset of kinesins that higher eukaryotes need to perform mitosis. Some of the kinesins are implicated in intracellular traffic and a small number have unpredictable functions.
Affiliation(s)
- Martin Kollmar
- Abteilung NMR basierte Strukturbiologie, Max-Planck-Institut für Biophysikalische Chemie, Am Faßberg 11, D-37077 Göttingen, Germany
- Gernot Glöckner
- Abteilung Genom-Analyse, Institut für Molekulare Biotechnologie, Beutenbergstr. 11, D-07745 Jena, Germany
|
50
|
Kollmar M, Helmchen G. An (η1-Allyl)palladium Complex of a Chiral Bidentate Ligand: Crystallographic and NMR Studies on a (η1-3,3-Diphenylallyl)(phosphinooxazoline)palladium Complex. Organometallics 2002. [DOI: 10.1021/om020323z] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Affiliation(s)
- Martin Kollmar
- Organisch-chemisches Institut, Universität Heidelberg, Im Neuenheimer Feld 270, D-69120 Heidelberg, Germany
- Günter Helmchen
- Organisch-chemisches Institut, Universität Heidelberg, Im Neuenheimer Feld 270, D-69120 Heidelberg, Germany
|