1
Dheilly NM, Adema C, Raftos DA, Gourbal B, Grunau C, Du Pasquier L. No more non-model species: the promise of next generation sequencing for comparative immunology. Developmental and Comparative Immunology 2014; 45:56-66. [PMID: 24508980 PMCID: PMC4096995 DOI: 10.1016/j.dci.2014.01.022] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 12/04/2013] [Revised: 01/20/2014] [Accepted: 01/21/2014] [Indexed: 05/21/2023]
Abstract
Next generation sequencing (NGS) allows for the rapid, comprehensive and cost effective analysis of entire genomes and transcriptomes. NGS provides approaches for immune response gene discovery, profiling gene expression over the course of parasitosis, studying mechanisms of diversification of immune receptors and investigating the role of epigenetic mechanisms in regulating immune gene expression and/or diversification. NGS will allow meaningful comparisons to be made between organisms from different taxa in an effort to understand the selection of diverse strategies for host defence under different environmental pathogen pressures. At the same time, it will reveal the shared and unique components of the immunological toolkit and basic functional aspects that are essential for immune defence throughout the living world. In this review, we argue that NGS will revolutionize our understanding of immune responses throughout the animal kingdom because the depth of information it provides will circumvent the need to concentrate on a few "model" species.
Affiliation(s)
- Nolwenn M Dheilly
- CNRS, UMR 5244, Ecologie et Evolution des Interactions (2EI), Perpignan F-66860, France; Université de Perpignan Via Domitia, Perpignan F-66860, France.
- Coen Adema
- Center for Evolutionary and Theoretical Immunology, Biology Department, University of New Mexico, Albuquerque, NM 87131, USA
- David A Raftos
- Department of Biological Sciences, Macquarie University, North Ryde, NSW 2109, Australia
- Benjamin Gourbal
- CNRS, UMR 5244, Ecologie et Evolution des Interactions (2EI), Perpignan F-66860, France; Université de Perpignan Via Domitia, Perpignan F-66860, France
- Christoph Grunau
- CNRS, UMR 5244, Ecologie et Evolution des Interactions (2EI), Perpignan F-66860, France; Université de Perpignan Via Domitia, Perpignan F-66860, France
- Louis Du Pasquier
- University of Basel, Institute of Zoology and Evolutionary Biology, Basel, Switzerland
2
Kumar S, Jones M, Koutsovoulos G, Clarke M, Blaxter M. Blobology: exploring raw genome data for contaminants, symbionts and parasites using taxon-annotated GC-coverage plots. Front Genet 2013; 4:237. [PMID: 24348509 PMCID: PMC3843372 DOI: 10.3389/fgene.2013.00237] [Citation(s) in RCA: 193] [Impact Index Per Article: 17.5] [Received: 10/01/2013] [Accepted: 10/23/2013] [Indexed: 12/16/2022] Open
Abstract
Generating the raw data for a de novo genome assembly project for a target eukaryotic species is relatively easy. This democratization of access to large-scale data has allowed many research teams to plan to assemble the genomes of non-model organisms. These new genome targets are very different from the traditional, inbred, laboratory-reared model organisms. They are often small, and cannot be isolated free of their environment – whether ingested food, the surrounding host organism of parasites, or commensal and symbiotic organisms attached to or within the individuals sampled. Preparation of pure DNA originating from a single species can be technically impossible, but assembly of mixed-organism DNA can be difficult, as most genome assemblers perform poorly when faced with multiple genomes in different stoichiometries. This class of problem is common in metagenomic datasets that deliberately try to capture all the genomes present in an environment, but replicon assembly is not often the goal of such programs. Here we present an approach to extracting, from mixed DNA sequence data, subsets that correspond to single species’ genomes and thus improving genome assembly. We use both numerical (proportion of GC bases and read coverage) and biological (best-matching sequence in annotated databases) indicators to aid partitioning of draft assembly contigs, and the reads that contribute to those contigs, into distinct bins that can then be subjected to rigorous, optimized assembly, through the use of taxon-annotated GC-coverage plots (TAGC plots). We also present Blobsplorer, a tool that aids exploration and selection of subsets from TAGC-annotated data. Partitioning the data in this way can rescue poorly assembled genomes, and reveal unexpected symbionts and commensals in eukaryotic genome projects. The TAGC plot pipeline script is available from https://github.com/blaxterlab/blobology, and the Blobsplorer tool from https://github.com/mojones/Blobsplorer.
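As a rough illustration of the two numerical indicators named in this abstract (proportion of GC bases and read coverage), the sketch below computes the GC fraction of each contig and groups contigs that fall into the same GC and coverage bin. It is not the Blobology pipeline: the function names, the bin sizes and the toy contigs are assumptions made for illustration, coverage values are supplied by hand rather than derived from read mapping, and the taxonomic annotation step is omitted.

```python
from collections import defaultdict


def gc_fraction(seq):
    """Return the proportion of G and C among the unambiguous bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def bin_contigs(contigs, gc_bins=10, cov_step=10.0):
    """Group contig names by (GC bin, coverage bin).

    contigs maps a contig name to (sequence, mean read coverage).
    gc_bins and cov_step are illustrative defaults, not published values.
    """
    bins = defaultdict(list)
    for name, (seq, coverage) in contigs.items():
        # Clamp so that a GC fraction of exactly 1.0 lands in the last bin.
        gc_bin = min(int(gc_fraction(seq) * gc_bins), gc_bins - 1)
        cov_bin = int(coverage / cov_step)
        bins[(gc_bin, cov_bin)].append(name)
    return dict(bins)


if __name__ == "__main__":
    # Hypothetical contigs: two with similar GC and coverage, one outlier.
    toy = {
        "host_1": ("ATGCATGCAT", 52.0),
        "host_2": ("ATGCATGGAA", 55.0),
        "symbiont_1": ("GCGCGGCCAT", 210.0),
    }
    for key, names in sorted(bin_contigs(toy).items()):
        print(key, names)
```

Contigs that share a bin are candidates for belonging to the same genome; in practice such groups would still need the database-match evidence described in the abstract before being reassembled separately.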
Affiliation(s)
- Sujai Kumar
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Martin Jones
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Georgios Koutsovoulos
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Michael Clarke
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Mark Blaxter
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK; Edinburgh Genomics, University of Edinburgh, Edinburgh, UK
3
Emmersen J, Rudd S, Mewes HW, Tetko IV. Separation of sequences from host-pathogen interface using triplet nucleotide frequencies. Fungal Genet Biol 2007; 44:231-41. [PMID: 17218127 DOI: 10.1016/j.fgb.2006.11.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 04/20/2006] [Revised: 10/22/2006] [Accepted: 11/27/2006] [Indexed: 11/22/2022]
Abstract
The identification of genes involved in host-pathogen interactions is important for the elucidation of mechanisms of disease resistance and host susceptibility. A traditional way to classify the origin of genes sampled from a pool of mixed cDNA is through sequence similarity to known genes from either the pathogen or host organism or other closely related species. This approach does not work when the identified sequence has no close homologues in the sequence databases. In our previous studies, we classified genes using their codon frequencies. This method, however, explicitly required the prediction of CDS regions and thus could not be applied to sequences composed from the non-coding regions of genes. In this study, we show that the use of sliding-window triplet frequencies extends the application of the algorithm to both coding and non-coding sequences and also increases the prediction accuracy of a Support Vector Machine classifier from 95.6 ± 0.3% to 96.5 ± 0.2%. Thus the use of triplet frequencies reduced the prediction error of the new method by more than 20% compared to our previous approach. We also describe a functional analysis of the sequences, which detected gene families with a significantly higher or lower probability of being correctly classified than the average accuracy of the method. The server to perform classification of EST sequences using triplet frequencies is available at http://mips.gsf.de/proj/est3.
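The feature described in this abstract can be sketched as follows: a window of three bases slides along the sequence one base at a time, so no reading frame (and hence no CDS prediction) is needed. This is only an illustration under stated assumptions; the step size of one base, the normalisation to relative frequencies, the handling of ambiguous bases and the names `TRIPLETS` and `triplet_frequencies` are choices made here, not details taken from the paper, and the Support Vector Machine training step is not shown.

```python
from collections import Counter
from itertools import product

# All 64 possible triplets in a fixed order, so every sequence maps to a
# feature vector with the same layout.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(seq):
    """Return 64 relative triplet frequencies for seq.

    Triplets are counted in every frame (window step of one base).
    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts[t] for t in TRIPLETS)
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


if __name__ == "__main__":
    # Hypothetical EST fragment; the vector would be one training example.
    vector = triplet_frequencies("ATGATGCCNTTGA")
    print(len(vector), round(sum(vector), 6))
```

Because the vector has the same length for coding and non-coding input, it can be passed to any standard classifier; a codon-frequency feature, by contrast, would require knowing where the reading frame starts.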
Affiliation(s)
- Jeppe Emmersen
- Institut for Miljø og Bioteknologi, Aalborg Universitet, Sohngaardsholmsvej 49, 9000 Aalborg, Denmark
4
Jiang RHY, Govers F. Nonneutral GC3 and retroelement codon mimicry in Phytophthora. J Mol Evol 2006; 63:458-72. [PMID: 16955239 DOI: 10.1007/s00239-005-0211-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 09/05/2005] [Accepted: 05/20/2006] [Indexed: 10/24/2022]
Abstract
Phytophthora is a genus composed entirely of destructive plant pathogens. It belongs to the Stramenopila, a unique branch of eukaryotes, phylogenetically distinct from plants, animals, or fungi. Phytophthora genes show a strong preference for usage of codons ending with G or C (high GC3). The presence of high GC3 in genes can be utilized to differentiate coding regions from noncoding regions in the genome. We found that both selective pressure and mutation bias drive codon bias in Phytophthora. Indicative of selection pressure is the higher GC3 value of highly expressed genes in different Phytophthora species. Lineage-specific GC increase of noncoding regions is reminiscent of whole-genome mutation bias, whereas the elevated Phytophthora GC3 is primarily a result of translation efficiency-driven selection. Heterogeneous retrotransposons exist in Phytophthora genomes and many of them vary in their GC content. Interestingly, the most widespread groups of retroelements in Phytophthora show high GC3 and a codon bias that is similar to host genes. Apparently, selection pressure has been exerted on the retroelement's codon usage, and such mimicry of host codon bias might be beneficial for the propagation of retrotransposons.
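GC3, as defined in this abstract, is the share of codons that end in G or C. A minimal sketch of the calculation follows; it assumes the input is a coding sequence that starts in frame at its first base, and the function name `gc3` and the handling of incomplete or ambiguous codons are illustrative choices rather than the authors' procedure.

```python
def gc3(cds):
    """Return the fraction of codons whose third base is G or C.

    cds is assumed to be in frame from its first base. A trailing
    incomplete codon is dropped, and codons whose third base is not
    A, C, G or T are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    third_bases = [base for base in third_bases if base in "ACGT"]
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


if __name__ == "__main__":
    # Hypothetical sequences: one GC3-rich, one GC3-poor.
    print(gc3("ATGGCCAAGTTC"))
    print(gc3("ATGGCAAAATTT"))
```

Comparing this value between annotated genes and random genomic windows is one simple way to see the coding versus noncoding contrast the abstract refers to.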
Affiliation(s)
- Rays H Y Jiang
- Laboratory of Phytopathology, Plant Sciences Group, and Graduate School of Experimental Plant Sciences, Wageningen University, Binnenhaven 5, NL-6709 PD, Wageningen, The Netherlands
5
Hohnjec N, Henckel K, Bekel T, Gouzy J, Dondrup M, Goesmann A, Küster H. Transcriptional snapshots provide insights into the molecular basis of arbuscular mycorrhiza in the model legume Medicago truncatula. Functional Plant Biology 2006; 33:737-748. [PMID: 32689284 DOI: 10.1071/fp06079] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 04/05/2006] [Accepted: 06/15/2006] [Indexed: 06/11/2023]
Abstract
The arbuscular mycorrhizal (AM) association between terrestrial plants and soil fungi of the phylum Glomeromycota is the most widespread beneficial plant-microbe interaction on earth. In the course of the symbiosis, fungal hyphae colonise plant roots and supply limiting nutrients, in particular phosphorus, in exchange for carbon compounds. Owing to the obligate biotrophy of mycorrhizal fungi and the lack of genetic systems to study them, targeted molecular studies on AM symbioses proved to be difficult. With the emergence of plant genomics and the selection of suitable models, an application of untargeted expression profiling experiments became possible. In the model legume Medicago truncatula, high-throughput expressed sequence tag (EST)-sequencing in conjunction with in silico and experimental transcriptome profiling provided transcriptional snapshots that together defined the global genetic program activated during AM. Owing to an asynchronous development of the symbiosis, several hundred genes found to be activated during the symbiosis cannot be easily correlated with symbiotic structures, but the expression of selected genes has been extended to the cellular level to correlate gene expression with specific stages of AM development. These approaches identified marker genes for the AM symbiosis and provided the first insights into the molecular basis of gene expression regulation during AM.
Affiliation(s)
- Natalija Hohnjec
- Institute for Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Kolja Henckel
- Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Thomas Bekel
- Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Jerome Gouzy
- Laboratoire des Interactions Plantes Micro-organismes LIPM, Chemin de Borde-Rouge-Auzeville, BP 52627, 31326 Castanet Tolosan, Cedex, France
- Michael Dondrup
- International Graduate School in Bioinformatics and Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Alexander Goesmann
- Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
- Helge Küster
- Institute for Genome Research, Center for Biotechnology (CeBiTec), Bielefeld University, D-33594 Bielefeld, Germany
6
Win J, Kanneganti TD, Torto-Alalibo T, Kamoun S. Computational and comparative analyses of 150 full-length cDNA sequences from the oomycete plant pathogen Phytophthora infestans. Fungal Genet Biol 2006; 43:20-33. [PMID: 16380277 DOI: 10.1016/j.fgb.2005.10.003] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.3] [Received: 03/28/2005] [Revised: 10/05/2005] [Accepted: 10/05/2005] [Indexed: 11/16/2022]
Abstract
Phytophthora infestans is a devastating phytopathogenic oomycete that causes late blight on tomato and potato. Recent genome sequencing efforts of P. infestans and other Phytophthora species are generating vast amounts of sequence data providing opportunities to unlock the complex nature of pathogenesis. However, accurate annotation of Phytophthora genomes will be a significant challenge. Most of the information about gene structure in these species was gathered from a handful of genes resulting in significant limitations for development of ab initio gene-calling programs. In this study, we collected a total of 150 bioinformatically determined near full-length cDNA (FLcDNA) sequences of P. infestans that were predicted to contain full open reading frame sequences. We performed detailed computational analyses of these FLcDNA sequences to obtain a snapshot of P. infestans gene structure, gauge the degree of sequence conservation between P. infestans genes and those of Phytophthora sojae and Phytophthora ramorum, and identify patterns of gene conservation between P. infestans and various eukaryotes, particularly fungi, for which genome-wide translated protein sequences are available. These analyses helped us to define the structural characteristics of P. infestans genes using a validated data set. We also determined the degree of sequence conservation within the genus Phytophthora and identified a set of fast evolving genes. Finally, we identified a set of genes that are shared between Phytophthora and fungal phytopathogens but absent in animal fungal pathogens. These results confirm that plant pathogenic oomycetes and fungi share virulence components, and suggest that eukaryotic microbial pathogens that share similar lifestyles also share a similar set of genes independently of their phylogenetic relatedness.
Affiliation(s)
- Joe Win
- Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, OH 44691, USA
7
Rudd S, Tetko IV. Eclair--a web service for unravelling species origin of sequences sampled from mixed host interfaces. Nucleic Acids Res 2005; 33:W724-7. [PMID: 15980572 PMCID: PMC1160195 DOI: 10.1093/nar/gki434] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022] Open
Abstract
The identification of the genes that participate at the biological interface of two species remains critical to our understanding of the mechanisms of disease resistance, disease susceptibility and symbiosis. The sequencing of complementary DNA (cDNA) libraries prepared from the biological interface between two organisms provides an inexpensive way to identify the novel genes that may be expressed as a cause or consequence of compatible or incompatible interactions. Sequence classification and annotation of species origin typically use an orthology-based approach and require access to large portions of either genome, or a close relative. Novel species- or clade-specific sequences may have no counterpart within existing databases and remain ambiguous features. Here we present a web-service, Eclair, which utilizes support vector machines for the classification of the origin of expressed sequence tags stemming from mixed host cDNA libraries. In addition to providing an interface for the classification of sequences, users are presented with the opportunity to train a model to suit their preferred species pair. Eclair is freely available at http://eclair.btk.fi.
Affiliation(s)
- Stephen Rudd
- Centre for Biotechnology, Tykistökatu 6 FIN-20521, Turku, Finland.
8
Meijer HJG, Latijnhouwers M, Ligterink W, Govers F. A transmembrane phospholipase D in Phytophthora; a novel PLD subfamily. Gene 2005; 350:173-82. [PMID: 15826868 DOI: 10.1016/j.gene.2005.02.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 10/10/2004] [Accepted: 02/22/2005] [Indexed: 11/18/2022]
Abstract
Phospholipase D (PLD) is a ubiquitous enzyme in eukaryotes that participates in various cellular processes. Its catalytic domain is characterized by two HKD motifs in the C-terminal part. Until now, two subfamilies were recognized based on their N-terminal domain structure. The first has a PX domain in combination with a PH domain and is designated as PXPH-PLD. Members of the second subfamily, named C2-PLD, have a C2 domain and have, so far, only been found in plants. Here we describe a novel PLD subfamily that we identified in Phytophthora, a genus belonging to the class oomycetes and comprising many important plant pathogens. We cloned Pipld1 from Phytophthora infestans and retrieved full-length sequences of its homologues from Phytophthora sojae and Phytophthora ramorum genome databases. Their promoters contain two putative regulatory elements, one of which is highly conserved in all three genes. The three Phytophthora pld1 genes encode nearly identical proteins of around 1807 amino acids, with the two characteristic HKD motifs in the C-terminal part. Homology of the predicted proteins with known PLDs however is restricted to the two catalytic HKD motifs and adjacent domains. In the N-terminal part Phytophthora PLD1 has a PX-like domain, but it lacks a PH domain. Instead the N-terminal region contains five putative membrane spanning domains suggesting that Phytophthora PLD1 is a transmembrane protein. Since Phytophthora PLD1 cannot be categorized in one of the two existing subfamilies we propose to create a novel subfamily named PXTM-PLD.
Affiliation(s)
- Harold J G Meijer
- Laboratory of Phytopathology, Plant Sciences Group, Wageningen University, Binnenhaven 5, NL-6709 PD Wageningen, The Netherlands
9
Jiang RHY, Dawe AL, Weide R, van Staveren M, Peters S, Nuss DL, Govers F. Elicitin genes in Phytophthora infestans are clustered and interspersed with various transposon-like elements. Mol Genet Genomics 2005; 273:20-32. [PMID: 15702346 DOI: 10.1007/s00438-005-1114-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 08/12/2004] [Accepted: 12/21/2004] [Indexed: 10/25/2022]
Abstract
Sequencing and annotation of a contiguous stretch of genomic DNA (112.3 kb) from the oomycete plant pathogen Phytophthora infestans revealed the order, spacing and genomic context of four members of the elicitin (inf) gene family. Analysis of the GC content at the third codon position (GC3) of six genes encoded in the region, and a set of randomly selected coding regions as well as random genomic regions, showed that a high GC3 value is a general feature of Phytophthora genes that can be exploited to optimize gene prediction programs for Phytophthora species. At least one-third of the annotated 112.3-kb P. infestans sequence consisted of transposons or transposon-like elements. The most prominent were four Ty3/gypsy and Ty1/copia type retrotransposons and three DNA transposons that belong to the Tc1/mariner, Pogo and PiggyBac groups, respectively. Comparative analysis of other available genomic sequences suggests that transposable elements are highly heterogeneous and ubiquitous in the P. infestans genome.
Affiliation(s)
- Rays H Y Jiang
- Plant Sciences Group, Laboratory of Phytopathology, Graduate School of Experimental Plant Sciences, Wageningen University, Binnenhaven 5, 6709 PD, Wageningen, The Netherlands
10
Moy P, Qutob D, Chapman BP, Atkinson I, Gijzen M. Patterns of gene expression upon infection of soybean plants by Phytophthora sojae. Molecular Plant-Microbe Interactions 2004; 17:1051-62. [PMID: 15497398 DOI: 10.1094/mpmi.2004.17.10.1051] [Citation(s) in RCA: 126] [Impact Index Per Article: 6.3] [Indexed: 05/06/2023]
Abstract
To investigate patterns of gene expression in soybean (Glycine max) and Phytophthora sojae during an infection time course, we constructed a 4,896-gene microarray of host and pathogen cDNA transcripts. Analysis of rRNA from soybean and P. sojae was used to estimate the ratio of host and pathogen RNA present in mixed samples. Large changes in this ratio occurred between 12 and 24 h after infection, reflecting the rapid growth and proliferation of the pathogen within host tissues. From the microarray analysis, soybean genes that were identified as strongly upregulated during infection included those encoding enzymes of phytoalexin biosynthesis and defense and pathogenesis-related proteins. Expression of these genes generally peaked at 24 h after infection. Selected lipoxygenases and peroxidases were among the most strongly downregulated soybean genes during the course of infection. The number of pathogen genes expressed during infection reached a maximum at 24 h. The results show that it is possible to use a single microarray to simultaneously probe gene expression in two interacting organisms. The patterns of gene expression we observed in soybean and P. sojae support the hypothesis that the pathogen transits from biotrophy to necrotrophy between 12 and 24 h after infection.
Affiliation(s)
- Pat Moy
- Agriculture and Agri-Food Canada, 1391 Sandford Street, London, Ontario, N5V 4T3, Canada
11
Affiliation(s)
- Sophien Kamoun
- Department of Plant Pathology, The Ohio State University, Ohio Agricultural Research and Development Center, Wooster, Ohio 44691, USA.