1601
Santana MF, Silva JCF, Batista AD, Ribeiro LE, da Silva GF, de Araújo EF, de Queiroz MV. Abundance, distribution and potential impact of transposable elements in the genome of Mycosphaerella fijiensis. BMC Genomics 2012; 13:720. [PMID: 23260030 PMCID: PMC3562529 DOI: 10.1186/1471-2164-13-720] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 06/08/2012] [Accepted: 12/20/2012] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Mycosphaerella fijiensis is an ascomycete that causes Black Sigatoka in bananas. Recently, the M. fijiensis genome was sequenced. Repetitive sequences are ubiquitous components of fungal genomes. In most genomic analyses, repetitive sequences are associated with transposable elements (TEs). TEs are dispersed repetitive DNA sequences found in a host genome. These elements have the ability to move from one location to another within the genome, and their insertion can cause a wide spectrum of mutations in their hosts. Some of the deleterious effects of TEs may be due to ectopic recombination among TEs of the same family. In addition, some transposons are physically linked to genes and can control their expression. To prevent possible damage caused by the presence of TEs in the genome, some fungi possess TE-silencing mechanisms, such as RIP (Repeat Induced Point mutation). In this study, the abundance, distribution and potential impact of TEs in the genome of M. fijiensis were investigated. RESULTS A total of 613 LTR-Gypsy and 27 LTR-Copia complete class I elements were detected. Among the class II elements, a total of 28 Mariner, five Mutator and one Harbinger complete elements were identified. The results of this study indicate that transposons were and are important ectopic recombination sites. A distribution analysis of one transposable element from each class in M. fijiensis isolates revealed variable hybridization profiles, indicating the activity of these elements. Several genes encoding proteins involved in important metabolic pathways and with potential correlation to pathogenicity systems were identified upstream and downstream of transposable elements. A comparison of the sequences from different transposon groups suggested the action of the RIP silencing mechanism in the genome of this microorganism. CONCLUSIONS The analysis of TEs in M. fijiensis suggests that TEs play an important role in the evolution of this organism because the activity of these elements, as well as the rearrangements caused by ectopic recombination, can result in deletion, duplication, inversion and translocation. Some of these changes can potentially modify gene structure or expression and, thus, facilitate the emergence of new strains of this pathogen.
Affiliation(s)
- Mateus F Santana
- Present address: Laboratório de Genética Molecular e de Microrganismo, Universidade Federal de Viçosa, Viçosa, Brazil
- José CF Silva
- Present address: Diretoria de Tecnologia da Informação, Universidade Federal de Viçosa, Viçosa, Brazil
- Aline D Batista
- Present address: Laboratório de Genética Molecular e de Microrganismo, Universidade Federal de Viçosa, Viçosa, Brazil
- Lílian E Ribeiro
- Present address: Laboratório de Genética Molecular e de Microrganismo, Universidade Federal de Viçosa, Viçosa, Brazil
- Elza F de Araújo
- Present address: Laboratório de Genética Molecular e de Microrganismo, Universidade Federal de Viçosa, Viçosa, Brazil
- Marisa V de Queiroz
- Present address: Laboratório de Genética Molecular e de Microrganismo, Universidade Federal de Viçosa, Viçosa, Brazil
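The M. fijiensis abstract above reports sequence evidence for RIP but does not say how it was measured. As an illustration only, the sketch below computes two dinucleotide ratios that are widely used as RIP indicators: the product index TpA/ApT and the substrate index (CpA+TpG)/(ApC+GpT). Because RIP converts C to T in particular dinucleotide contexts, a higher product index and a lower substrate index are generally read as a RIP signal. The function names are hypothetical, and whether the authors used these exact indices is an assumption.

```python
def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    counts = {}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        counts[pair] = counts.get(pair, 0) + 1
    return counts


def rip_indices(seq):
    """Return (product_index, substrate_index) for one sequence.

    product_index   = TpA / ApT
    substrate_index = (CpA + TpG) / (ApC + GpT)
    A ratio with a zero denominator is returned as NaN.
    """
    c = dinucleotide_counts(seq)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    product = ratio(c.get("TA", 0), c.get("AT", 0))
    substrate = ratio(c.get("CA", 0) + c.get("TG", 0),
                      c.get("AC", 0) + c.get("GT", 0))
    return product, substrate


if __name__ == "__main__":
    # Toy sequence, not real M. fijiensis data.
    print(rip_indices("TATATACAAC"))
```

In practice such indices would be computed per TE copy or in sliding windows and compared against non-repetitive control regions of the same genome.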
1602
[Identification and analysis methods of plant LTR retrotransposon sequences]. YI CHUAN = HEREDITAS 2012. [PMID: 23208147 DOI: 10.3724/sp.j.1005.2012.01491] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
Abstract
LTR retrotransposons are an important class of eukaryotic transposable elements. They are ubiquitous and highly heterogeneous in plants and play a major role in eukaryotic genome evolution. They are now extensively employed in gene function and genetic diversity analyses. Identifying LTR retrotransposons is a precondition for their application, so methods for identifying and analyzing LTR retrotransposon sequences have both theoretical significance and practical value. Bioinformatic software for sequence analysis can be classified by working principle into roughly two types: sequence alignment and identification of conserved domains. Alignment software, such as BLAST and DNAstar, produces the corresponding sequence information by comparing sequence similarity; however, this kind of software cannot be applied to full-length sequences. By principle, LTR retrotransposon identification software can be roughly sorted into four types: de novo repeat discovery, comparative genomic, homology-based, and structure-based methods. For example, LTR_Finder, based on the de novo repeat discovery method, can accurately predict and annotate LTR retrotransposons in full-length sequences; RepeatMasker, a homology-based method, discovers LTR retrotransposons by comparing similarity with known sequences in a database. In this article, different methods for identifying and analyzing retrotransposon sequences are compared, and a workflow for LTR retrotransposon sequence analysis is summarized to provide a reference for such analyses.
1603
Centromere retention and loss during the descent of maize from a tetraploid ancestor. Proc Natl Acad Sci U S A 2012. [PMID: 23197827 DOI: 10.1073/pnas.1218668109] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 12/11/2022] Open
Abstract
Although centromere function is highly conserved in eukaryotes, centromere sequences are highly variable. Only a few centromeres have been sequenced in higher eukaryotes because of their repetitive nature, thus hindering study of their structure and evolution. Conserved single-copy sequences in pericentromeres (CSCPs) of sorghum and maize were found to be diagnostic characteristics of adjacent centromeres. By analyzing comparative map data and CSCP sequences of sorghum, maize, and rice, the major evolutionary events related to centromere dynamics were discovered for the maize lineage after its divergence from a common ancestor with sorghum. (i) Remnants of ancient CSCP regions were found for the 10 lost ancestral centromeres, indicating that two ancient homeologous chromosome pairs did not contribute any centromeres to the current maize genome, whereas two other pairs contributed both of their centromeres. (ii) Five cases of long-distance, intrachromosome movement of CSCPs were detected in the retained centromeres, with inversion the major process involved. (iii) The 12 major chromosomal rearrangements that led to maize chromosome number reduction from 20 to 10 were uncovered. (iv) In addition to whole chromosome insertion near (but not always into) other centromeres, translocation and fusion were found to be important mechanisms underlying grass chromosome number reduction. (v) Comparison of chromosome structures confirms the polyploid event that led to the tetraploid ancestor of modern maize.
1604
Wang D, Xia Y, Li X, Hou L, Yu J. The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology. Nucleic Acids Res 2012. [PMID: 23193278 PMCID: PMC3531066 DOI: 10.1093/nar/gks1225] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/26/2022] Open
Abstract
Over the past 10 years, genomes of cultivated rice varieties and their wild counterparts have been sequenced, although most efforts have focused on genome assembly and annotation of two major cultivated rice (Oryza sativa L.) subspecies, 93-11 (indica) and Nipponbare (japonica). To integrate information from genome assemblies and annotations for better analysis and application, we now introduce a comparative rice genome database, the Rice Genome Knowledgebase (RGKbase, http://rgkbase.big.ac.cn/RGKbase/). RGKbase is built around three major components: (i) integrated data curation for rice genomics and molecular biology, which includes genome sequence assemblies, transcriptomic and epigenomic data, genetic variations, quantitative trait loci (QTLs) and the relevant literature; (ii) user-friendly viewers, such as Gbrowse, GeneBrowse and Circos, for genome annotations and evolutionary dynamics; and (iii) bioinformatic tools for compositional and synteny analyses, gene family classifications, gene ontology terms and pathways, and gene co-expression networks. RGKbase currently includes data from five rice cultivars and species: Nipponbare (japonica), 93-11 (indica), PA64s (indica), the African rice (Oryza glaberrima) and a wild rice species (Oryza brachyantha). We are also constantly introducing new datasets from a variety of public efforts, such as two recent releases of sequence data from ∼1000 rice varieties, which are mapped to the reference genome, yielding ample high-quality single-nucleotide polymorphisms and insertions–deletions.
Affiliation(s)
- Dapeng Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, PR China
1605
The draft genome of watermelon (Citrullus lanatus) and resequencing of 20 diverse accessions. Nat Genet 2012. [PMID: 23179023 DOI: 10.1038/ng.2470] [Citation(s) in RCA: 424] [Impact Index Per Article: 32.6] [Indexed: 12/17/2022]
Abstract
Watermelon, Citrullus lanatus, is an important cucurbit crop grown throughout the world. Here we report a high-quality draft genome sequence of the east Asia watermelon cultivar 97103 (2n = 2× = 22) containing 23,440 predicted protein-coding genes. Comparative genomics analysis provided an evolutionary scenario for the origin of the 11 watermelon chromosomes derived from a 7-chromosome paleohexaploid eudicot ancestor. Resequencing of 20 watermelon accessions representing three different C. lanatus subspecies produced numerous haplotypes and identified the extent of genetic diversity and population structure of watermelon germplasm. Genomic regions that were preferentially selected during domestication were identified. Many disease-resistance genes were also found to be lost during domestication. In addition, integrative genomic and transcriptomic analyses yielded important insights into aspects of phloem-based vascular signaling in common between watermelon and cucumber and identified genes crucial to valuable fruit-quality traits, including sugar accumulation and citrulline metabolism.
1606
Yang L, Liu T, Li B, Sui Y, Chen J, Shi J, Wing RA, Chen M. Comparative sequence analysis of the Ghd7 orthologous regions revealed movement of Ghd7 in the grass genomes. PLoS One 2012. [PMID: 23185584 PMCID: PMC3503983 DOI: 10.1371/journal.pone.0050236] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022] Open
Abstract
Ghd7 is an important rice gene that has a major effect on several agronomic traits, including yield. To reveal the origin of Ghd7 and sequence evolution of this locus, we performed a comparative sequence analysis of the Ghd7 orthologous regions from ten diploid Oryza species, Brachypodium distachyon, sorghum and maize. Sequence analysis demonstrated high gene collinearity across the genus Oryza and a disruption of collinearity among non-Oryza species. In particular, Ghd7 was not present in orthologous positions except in Oryza species. The Ghd7 regions were found to have low gene densities and high contents of repetitive elements, and the sizes of the orthologous regions varied tremendously. The large transposable element contents resulted in a high frequency of pseudogenization and gene movement events surrounding the Ghd7 loci. Annotation information and cytological experiments have indicated that Ghd7 is a heterochromatic gene. Ghd7 orthologs were identified in B. distachyon, sorghum and maize by phylogenetic analysis; however, the positions of orthologous genes differed dramatically as a consequence of gene movements in grasses. Furthermore, we identified sequence remnants of gene movement of Ghd7 mediated by illegitimate recombination in the B. distachyon genome.
Affiliation(s)
- Lu Yang
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- Tieyan Liu
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- Bo Li
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- Yi Sui
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- Jinfeng Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- Jinfeng Shi
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- Rod A. Wing
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Mingsheng Chen
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Science, Beijing, China
- * E-mail:
1607
González LG, Deyholos MK. Identification, characterization and distribution of transposable elements in the flax (Linum usitatissimum L.) genome. BMC Genomics 2012; 13:644. [PMID: 23171245 PMCID: PMC3544724 DOI: 10.1186/1471-2164-13-644] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 07/21/2012] [Accepted: 11/15/2012] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Flax (Linum usitatissimum L.) is an important crop for the production of bioproducts derived from its seed and stem fiber. Transposable elements (TEs) are widespread in plant genomes and are a key component of their evolution. The availability of a genome assembly of flax (Linum usitatissimum) affords new opportunities to explore the diversity of TEs and their relationship to genes and gene expression. RESULTS Four de novo repeat identification algorithms (PILER, RepeatScout, LTR_finder and LTR_STRUC) were applied to the flax genome assembly. The resulting library of flax repeats was combined with the RepBase Viridiplantae division and used with RepeatMasker to identify TE coverage in the genome. LTR retrotransposons were the most abundant TEs (17.2% genome coverage), followed by Long Interspersed Nuclear Element (LINE) retrotransposons (2.10%) and Mutator DNA transposons (1.99%). Comparison of putative flax TEs to flax transcript databases indicated that TEs are not highly expressed in flax. However, the presence of recent insertions, defined by 100% intra-element LTR similarity, provided evidence for recent TE activity. Spatial analysis showed TE-rich regions, gene-rich regions, and regions with similar gene and TE densities. Monte Carlo simulations for the 71 largest scaffolds (≥ 1 Mb each) did not show any regional differences in the frequency of TE overlap with gene coding sequences. However, differences between TE superfamilies were found in their proximity to genes. Genes within TE-rich regions also appeared to have lower transcript expression, based on EST abundance. When LTR elements were compared, Copia showed more diversity, recent insertions and conserved domains than Gypsy, demonstrating their importance in genome evolution. CONCLUSIONS The calculated 23.06% TE coverage of the flax WGS assembly is at the low end of the range of TE coverages reported in other eudicots, although this estimate does not include TEs likely found in unassembled repetitive regions of the genome. Since enrichment for TEs in genomic regions was associated with reduced expression of neighbouring genes, and many members of the Copia LTR superfamily are inserted close to coding regions, we suggest Copia elements have a greater influence on recent flax genome evolution while Gypsy elements have become residual and highly mutated.
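The flax TE abstract above treats 100% similarity between the two LTRs of an element as a sign of recent insertion. The reasoning is that both LTRs are identical at insertion and then diverge independently, so divergence is a rough clock. A minimal sketch of that idea follows, using the Jukes-Cantor correction and the common estimate T = K / (2r). The function names and the substitution rate used in the example are assumptions for illustration and are not taken from the paper.

```python
import math


def jukes_cantor_distance(ltr5, ltr3):
    """Jukes-Cantor distance between two pre-aligned LTR sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    if mismatches == 0:
        return 0.0  # identical LTRs: no measurable divergence
    p = mismatches / compared
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(distance, rate_per_site_per_year):
    """Estimate insertion age as T = K / (2r); both LTRs accumulate changes."""
    return distance / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    ASSUMED_RATE = 1.3e-8  # hypothetical rate, substitutions/site/year
    k = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAC")
    print(insertion_age_years(k, ASSUMED_RATE))
```

Identical LTRs give a distance of zero, so the estimated age is zero at the resolution of the clock, which is why such elements are read as recent insertions.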
1608
Wu J, Wang Z, Shi Z, Zhang S, Ming R, Zhu S, Khan MA, Tao S, Korban SS, Wang H, Chen NJ, Nishio T, Xu X, Cong L, Qi K, Huang X, Wang Y, Zhao X, Wu J, Deng C, Gou C, Zhou W, Yin H, Qin G, Sha Y, Tao Y, Chen H, Yang Y, Song Y, Zhan D, Wang J, Li L, Dai M, Gu C, Wang Y, Shi D, Wang X, Zhang H, Zeng L, Zheng D, Wang C, Chen M, Wang G, Xie L, Sovero V, Sha S, Huang W, Zhang S, Zhang M, Sun J, Xu L, Li Y, Liu X, Li Q, Shen J, Wang J, Paull RE, Bennetzen JL, Wang J, Zhang S. The genome of the pear (Pyrus bretschneideri Rehd.). Genome Res 2012; 23:396-408. [PMID: 23149293 PMCID: PMC3561880 DOI: 10.1101/gr.144311.112] [Citation(s) in RCA: 539] [Impact Index Per Article: 41.5] [Indexed: 11/24/2022]
Abstract
The draft genome of the pear (Pyrus bretschneideri) using a combination of BAC-by-BAC and next-generation sequencing is reported. A 512.0-Mb sequence corresponding to 97.1% of the estimated genome size of this highly heterozygous species is assembled with 194× coverage. High-density genetic maps comprising 2005 SNP markers anchored 75.5% of the sequence to all 17 chromosomes. The pear genome encodes 42,812 protein-coding genes, and of these, ∼28.5% encode multiple isoforms. Repetitive sequences of 271.9 Mb in length, accounting for 53.1% of the pear genome, are identified. Simulation of eudicots to the ancestor of Rosaceae has reconstructed nine ancestral chromosomes. Pear and apple diverged from each other ∼5.4–21.5 million years ago, and a recent whole-genome duplication (WGD) event must have occurred 30–45 MYA prior to their divergence, but following divergence from strawberry. When compared with the apple genome sequence, size differences between the apple and pear genomes are confirmed mainly due to the presence of repetitive sequences predominantly contributed by transposable elements (TEs), while genic regions are similar in both species. Genes critical for self-incompatibility, lignified stone cells (a unique feature of pear fruit), sorbitol metabolism, and volatile compounds of fruit have also been identified. Multiple candidate SFB genes appear as tandem repeats in the S-locus region of pear; while lignin synthesis-related gene family expansion and highly expressed gene families of HCT, C3′H, and CCOMT contribute to high accumulation of both G-lignin and S-lignin. Moreover, alpha-linolenic acid metabolism is a key pathway for aroma in pear fruit.
Affiliation(s)
- Jun Wu
- Centre of Pear Engineering Technology Research, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing 210095, China
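The size figures in the pear abstract above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the abstract (512.0 Mb assembled, 97.1% of the estimated genome size, 271.9 Mb of repeats). The variable and function names are illustrative, and reading the 53.1% figure as a share of the assembled sequence is an assumption that the arithmetic appears to support.

```python
# Figures quoted in the pear genome abstract.
ASSEMBLED_MB = 512.0        # assembled sequence
ASSEMBLED_FRACTION = 0.971  # assembly as a share of estimated genome size
REPEAT_MB = 271.9           # repetitive sequence


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Repeat share of the assembled sequence.
repeat_percent = percent(REPEAT_MB, ASSEMBLED_MB)

# Estimated genome size implied by the stated assembly coverage.
estimated_genome_mb = ASSEMBLED_MB / ASSEMBLED_FRACTION

if __name__ == "__main__":
    print(f"repeat share of assembly: {repeat_percent:.1f}%")
    print(f"implied genome size: {estimated_genome_mb:.1f} Mb")
```

The repeat share comes out at about 53.1%, matching the abstract, and the implied genome size is roughly 527 Mb.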
1609
Steinbiss S, Kastens S, Kurtz S. LTRsift: a graphical user interface for semi-automatic classification and postprocessing of de novo detected LTR retrotransposons. Mob DNA 2012; 3:18. [PMID: 23131050 PMCID: PMC3582472 DOI: 10.1186/1759-8753-3-18] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/03/2012] [Accepted: 08/31/2012] [Indexed: 11/30/2022] Open
Abstract
Background Long terminal repeat (LTR) retrotransposons are a class of eukaryotic mobile elements characterized by a distinctive sequence similarity-based structure. Hence they are well suited for computational identification. Current software allows for a comprehensive genome-wide de novo detection of such elements. The obvious next step is the classification of newly detected candidates resulting in (super-)families. Such a de novo classification approach based on sequence-based clustering of transposon features has been proposed before, resulting in a preliminary assignment of candidates to families as a basis for subsequent manual refinement. However, such a classification workflow is typically split across a heterogeneous set of glue scripts and generic software (for example, spreadsheets), making it tedious for a human expert to inspect, curate and export the putative families produced by the workflow. Results We have developed LTRsift, an interactive graphical software tool for semi-automatic postprocessing of de novo predicted LTR retrotransposon annotations. Its user-friendly interface offers customizable filtering and classification functionality, displaying the putative candidate groups, their members and their internal structure in a hierarchical fashion. To ease manual work, it also supports graphical user interface-driven reassignment, splitting and further annotation of candidates. Export of grouped candidate sets in standard formats is possible. In two case studies, we demonstrate how LTRsift can be employed in the context of a genome-wide LTR retrotransposon survey effort. Conclusions LTRsift is a useful and convenient tool for semi-automated classification of newly detected LTR retrotransposons based on their internal features. Its efficient implementation allows for convenient and seamless filtering and classification in an integrated environment. Developed for life scientists, it is helpful in postprocessing and refining the output of software for predicting LTR retrotransposons up to the stage of preparing full-length reference sequence libraries. The LTRsift software is freely available at http://www.zbh.uni-hamburg.de/LTRsift under an open-source license.
Affiliation(s)
- Sascha Steinbiss
- Center for Bioinformatics, University of Hamburg, 20146 Hamburg, Bundesstrasse 43, Germany.
1610
Ashlock W, Datta S. Distinguishing endogenous retroviral LTRs from SINE elements using features extracted from evolved side effect machines. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1676-1689. [PMID: 22908128 DOI: 10.1109/tcbb.2012.116] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/01/2023]
Abstract
Side effect machines produce features for classifiers that distinguish different types of DNA sequences. They have the as yet unexploited potential to give insight into biological features of the sequences. We introduce several innovations to the production and use of side effect machine sequence features. We compare the results of using consensus sequences and genomic sequences for training classifiers and find that more accurate results can be obtained using genomic sequences. Surprisingly, we were even able to build a classifier that distinguished consensus sequences from genomic sequences with high accuracy, suggesting that consensus sequences are not always representative of their genomic counterparts. We apply our techniques to the problem of distinguishing two types of transposable elements, solo LTRs and SINEs. Identifying these sequences is important because they affect gene expression, genome structure, and genetic diversity, and they serve as genetic markers. They are of similar length, neither codes for protein, and both have many nearly identical copies throughout the genome. Being able to efficiently and automatically distinguish them will aid efforts to improve annotations of genomes. Our approach reveals structural characteristics of the sequences of potential interest to biologists.
Affiliation(s)
- Wendy Ashlock
- Department of Computer Science and Engineering, York University, 4700 Keele St., Toronto, ON, M3J 1P3, Canada.
1611
Wang Z, Hobson N, Galindo L, Zhu S, Shi D, McDill J, Yang L, Hawkins S, Neutelings G, Datla R, Lambert G, Galbraith DW, Grassa CJ, Geraldes A, Cronk QC, Cullis C, Dash PK, Kumar PA, Cloutier S, Sharpe AG, Wong GKS, Wang J, Deyholos MK. The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads. The Plant Journal: for Cell and Molecular Biology 2012; 72:461-73. [PMID: 22757964 DOI: 10.1111/j.1365-313x.2012.05093.x] [Citation(s) in RCA: 252] [Impact Index Per Article: 19.4] [Indexed: 05/18/2023]
Abstract
Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43,384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.
Affiliation(s)
- Zhiwen Wang
- BGI-Shenzhen, Bei Shan Industrial Zone, Yantian District, Shenzhen 518083, China
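The flax assembly abstract above summarizes contiguity with N50 values for scaffolds and contigs. For readers unfamiliar with the statistic, here is a minimal sketch of the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name and the toy lengths are illustrative, and the assembler's exact convention may differ in edge cases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy contig lengths, not the flax assembly.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because scaffolds join contigs across gaps, a scaffold N50 is normally much larger than the contig N50 of the same assembly, as in the values the abstract reports.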
1612
Staton SE, Bakken BH, Blackman BK, Chapman MA, Kane NC, Tang S, Ungerer MC, Knapp SJ, Rieseberg LH, Burke JM. The sunflower (Helianthus annuus L.) genome reflects a recent history of biased accumulation of transposable elements. The Plant Journal: for Cell and Molecular Biology 2012; 72:142-53. [PMID: 22691070 DOI: 10.1111/j.1365-313x.2012.05072.x] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 05/02/2023]
Abstract
Aside from polyploidy, transposable elements are the major drivers of genome size increases in plants. Thus, understanding the diversity and evolutionary dynamics of transposable elements in sunflower (Helianthus annuus L.), especially given its large genome size (∼3.5 Gb) and the well-documented cases of amplification of certain transposons within the genus, is of considerable importance for understanding the evolutionary history of this emerging model species. By analyzing approximately 25% of the sunflower genome from random sequence reads and assembled bacterial artificial chromosome (BAC) clones, we show that it is composed of over 81% transposable elements, 77% of which are long terminal repeat (LTR) retrotransposons. Moreover, the LTR retrotransposon fraction in BAC clones harboring genes is disproportionately composed of chromodomain-containing Gypsy LTR retrotransposons ('chromoviruses'), and the majority of the intact chromoviruses contain tandem chromodomain duplications. We show that there is a bias in the efficacy of homologous recombination in removing LTR retrotransposon DNA, thereby providing insight into the mechanisms associated with transposable element (TE) composition in the sunflower genome. We also show that the vast majority of observed LTR retrotransposon insertions have likely occurred since the origin of this species, providing further evidence that biased LTR retrotransposon activity has played a major role in shaping the chromatin and DNA landscape of the sunflower genome. Although our findings on LTR retrotransposon age and structure could be influenced by the selection of the BAC clones analyzed, a global analysis of random sequence reads indicates that the evolutionary patterns described herein apply to the sunflower genome as a whole.
Affiliation(s)
- S Evan Staton
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
1613
Krishnan NM, Pattnaik S, Jain P, Gaur P, Choudhary R, Vaidyanathan S, Deepak S, Hariharan AK, Krishna PB, Nair J, Varghese L, Valivarthi NK, Dhas K, Ramaswamy K, Panda B. A draft of the genome and four transcriptomes of a medicinal and pesticidal angiosperm Azadirachta indica. BMC Genomics 2012; 13:464. [PMID: 22958331 PMCID: PMC3507787 DOI: 10.1186/1471-2164-13-464] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Received: 07/24/2012] [Accepted: 09/03/2012] [Indexed: 12/05/2022] Open
Abstract
Background The Azadirachta indica (neem) tree is a source of a large number of natural products, including the potent biopesticide azadirachtin. In spite of its widespread applications in agriculture and medicine, the molecular aspects of the biosynthesis of neem terpenoids remain largely unexplored. The current report describes the draft genome and four transcriptomes of A. indica and attempts to contextualise the sequence information in terms of its molecular phylogeny, transcript expression and terpenoid biosynthesis pathways. A. indica is the first member of the family Meliaceae to be sequenced using a next-generation sequencing approach. Results The genome and transcriptomes of A. indica were sequenced using multiple sequencing platforms and libraries. The A. indica genome is AT-rich, bears few repetitive DNA elements and comprises about 20,000 genes. The molecular phylogenetic analyses grouped A. indica together with Citrus sinensis from the Rutaceae family, validating its conventional taxonomic classification. Comparative transcript expression analysis showed either exclusive or enhanced expression of known genes involved in neem terpenoid biosynthesis pathways compared to other sequenced angiosperms. Genome and transcriptome analyses in A. indica led to the identification of repeat elements, nucleotide composition and expression profiles of genes in various organs. Conclusions This study of the A. indica genome and transcriptomes will provide a model for characterization of metabolic pathways involved in synthesis of bioactive compounds and for comparative evolutionary studies among various Meliaceae family members, and will help annotate their genomes. A better understanding of the molecular pathways involved in azadirachtin synthesis in A. indica will pave the way for bulk production of environment-friendly biopesticides.
Affiliation(s)
- Neeraja M Krishnan
- Ganit Labs, Bio-IT Centre, Institute of Bioinformatics and Applied Biotechnology, Biotech Park, Electronic City Phase I, Bangalore 560100, India
|
1614
|
Comparative genome analysis of Trichophyton rubrum and related dermatophytes reveals candidate genes involved in infection. mBio 2012; 3:e00259-12. [PMID: 22951933 PMCID: PMC3445971 DOI: 10.1128/mbio.00259-12] [Citation(s) in RCA: 172] [Impact Index Per Article: 13.2] [Indexed: 02/06/2023] Open
Abstract
The major cause of athlete's foot is Trichophyton rubrum, a dermatophyte or fungal pathogen of human skin. To facilitate molecular analyses of the dermatophytes, we sequenced T. rubrum and four related species, Trichophyton tonsurans, Trichophyton equinum, Microsporum canis, and Microsporum gypseum. These species differ in host range, mating, and disease progression. The dermatophyte genomes are highly colinear yet contain gene family expansions not found in other human-associated fungi. Dermatophyte genomes are enriched for gene families containing the LysM domain, which binds chitin and potentially related carbohydrates. These LysM domains differ in sequence from those in other species in regions of the peptide that could affect substrate binding. The dermatophytes also encode novel sets of fungus-specific kinases with unknown specificity, including nonfunctional pseudokinases, which may inhibit phosphorylation by competing for kinase sites within substrates, acting as allosteric effectors, or acting as scaffolds for signaling. The dermatophytes are also enriched for a large number of enzymes that synthesize secondary metabolites, including dermatophyte-specific genes that could synthesize novel compounds. Finally, dermatophytes are enriched in several classes of proteases that are necessary for fungal growth and nutrient acquisition on keratinized tissues. Despite differences in mating ability, genes involved in mating and meiosis are conserved across species, suggesting the possibility of cryptic mating in species where it has not been previously detected. These genome analyses identify gene families that are important to our understanding of how dermatophytes cause chronic infections, how they interact with epithelial cells, and how they respond to the host immune response.
|
1615
|
Manetti ME, Rossi M, Cruz GMQ, Saccaro NL, Nakabashi M, Altebarmakian V, Rodier-Goud M, Domingues D, D’Hont A, Van Sluys MA. Mutator System Derivatives Isolated from Sugarcane Genome Sequence. Tropical Plant Biology 2012; 5:233-243. [PMID: 22905278 PMCID: PMC3418495 DOI: 10.1007/s12042-012-9104-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/02/2012] [Accepted: 05/03/2012] [Indexed: 06/01/2023]
Abstract
Mutator-like transposase is the most represented transposon transcript in the sugarcane transcriptome. Phylogenetic reconstructions derived from sequenced transcripts provided evidence that at least four distinct classes exist (I-IV) and that diversification among these classes occurred early in Angiosperms, prior to the divergence of Monocots/Eudicots. The four previously described classes served as probes to select and further sequence six BAC clones from a genomic library of cultivar R570. A total of 579,352 sugarcane base pairs were produced from these "Mutator system" BAC-containing regions for further characterization. The analyzed genomic regions confirmed that the predicted structure and organization of the Mutator system in sugarcane is composed of two true transposon lineages, each containing a specific terminal inverted repeat, and two transposase lineages considered to be domesticated. Each Mutator transposase class displayed a particular molecular structure supporting lineage-specific evolution. MUSTANG, previously described domesticated genes, are located in syntenic regions across Saccharinae and, as expected for a host functional gene, possess the same gene structure as in other Poaceae. Two sequenced BACs correspond to hom(eo)logous loci with specific retrotransposon insertions that discriminate sugarcane haplotypes. The comparative studies presented add information to the Mutator systems previously identified in the maize and rice genomes by describing lineage-specific molecular structure and genomic distribution pattern in the sugarcane genome.
Affiliation(s)
- M. E. Manetti
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- M. Rossi
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- G. M. Q. Cruz
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- N. L. Saccaro
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- M. Nakabashi
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- V. Altebarmakian
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- M. Rodier-Goud
- Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR AGAP, Avenue Agropolis, 34398 Montpellier Cedex 5, France
- D. Domingues
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
- Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR AGAP, Avenue Agropolis, 34398 Montpellier Cedex 5, France
- A. D’Hont
- Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR AGAP, Avenue Agropolis, 34398 Montpellier Cedex 5, France
- M. A. Van Sluys
- Departamento de Botânica-IB-USP, GaTE Lab, Brasil, Rua do Matão, 277, 05508-900 São Paulo, SP Brazil
|
1616
|
Chapman MA, Burke JM. Evidence of selection on fatty acid biosynthetic genes during the evolution of cultivated sunflower. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2012; 125:897-907. [PMID: 22580969 DOI: 10.1007/s00122-012-1881-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 01/22/2012] [Accepted: 04/19/2012] [Indexed: 05/21/2023]
Abstract
The identification of genes underlying the phenotypic transitions that took place during crop evolution, as well as the genomic extent of resultant selective sweeps, is of great interest to both evolutionary biologists and applied plant scientists. In this study, we report the results of a molecular evolutionary analysis of 11 genes that underlie fatty acid biosynthesis and metabolism in wild and cultivated sunflower (Helianthus annuus). Seven of these 11 genes showed evidence of selection at the nucleotide level, with 1 (FAD7) having experienced selection prior to domestication, 2 (FAD2-3 and FAD3) having experienced selection during domestication, and 4 (FAB1, FAD2-1, FAD6, and FATB) having experienced selection during the subsequent period of improvement. Sequencing of a subset of these genes from an extended panel of sunflower cultivars revealed little additional variation, and an analysis of the genomic region surrounding one of these genes (FAD2-1) revealed the occurrence of an extensive selective sweep affecting a region spanning at least ca. 100 kb. Given that previous population genetic analyses have revealed a relatively rapid decay of linkage disequilibrium in sunflower, this finding indicates the occurrence of strong selection and a rapid sweep.
Affiliation(s)
- Mark A Chapman
- Department of Plant Biology, University of Georgia, Miller Plant Sciences Bldg., Athens, GA 30602, USA.
|
1617
|
Brown K, Moreton J, Malla S, Aboobaker AA, Emes RD, Tarlinton RE. Characterisation of retroviruses in the horse genome and their transcriptional activity via transcriptome sequencing. Virology 2012; 433:55-63. [PMID: 22868041 DOI: 10.1016/j.virol.2012.07.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 05/28/2012] [Revised: 06/19/2012] [Accepted: 07/12/2012] [Indexed: 01/13/2023]
Abstract
The recently released draft horse genome is incompletely characterised in terms of its repetitive element profile. This paper presents a characterisation of the endogenous retroviruses (ERVs) of the horse genome based on a data-mining strategy using murine leukaemia virus proteins as queries. A total of 978 ERV gene sequences were identified, from the gamma-, epsilon- and betaretrovirus genera. At least one full-length gammaretroviral locus was identified, though the gammaretroviral sequences are very degenerate. Using these data, the RNA expression of these ERVs was derived from RNA transcriptome data from a variety of equine tissues. Unlike the well-studied human and murine ERVs, there do not appear to be particular phylogenetic groups of equine ERVs that are more transcriptionally active. This novel approach provided a more technically feasible method to characterise ERV expression than previous studies.
Affiliation(s)
- Katherine Brown
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Loughborough LE12 5RD, United Kingdom
|
1618
|
Peters SA, Bargsten JW, Szinay D, van de Belt J, Visser RGF, Bai Y, de Jong H. Structural homology in the Solanaceae: analysis of genomic regions in support of synteny studies in tomato, potato and pepper. The Plant Journal: for Cell and Molecular Biology 2012; 71:602-14. [PMID: 22463056 DOI: 10.1111/j.1365-313x.2012.05012.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 05/04/2023]
Abstract
We have analysed the structural homology in euchromatin regions of tomato, potato and pepper, with special attention to the long arm of chromosome 2 (2L). Molecular organization and colinear junctions were delineated using multi-color BAC FISH analysis and comparative sequence alignment. We found large-scale rearrangements, including inversions and segmental translocations, that were not reported in previous comparative studies. Some of the structural rearrangements are specific to the tomato clade and differentiate tomato from potato, pepper and other Solanaceous species. Although local gene vicinity is largely preserved, there are many small-scale synteny perturbations. Gene adjacency in the aligned segments was disrupted for 47% of the ortholog pairs, frequently as a result of gene and LTR retrotransposon insertions and occasionally by single gene inversions and translocations. Our data also suggest that long-distance intra-chromosomal rearrangements and local gene rearrangements have evolved frequently during speciation in the Solanum genus, and that small changes are more prevalent than large-scale differences. The occurrence of sonata and harbinger transposable elements and other repeats near or at junction breaks is considered in the light of repeat-mediated rearrangements, and a reconstruction scenario for an ancestral 2L topology is discussed.
Affiliation(s)
- Sander A Peters
- Plant Research International, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands.
|
1619
|
Xu HE, Zhang HH, Han MJ, Shen YH, Huang XZ, Xiang ZH, Zhang Z. [Computational approaches for identification and classification of transposable elements in eukaryotic genomes]. Yi Chuan = Hereditas 2012; 34:1009-1019. [PMID: 22917906 DOI: 10.3724/sp.j.1005.2012.01009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
Repetitive sequences (repeats) represent a significant fraction of eukaryotic genomes and can be divided into tandem repeats, segmental duplications, and interspersed repeats on the basis of their sequence characteristics and how they are formed. Most interspersed repeats are derived from transposable elements (TEs). Eukaryotic TEs have been subdivided into two major classes according to the intermediate they use to move. The transposition and amplification of TEs have a great impact on the evolution of genes and the stability of genomes. However, identifying and classifying TEs is difficult because their structures are more complex and diverse than those of other types of repeats. Here, we briefly introduce the function and classification of TEs and summarize three steps for the identification, classification and annotation of TEs in eukaryotic genomes: (1) assembly of a repeat library, (2) repeat correction and classification, and (3) genome annotation. The existing computational approaches for each step are summarized, and their advantages and disadvantages are highlighted. Accurately identifying, classifying, and annotating the TEs in eukaryotic genomes requires a combination of methods. This review provides useful information for biologists who are not familiar with these approaches to find their way through the forest of programs.
Affiliation(s)
- Hong-En Xu
- The Institute of Sericulture and Systems Biology, Southwest University, Chongqing, China.
|
1620
|
Kim WC, Lee KH, Shin KS, You RN, Lee YK, Cho K, Cho DH. REMiner-II: a tool for rapid identification and configuration of repetitive element arrays from large mammalian chromosomes as a single query. Genomics 2012; 100:131-40. [PMID: 22750555 DOI: 10.1016/j.ygeno.2012.06.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/17/2012] [Revised: 06/04/2012] [Accepted: 06/12/2012] [Indexed: 01/17/2023]
Abstract
Genes occupy ~3% of the human and mouse genomes whereas repetitive elements (REs), whose biologic functions are largely uncharacterized, constitute greater than 50%. A heterogeneous population of RE arrays (arrangement structures) is formed by combinations of various REs in mammalian genomes. In this study, REMiner-II was refined from the original REMiner for a more efficient identification and configuration of RE arrays from large queries (e.g., human chromosomes) using an unbiased self-alignment protocol. Chromosome-wide RE array profiles for the entire sets of human and mouse chromosomes were obtained using REMiner-II on a personal computer. REMiner-II provides 10 adjustable parameters and three data output modes to accommodate different experimental settings and/or goals. Examination of the human and mouse chromosome data using the REMiner-II viewer revealed species-specific libraries of complexly organized RE arrays. In conclusion, REMiner-II is an efficient tool for chromosome-wide identification and characterization of RE arrays from mammalian genomes.
Affiliation(s)
- Woo-Chan Kim
- Department of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
|
1621
|
Ballinger MJ, Bruenn JA, Taylor DJ. Phylogeny, integration and expression of sigma virus-like genes in Drosophila. Mol Phylogenet Evol 2012; 65:251-8. [PMID: 22750113 DOI: 10.1016/j.ympev.2012.06.008] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 04/24/2012] [Revised: 06/07/2012] [Accepted: 06/14/2012] [Indexed: 01/11/2023]
Abstract
The recent and surprising discovery of widespread NIRVs (non-retroviral integrated RNA viruses) has highlighted the importance of genomic interactions between non-retroviral RNA viruses and their eukaryotic hosts. Among the viruses with integrated representatives are the rhabdoviruses, a family of negative sense single-stranded RNA viruses. We identify sigma virus-like NIRVs of Drosophila spp. that represent unique cases where NIRVs are closely related to exogenous RNA viruses in a model host organism. We have used a combination of bioinformatics and laboratory methods to explore the evolution and expression of sigma virus-like NIRVs in Drosophila. Recent integrations in Drosophila provide a promising experimental system to study functionality of NIRVs. Moreover, the genomic architecture of recent NIRVs provides an unusual evolutionary window on the integration mechanism. For example, we found that a sigma virus-like polymerase associated protein (P) gene appears to have been integrated by template switching of the blastopia-like LTR retrotransposon. The sigma virus P-like NIRV is present in multiple retroelement fused open reading frames on the X and 3R chromosomes of Drosophila yakuba - the X-linked copy is transcribed to produce an RNA product in adult flies. We present the first account of sigma virus-like NIRVs and the first example of NIRV expression in a model animal system, and therefore provide a platform for further study of the possible functions of NIRVs in animal hosts.
Affiliation(s)
- Matthew J Ballinger
- Department of Biological Sciences, The State University of New York at Buffalo, Buffalo, NY 14260, USA.
|
1622
|
Cui J, Tachedjian G, Tachedjian M, Holmes EC, Zhang S, Wang LF. Identification of diverse groups of endogenous gammaretroviruses in mega- and microbats. J Gen Virol 2012; 93:2037-2045. [PMID: 22694899 DOI: 10.1099/vir.0.043760-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 12/24/2022] Open
Abstract
A previous phylogenetic study suggested that mammalian gammaretroviruses may have originated in bats. Here we report the discovery of RNA transcripts from two putative endogenous gammaretroviruses in frugivorous (Rousettus leschenaultii retrovirus, RlRV) and insectivorous (Megaderma lyra retrovirus, MlRV) bat species. Both genomes possess a large deletion in pol, indicating that they are defective retroviruses. Phylogenetic analysis places RlRV and MlRV within the diversity of mammalian gammaretroviruses, with the former falling closer to porcine endogenous retroviruses and the latter to Mus dunni endogenous virus, koala retrovirus and gibbon ape leukemia virus. Additional genomic mining suggests that both microbat (Myotis lucifugus) and megabat (Pteropus vampyrus) genomes harbour many copies of endogenous retroviral forms related to RlRV and MlRV. Furthermore, phylogenetic analysis reveals the presence of three genetically diverse groups of endogenous gammaretroviruses in bat genomes, with M. lucifugus possessing members of all three groups. Taken together, this study indicates that bats harbour distinct gammaretroviruses and may have played an important role as reservoir hosts during the diversification of mammalian gammaretroviruses.
Affiliation(s)
- Jie Cui
- Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Gilda Tachedjian
- Department of Medicine, Monash University, Melbourne, Victoria, 3004, Australia; Department of Microbiology, Monash University, Clayton, Victoria, 3168, Australia; Retroviral Biology and Antivirals Laboratory, Centre for Virology, Burnet Institute, Melbourne, Victoria 3004, Australia
- Mary Tachedjian
- CSIRO Livestock Industries, Australian Animal Health Laboratory, Geelong, Victoria 3220, Australia
- Edward C Holmes
- Fogarty International Center, National Institutes of Health, Bethesda, MD 20892, USA; Center for Infectious Disease Dynamics, Department of Biology, The Pennsylvania State University, University Park, PA 16802, USA
- Shuyi Zhang
- Institute of Molecular Ecology and Evolution, Institutes for Advanced Interdisciplinary Research, East China Normal University, Shanghai 200062, PR China
- Lin-Fa Wang
- Department of Microbiology and Immunology, The University of Melbourne, Parkville, Victoria 3010, Australia; CSIRO Livestock Industries, Australian Animal Health Laboratory, Geelong, Victoria 3220, Australia
|
1623
|
Comparative analysis of a plant pseudoautosomal region (PAR) in Silene latifolia with the corresponding S. vulgaris autosome. BMC Genomics 2012; 13:226. [PMID: 22681719 PMCID: PMC3431222 DOI: 10.1186/1471-2164-13-226] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 09/08/2011] [Accepted: 06/08/2012] [Indexed: 11/10/2022] Open
Abstract
Background The sex chromosomes of Silene latifolia are heteromorphic as in mammals, with females being homogametic (XX) and males heterogametic (XY). While recombination occurs along the entire X chromosome in females, recombination between the X and Y chromosomes in males is restricted to the pseudoautosomal region (PAR). In the few mammals so far studied, PARs are often characterized by elevated recombination and mutation rates and high GC content compared with the rest of the genome. However, PARs have not been studied in plants until now. In this paper we report the construction of a BAC library for S. latifolia and the first analysis of a > 100 kb fragment of a S. latifolia PAR that we compare to the homologous autosomal region in the closely related gynodioecious species S. vulgaris. Results Six new sex-linked genes were identified in the S. latifolia PAR, together with numerous transposable elements. The same genes were found on the S. vulgaris autosomal segment, with no enlargement of the predicted coding sequences in S. latifolia. Intergenic regions were on average 1.6 times longer in S. latifolia than in S. vulgaris, mainly as a consequence of the insertion of transposable elements. The GC content did not differ significantly between the PAR region in S. latifolia and the corresponding autosomal region in S. vulgaris. Conclusions Our results demonstrate the usefulness of the BAC library developed here for the analysis of plant sex chromosomes and indicate that the PAR in the evolutionarily young S. latifolia sex chromosomes has diverged from the corresponding autosomal region in the gynodioecious S. vulgaris mainly with respect to the insertion of transposable elements. Gene order between the PAR and autosomal region investigated is conserved, and the PAR does not have the high GC content observed in evolutionarily much older mammalian sex chromosomes.
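The GC-content comparison mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the function name `gc_content` and the two toy sequences are hypothetical, and the sketch only shows the usual definition of GC content as the fraction of unambiguous bases that are G or C.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Ambiguous symbols such as N are ignored; an empty or fully ambiguous
    sequence yields 0.0 rather than raising an error.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


# Hypothetical toy sequences for illustration only, not data from the study.
par_fragment = "ATGCGCATTA"
autosomal_fragment = "ATGCGTATTA"

print(f"PAR-like fragment GC: {gc_content(par_fragment):.2f}")
print(f"Autosome-like fragment GC: {gc_content(autosomal_fragment):.2f}")
```

On real data the same function would be applied to the aligned PAR and autosomal segments, and whether any difference is significant would need a statistical test that this sketch does not cover.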
|
1624
|
Reference genome sequence of the model plant Setaria. Nat Biotechnol 2012; 30:555-61. [PMID: 22580951 DOI: 10.1038/nbt.2196] [Citation(s) in RCA: 537] [Impact Index Per Article: 41.3] [Received: 10/12/2011] [Accepted: 03/29/2012] [Indexed: 11/08/2022]
|
1625
|
Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential. Nat Biotechnol 2012; 30:549-54. [PMID: 22580950 DOI: 10.1038/nbt.2195] [Citation(s) in RCA: 409] [Impact Index Per Article: 31.5] [Received: 12/12/2011] [Accepted: 03/26/2012] [Indexed: 01/19/2023]
Abstract
Foxtail millet (Setaria italica), a member of the Poaceae grass family, is an important food and fodder crop in arid regions and has potential for use as a C4 biofuel. It is a model system for other biofuel grasses, including switchgrass and pearl millet. We produced a draft genome (∼423 Mb) anchored onto nine chromosomes and annotated 38,801 genes. Key chromosome reshuffling events were detected through collinearity identification between foxtail millet, rice and sorghum. These include two reshuffling events, fusing rice chromosomes 7 and 9 and rice chromosomes 3 and 10 to foxtail millet chromosomes 2 and 9, respectively, that occurred after the divergence of foxtail millet and rice, and a single reshuffling event, fusing rice chromosomes 5 and 12 to foxtail millet chromosome 3, that occurred after the divergence of millet and sorghum. Rearrangements in the C4 photosynthesis pathway were also identified.
|
1626
|
Liu D, Gong J, Dai W, Kang X, Huang Z, Zhang HM, Liu W, Liu L, Ma J, Xia Z, Chen Y, Chen Y, Wang D, Ni P, Guo AY, Xiong X. The genome of Ganoderma lucidum provides insights into triterpenes biosynthesis and wood degradation [corrected]. PLoS One 2012; 7:e36146. [PMID: 22567134 PMCID: PMC3342255 DOI: 10.1371/journal.pone.0036146] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Received: 01/30/2012] [Accepted: 03/26/2012] [Indexed: 01/08/2023] Open
Abstract
Background Ganoderma lucidum (Reishi or Ling Zhi) is one of the most famous Traditional Chinese Medicines and has been widely used in the treatment of various human diseases in Asian countries. It is also a fungus with strong wood degradation ability and potential in bioenergy production. However, the genes, pathways and mechanisms underlying these functions are still unknown. Methodology/Principal Findings The genome of G. lucidum was sequenced and assembled into a 39.9-megabase (Mb) draft genome, which encodes 12,080 protein-coding genes, ∼83% of which are similar to public sequences. We performed comprehensive annotation of G. lucidum genes and made comparisons with genes in other fungal genomes. Genes in the biosynthesis of the main G. lucidum active ingredients, ganoderic acids (GAs), were characterized. Among the GA synthases, we identified a fusion gene whose N- and C-terminal regions are homologous to two different enzymes. Moreover, the fusion gene was found only in basidiomycetes. As expected for a white rot fungus with wood degradation ability, abundant carbohydrate-active enzymes and ligninolytic enzymes were identified in the G. lucidum genome and were compared with those of other fungi. Conclusions/Significance The genome sequence and thorough annotation of G. lucidum will provide new insights for functional analyses, including of its medicinal mechanism. The characterization of genes in triterpene biosynthesis and wood degradation will facilitate bio-engineering research in the production of its active ingredients and bioenergy.
Affiliation(s)
- Dongbo Liu
- Hunan Agricultural University, Changsha, Hunan, China
|
1627
|
Bousios A, Minga E, Kalitsou N, Pantermali M, Tsaballa A, Darzentas N. MASiVEdb: the Sirevirus Plant Retrotransposon Database. BMC Genomics 2012; 13:158. [PMID: 22545773 PMCID: PMC3414828 DOI: 10.1186/1471-2164-13-158] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 10/19/2011] [Accepted: 04/30/2012] [Indexed: 11/10/2022] Open
Abstract
Background Sireviruses are an ancient genus of the Copia superfamily of LTR retrotransposons, and the only one that has exclusively proliferated within plant genomes. Based on experimental data and phylogenetic analyses, Sireviruses have successfully infiltrated many branches of the plant kingdom, extensively colonizing the genomes of grass species. Notably, it was recently shown that they have been a major force in the make-up and evolution of the maize genome, where they currently occupy ~21% of the nuclear content and ~90% of the Copia population. It is highly likely, therefore, that their life dynamics have been fundamental in the genome composition and organization of a plethora of plant hosts. To assist studies into their impact on plant genome evolution and also facilitate accurate identification and annotation of transposable elements in sequencing projects, we developed MASiVEdb (Mapping and Analysis of SireVirus Elements Database), a collective and systematic resource of Sireviruses in plants. Description Taking advantage of the increasing availability of plant genomic sequences, and using an updated version of MASiVE, an algorithm specifically designed to identify Sireviruses based on their highly conserved genome structure, we populated MASiVEdb (http://bat.infspire.org/databases/masivedb/) with data on 16,243 intact Sireviruses (total length >158 Mb) discovered in 11 fully sequenced plant genomes. MASiVEdb is unlike any other transposable element database, providing a wealth of highly curated and detailed information on a specific genus across its hosts, such as a complete set of coordinates, insertion age, and an analytical breakdown of the structure and gene complement of each element. All data are readily available through basic and advanced query interfaces, batch retrieval, and downloadable files.
A purpose-built system is also offered for detecting and visualizing similarity between user sequences and Sireviruses, as well as for coding domain discovery and phylogenetic analysis. Conclusion MASiVEdb is currently the most comprehensive directory of Sireviruses, and as such complements other efforts in cataloguing plant transposable elements and elucidating their role in host genome evolution. Such insights will gradually deepen, as we plan to further improve MASiVEdb by phylogenetically mapping Sireviruses into families, by including data on fragments and solo LTRs, and by incorporating elements from newly-released genomes.
Affiliation(s)
- Alexandros Bousios
- Institute of Agrobiotechnology, Centre for Research and Technology Hellas, Thessaloniki, 57001, Greece.
|
1628
|
Steinbauerová V, Neumann P, Novák P, Macas J. A widespread occurrence of extra open reading frames in plant Ty3/gypsy retrotransposons. Genetica 2012; 139:1543-55. [PMID: 22544262 DOI: 10.1007/s10709-012-9654-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 01/16/2012] [Accepted: 04/16/2012] [Indexed: 01/21/2023]
Abstract
Long terminal repeat (LTR) retrotransposons make up substantial parts of most higher plant genomes where they accumulate due to their replicative mode of transposition. Although the transposition is facilitated by proteins encoded within the gag-pol region which is common to all autonomous elements, some LTR retrotransposons were found to potentially carry an additional protein coding capacity represented by extra open reading frames located upstream or downstream of gag-pol. In this study, we performed a comprehensive in silico survey and comparative analysis of these extra open reading frames (ORFs) in the group of Ty3/gypsy LTR retrotransposons as the first step towards our understanding of their origin and function. We found that extra ORFs occur in all three major lineages of plant Ty3/gypsy elements, being the most frequent in the Tat lineage where most (77 %) of identified elements contained extra ORFs. This lineage was also characterized by the highest diversity of extra ORF arrangement (position and orientation) within the elements. On the other hand, all of these ORFs could be classified into only two broad groups based on their mutual similarities or the presence of short conserved motifs in their inferred protein sequences. In the Athila lineage, the extra ORFs were confined to the element 3' regions but they displayed much higher sequence diversity compared to those found in Tat. In the lineage of Chromoviruses the extra ORFs were relatively rare, occurring only in 5' regions of a group of elements present in a single plant family (Poaceae). In all three lineages, most extra ORFs lacked sequence similarities to characterized gene sequences or functional protein domains, except for two Athila-like elements with similarities to LOGL4 gene and part of the Chromoviruses extra ORFs that displayed partial similarity to histone H3 gene. Thus, in these cases the extra ORFs most likely originated by transduction or recombination of cellular gene sequences. 
In addition, a protein domain that is otherwise associated with DNA transposons has been detected in part of the Tat-like extra ORFs, pointing to their origin from an insertion event of a mobile element.
Affiliation(s)
- Veronika Steinbauerová
- Institute of Plant Molecular Biology, Biology Centre ASCR, Branišovská 31, Ceske Budejovice, Czech Republic
|
1629
|
Cegan R, Vyskot B, Kejnovsky E, Kubat Z, Blavet H, Šafář J, Doležel J, Blavet N, Hobza R. Genomic diversity in two related plant species with and without sex chromosomes--Silene latifolia and S. vulgaris. PLoS One 2012; 7:e31898. [PMID: 22393373 PMCID: PMC3290532 DOI: 10.1371/journal.pone.0031898] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 08/30/2011] [Accepted: 01/16/2012] [Indexed: 01/25/2023]
Abstract
Background Genome size evolution is a complex process influenced by polyploidization, satellite DNA accumulation, and expansion of retroelements. How this process could be affected by different reproductive strategies is still poorly understood. Methodology/Principal Findings We analyzed differences in the number and distribution of major repetitive DNA elements in two closely related species, Silene latifolia and S. vulgaris. Both species are diploid and possess the same chromosome number (2n = 24), but differ in their genome size and mode of reproduction. The dioecious S. latifolia (1C = 2.70 pg DNA) possesses sex chromosomes and its genome is 2.5× larger than that of the gynodioecious S. vulgaris (1C = 1.13 pg DNA), which does not possess sex chromosomes. We discovered that the genome of S. latifolia is larger mainly due to the expansion of Ogre retrotransposons. Surprisingly, the centromeric STAR-C and TR1 tandem repeats were found to be more abundant in S. vulgaris, the species with the smaller genome. We further examined the distribution of major repetitive sequences in related species in the Caryophyllaceae family. The results of FISH (fluorescence in situ hybridization) on mitotic chromosomes with the Retand element indicate that large rearrangements occurred during the evolution of the Caryophyllaceae family. Conclusions/Significance Our data demonstrate that the evolution of genome size in the genus Silene is accompanied by the expansion of different repetitive elements with specific patterns in the dioecious species possessing the sex chromosomes.
Affiliation(s)
- Radim Cegan
  - Department of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, Czech Republic
  - Department of Plant Biology, Faculty of Agronomy, Mendel University in Brno, Brno, Czech Republic
- Boris Vyskot
  - Department of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, Czech Republic
- Eduard Kejnovsky
  - Department of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, Czech Republic
- Zdenek Kubat
  - Department of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, Czech Republic
- Hana Blavet
  - Department of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, Czech Republic
- Jan Šafář
  - Centre of the Region Haná for Biotechnological and Agricultural Research, Institute of Experimental Botany, Olomouc, Czech Republic
- Jaroslav Doležel
  - Centre of the Region Haná for Biotechnological and Agricultural Research, Institute of Experimental Botany, Olomouc, Czech Republic
- Nicolas Blavet
  - Institute of Integrative Biology, Plant Ecological Genetics, ETH Zurich, Zurich, Switzerland
- Roman Hobza
  - Department of Plant Developmental Genetics, Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Brno, Czech Republic
|
1630
|
Goldstone HMH, Tokunaga S, Schlezinger JJ, Goldstone JV, Stegeman JJ. EZR1: a novel family of highly expressed retroelements induced by TCDD and regulated by a NF-κB-like factor in embryos of zebrafish (Danio rerio). Zebrafish 2012; 9:15-25. [PMID: 22356696 DOI: 10.1089/zeb.2011.0722] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/14/2023]
Abstract
Transcript profiling using a zebrafish heart cDNA library previously revealed abundant expressed sequence tags (ESTs) upregulated in zebrafish embryos treated with the aryl hydrocarbon receptor (AHR) agonist 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD). Here, we identify those ESTs as LTR-containing retroelements termed EZR1 (Expressed-Zebrafish-Retroelement group 1). EZR1 is highly redundant in the genome and includes canonical long terminal repeats (LTRs) flanking an integrase-like open reading frame and a region similar to retroviral envelope protein genes. EZR1 sequences lack reverse transcriptase, RNase H, or protease, indicating retrotransposition would be nonautonomous. No AHR binding motifs were found in the EZR1 promoter region. A putative NF-κB-binding site was found, and TCDD-treated zebrafish embryos had significantly increased levels of nuclear protein(s) binding to this sequence. Protein-EZR1 DNA complex formation was partially competed by a mammalian consensus κB sequence, consistent with NF-κB-like activation contributing to increased protein binding to this site. Mobility of the TCDD-induced protein-EZR1 complex differed from that of authentic NF-κB protein bound to the consensus κB site. The results suggest that EZR1 is regulated by interaction with NF-κB or NF-κB-like protein(s) different from the NF-κB protein binding to the consensus κB site. The nature of the NF-κB-like protein and the relationship between EZR1 induction and cardiovascular toxicity caused by TCDD warrant further investigation.
Affiliation(s)
- Heather M H Goldstone
- Biology Department, Woods Hole Oceanographic Institution, Woods Hole, Massachusetts, USA
|
1631
|
Gao D, Chen J, Chen M, Meyers BC, Jackson S. A highly conserved, small LTR retrotransposon that preferentially targets genes in grass genomes. PLoS One 2012; 7:e32010. [PMID: 22359654 PMCID: PMC3281118 DOI: 10.1371/journal.pone.0032010] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 12/09/2011] [Accepted: 01/18/2012] [Indexed: 12/31/2022]
Abstract
LTR retrotransposons are often the most abundant components of plant genomes and can impact gene and genome evolution. Most reported LTR retrotransposons are large elements (>4 kb) and are most often found in heterochromatic (gene poor) regions. We report the smallest LTR retrotransposon found to date, only 292 bp. The element is found in rice, maize, sorghum and other grass genomes, which indicates that it was present in the ancestor of grass species, at least 50-80 MYA. Estimated insertion times, comparisons between sequenced rice lines, and mRNA data indicate that this element may still be active in some genomes. Unlike other LTR retrotransposons, the small LTR retrotransposons (SMARTs) are distributed throughout the genomes and are often located within or near genes with insertion patterns similar to MITEs (miniature inverted repeat transposable elements). Our data suggest that insertions of SMARTs into or near genes can, in a few instances, alter both gene structures and gene expression. As further evidence for a role in regulating gene expression, SMART-specific small RNAs (sRNAs) were identified that may be involved in gene regulation. Thus, SMARTs may have played an important role in genome evolution and genic innovation and may provide a valuable tool for gene tagging systems in grass.
Affiliation(s)
- Dongying Gao
  - Center for Applied Genetic Technologies and Institute for Plant Breeding Genetics and Genomics, University of Georgia, Athens, Georgia, United States of America
- Jinfeng Chen
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Mingsheng Chen
  - State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Blake C. Meyers
  - Department of Plant and Soil Sciences, and Delaware Biotechnology Institute, University of Delaware, Newark, Delaware, United States of America
- Scott Jackson
  - Center for Applied Genetic Technologies and Institute for Plant Breeding Genetics and Genomics, University of Georgia, Athens, Georgia, United States of America
|
1632
|
Characterization of transcriptional activation and inserted-into-gene preference of various transposable elements in the Brassica species. Mol Biol Rep 2012; 39:7513-23. [DOI: 10.1007/s11033-012-1585-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/18/2011] [Accepted: 01/30/2012] [Indexed: 12/24/2022]
|
1633
|
Liu SY, Lin JQ, Wu HL, Wang CC, Huang SJ, Luo YF, Sun JH, Zhou JX, Yan SJ, He JG, Wang J, He ZM. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation. PLoS One 2012; 7:e30349. [PMID: 22276181 PMCID: PMC3262820 DOI: 10.1371/journal.pone.0030349] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Received: 09/14/2011] [Accepted: 12/14/2011] [Indexed: 12/12/2022]
Abstract
Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has served as a theoretical model for the biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the status of DNA methylation in A. flavus, one of the important epigenomic modifications involved in gene regulation, has remained controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to profile DNA methylation in the A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence from the methyltransferases of species with validated DNA methylation. The lack of repeat content in the A. flavus genome and the high RIP-index of the small number of remnant repeats potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding of the DNA methylation status of A. flavus and reinforces our views on DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.
Affiliation(s)
- Si-Yang Liu
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
  - BGI-Shenzhen, Shenzhen, China
- Jian-Qing Lin
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Cheng-Cheng Wang
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Yan-Feng Luo
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Jian-Xiang Zhou
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Jian-Guo He
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
  - * E-mail: (J-GH); (JW); (Z-MH)
- Jun Wang
  - BGI-Shenzhen, Shenzhen, China
  - * E-mail: (J-GH); (JW); (Z-MH)
- Zhu-Mei He
  - MOE Key Laboratory of Aquatic Product Safety, Key Laboratory of Gene Engineering of the Ministry of Education, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
  - * E-mail: (J-GH); (JW); (Z-MH)
|
1634
|
Janicki M, Rooke R, Yang G. Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes. Chromosome Res 2012; 19:787-808. [PMID: 21850457 DOI: 10.1007/s10577-011-9230-7] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 12/18/2022]
Abstract
A major portion of most eukaryotic genomes are transposable elements (TEs). During evolution, TEs have introduced profound changes to genome size, structure, and function. As integral parts of genomes, the dynamic presence of TEs will continue to be a major force in reshaping genomes. Early computational analyses of TEs in genome sequences focused on filtering out "junk" sequences to facilitate gene annotation. When the high abundance and diversity of TEs in eukaryotic genomes were recognized, these early efforts transformed into the systematic genome-wide categorization and classification of TEs. The availability of genomic sequence data reversed the classical genetic approaches to discovering new TE families and superfamilies. Curated TE databases and their accurate annotation of genome sequences in turn facilitated the studies on TEs in a number of frontiers including: (1) TE-mediated changes of genome size and structure, (2) the influence of TEs on genome and gene functions, (3) TE regulation by host, (4) the evolution of TEs and their population dynamics, and (5) genomic scale studies of TE activity. Bioinformatics and genomic approaches have become an integral part of large-scale studies on TEs to extract information with pure in silico analyses or to assist wet lab experimental studies. The current revolution in genome sequencing technology facilitates further progress in the existing frontiers of research and emergence of new initiatives. The rapid generation of large-sequence datasets at record low costs on a routine basis is challenging the computing industry on storage capacity and manipulation speed and the bioinformatics community for improvement in algorithms and their implementations.
Affiliation(s)
- Mateusz Janicki
- Department of Biology, University of Toronto at Mississauga, 3359 Mississauga Road, Mississauga, ON L5L1C6, Canada
|
1635
|
Flutre T, Permal E, Quesneville H. Transposable Element Annotation in Completely Sequenced Eukaryote Genomes. PLANT TRANSPOSABLE ELEMENTS 2012. [DOI: 10.1007/978-3-642-31842-9_2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
|
1636
|
Abstract
Most genomes are populated by thousands of sequences that originated from mobile elements. On the one hand, these sequences present a real challenge in the process of genome analysis and annotation. On the other hand, they are very interesting biological subjects involved in many cellular processes. Here, we present an overview of the biodiversity of transposable elements (TEs) and their impact on genomic evolution. Finally, we discuss different approaches to TE detection and analysis.
|
1637
|
Muszewska A, Hoffman-Sommer M, Grynberg M. LTR retrotransposons in fungi. PLoS One 2011; 6:e29425. [PMID: 22242120 PMCID: PMC3248453 DOI: 10.1371/journal.pone.0029425] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.6] [Received: 03/18/2011] [Accepted: 11/28/2011] [Indexed: 01/17/2023]
Abstract
Transposable elements with long terminal direct repeats (LTR TEs) are one of the best studied groups of mobile elements. They are ubiquitous elements present in almost all eukaryotic genomes. Their number and state of conservation can be a highlight of genome dynamics. We searched all published fungal genomes for LTR-containing retrotransposons, including both complete, functional elements and remnant copies. We identified a total of over 66,000 elements, all of which belong to the Ty1/Copia or Ty3/Gypsy superfamilies. Most of the detected Gypsy elements represent Chromoviridae, i.e. they carry a chromodomain in the pol ORF. We analyzed our data from a genome-ecology perspective, looking at the abundance of various types of LTR TEs in individual genomes and at the highest-copy element from each genome. The TE content is very variable among the analyzed genomes. Some genomes are very scarce in LTR TEs (<50 elements), others demonstrate huge expansions (>8000 elements). The data shows that transposon expansions in fungi usually involve an increase both in the copy number of individual elements and in the number of element types. The majority of the highest-copy TEs from all genomes are Ty3/Gypsy transposons. Phylogenetic analysis of these elements suggests that TE expansions have appeared independently of each other, in distant genomes and at different taxonomical levels. We also analyzed the evolutionary relationships between protein domains encoded by the transposon pol ORF and we found that the protease is the fastest evolving domain whereas reverse transcriptase and RNase H evolve much slower and in correlation with each other.
Affiliation(s)
- Anna Muszewska
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland.
|
1638
|
Li Z, Zhang Z, Yan P, Huang S, Fei Z, Lin K. RNA-Seq improves annotation of protein-coding genes in the cucumber genome. BMC Genomics 2011; 12:540. [PMID: 22047402 PMCID: PMC3219749 DOI: 10.1186/1471-2164-12-540] [Citation(s) in RCA: 133] [Impact Index Per Article: 9.5] [Received: 03/10/2011] [Accepted: 11/02/2011] [Indexed: 01/02/2023]
Abstract
BACKGROUND As more and more genomes are sequenced, genome annotation becomes increasingly important in bridging the gap between sequence and biology. Gene prediction, which is at the center of genome annotation, usually integrates various resources to compute consensus gene structures. However, many newly sequenced genomes have limited resources for gene predictions. In an effort to create high-quality gene models of the cucumber genome (Cucumis sativus var. sativus), based on the EVidenceModeler gene prediction pipeline, we incorporated the massively parallel complementary DNA sequencing (RNA-Seq) reads of 10 cucumber tissues into EVidenceModeler. We applied the new pipeline to the reassembled cucumber genome and included a comparison between our predicted protein-coding gene sets and a published set. RESULTS The reassembled cucumber genome, annotated with RNA-Seq reads from 10 tissues, has 23,248 identified protein-coding genes. Compared with the published prediction in 2009, approximately 8,700 genes reveal structural modifications and 5,285 genes only appear in the reassembled cucumber genome. All the related results, including genome sequence and annotations, are available at http://cmb.bnu.edu.cn/Cucumis_sativus_v20/. CONCLUSIONS We conclude that RNA-Seq greatly improves the accuracy of prediction of protein-coding genes in the reassembled cucumber genome. The comparison between the two gene sets also suggests that it is feasible to use RNA-Seq reads to annotate newly sequenced or less-studied genomes.
Affiliation(s)
- Zhen Li
- College of Life Sciences, Beijing Normal University, 19 Xinjiekouwai Street, Beijing, 100875, China
|
1639
|
|
1640
|
Desjardins CA, Champion MD, Holder JW, Muszewska A, Goldberg J, Bailão AM, Brigido MM, Ferreira MEDS, Garcia AM, Grynberg M, Gujja S, Heiman DI, Henn MR, Kodira CD, León-Narváez H, Longo LVG, Ma LJ, Malavazi I, Matsuo AL, Morais FV, Pereira M, Rodríguez-Brito S, Sakthikumar S, Salem-Izacc SM, Sykes SM, Teixeira MM, Vallejo MC, Walter MEMT, Yandava C, Young S, Zeng Q, Zucker J, Felipe MS, Goldman GH, Haas BJ, McEwen JG, Nino-Vega G, Puccia R, San-Blas G, Soares CMDA, Birren BW, Cuomo CA. Comparative genomic analysis of human fungal pathogens causing paracoccidioidomycosis. PLoS Genet 2011; 7:e1002345. [PMID: 22046142 PMCID: PMC3203195 DOI: 10.1371/journal.pgen.1002345] [Citation(s) in RCA: 126] [Impact Index Per Article: 9.0] [Received: 04/19/2011] [Accepted: 08/30/2011] [Indexed: 12/29/2022]
Abstract
Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasiliensis (Pb03 and Pb18) and one strain of Paracoccidioides lutzii (Pb01). These genomes range in size from 29.1 Mb to 32.9 Mb and encode 7,610 to 8,130 genes. To enable genetic studies, we mapped 94% of the P. brasiliensis Pb18 assembly onto five chromosomes. We characterized gene family content across Onygenales and related fungi, and within Paracoccidioides we found expansions of the fungal-specific kinase family FunK1. Additionally, the Onygenales have lost many genes involved in carbohydrate metabolism and fewer genes involved in protein metabolism, resulting in a higher ratio of proteases to carbohydrate active enzymes in the Onygenales than their relatives. To determine if gene content correlated with growth on different substrates, we screened the non-pathogenic onygenale Uncinocarpus reesii, which has orthologs for 91% of Paracoccidioides metabolic genes, for growth on 190 carbon sources. U. reesii showed growth on a limited range of carbohydrates, primarily basic plant sugars and cell wall components; this suggests that Onygenales, including dimorphic fungi, can degrade cellulosic plant material in the soil. In addition, U. reesii grew on gelatin and a wide range of dipeptides and amino acids, indicating a preference for proteinaceous growth substrates over carbohydrates, which may enable these fungi to also degrade animal biomass. 
These capabilities for degrading plant and animal substrates suggest a duality in lifestyle that could enable pathogenic species of Onygenales to transfer from soil to animal hosts.
Affiliation(s)
- Mia D. Champion
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Jason W. Holder
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Anna Muszewska
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warszawa, Poland
- Jonathan Goldberg
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Alexandre M. Bailão
  - Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
- Ana Maria Garcia
  - Unidad de Biología Celular y Molecular, Corporación para Investigaciones Biológicas, Medellín, Colombia
- Marcin Grynberg
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warszawa, Poland
- Sharvari Gujja
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- David I. Heiman
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Matthew R. Henn
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Chinnappa D. Kodira
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Henry León-Narváez
  - Centro de Microbiología y Biología Celular, Instituto Venezolano de Investigaciones Científicas, Caracas, Venezuela
- Larissa V. G. Longo
  - Departamento de Microbiologia, Imunologia, e Parasitologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
- Li-Jun Ma
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Iran Malavazi
  - Faculdade de Ciências Farmacêuticas de Ribeirão Preto Universidade de São Paulo, Ribeirão Preto, Brazil
- Alisson L. Matsuo
  - Departamento de Microbiologia, Imunologia, e Parasitologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
- Flavia V. Morais
  - Departamento de Microbiologia, Imunologia, e Parasitologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
  - Instituto de Pesquisa y Desenvolvimento, Universidade do Vale do Paraíba, São José dos Campos, Brazil
- Maristela Pereira
  - Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
- Sabrina Rodríguez-Brito
  - Centro de Microbiología y Biología Celular, Instituto Venezolano de Investigaciones Científicas, Caracas, Venezuela
- Sharadha Sakthikumar
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Silvia M. Salem-Izacc
  - Laboratório de Biologia Molecular, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiânia, Brazil
- Sean M. Sykes
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Milene C. Vallejo
  - Departamento de Microbiologia, Imunologia, e Parasitologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
- Chandri Yandava
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Sarah Young
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Qiandong Zeng
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Jeremy Zucker
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Maria Sueli Felipe
  - Instituto de Ciências Biológicas, Universidade de Brasília, Brasília, Brazil
- Gustavo H. Goldman
  - Faculdade de Ciências Farmacêuticas de Ribeirão Preto Universidade de São Paulo, Ribeirão Preto, Brazil
  - Laboratório Nacional de Ciência e Tecnologia do Bioetanol – CTBE, São Paulo, Brazil
- Brian J. Haas
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Juan G. McEwen
  - Unidad de Biología Celular y Molecular, Corporación para Investigaciones Biológicas, Medellín, Colombia
  - Facultad de Medicina, Universidad de Antioquia, Medellín, Colombia
- Gustavo Nino-Vega
  - Centro de Microbiología y Biología Celular, Instituto Venezolano de Investigaciones Científicas, Caracas, Venezuela
- Rosana Puccia
  - Departamento de Microbiologia, Imunologia, e Parasitologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, São Paulo, Brazil
- Gioconda San-Blas
  - Centro de Microbiología y Biología Celular, Instituto Venezolano de Investigaciones Científicas, Caracas, Venezuela
- Bruce W. Birren
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Christina A. Cuomo
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
|
1641
|
|
1642
|
Buti M, Giordani T, Cattonaro F, Cossu RM, Pistelli L, Vukich M, Morgante M, Cavallini A, Natali L. Temporal dynamics in the evolution of the sunflower genome as revealed by sequencing and annotation of three large genomic regions. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2011; 123:779-91. [PMID: 21647740 DOI: 10.1007/s00122-011-1626-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 02/12/2011] [Accepted: 05/09/2011] [Indexed: 05/02/2023]
Abstract
Improved knowledge of genome composition, especially of its repetitive component, generates important information for both theoretical and applied research. In this study, we provide the first insight into the local organization of the sunflower genome by sequencing and annotating 349,380 bp from 3 BAC clones, each including one single-copy gene. These analyses resulted in the identification of 11 putative gene sequences, 18 full-length LTR retrotransposons, 6 incomplete LTR retrotransposons, 2 non-LTR retroelements (LINEs), 2 putative DNA transposon fragments and one putative helitron. Among LTR retrotransposons, non-autonomous elements (the so-called LARDs), which do not carry any protein-encoding sequence, were discovered for the first time in the sunflower. The insertion time of intact retroelements was estimated from the divergence between sister LTRs. All isolated elements were inserted relatively recently, especially those belonging to the Gypsy superfamily. Retrotransposon families related to those identified in the BAC clones are also present in other species of Helianthus, both annual and perennial, and even in other Asteraceae. In one of the three BAC clones, we found five copies of a lipid transfer protein (LTP) encoding gene within less than 100,000 bp, four of which are potentially functional. Two of these are interrupted by LTR retrotransposons, in the intron and in the coding sequence, respectively. The divergence between sister LTRs of the retrotransposons inserted within the genes indicates that LTP gene duplication started earlier than 1.749 MYRS ago. On the whole, the results reported in this study confirm that the sunflower is an excellent system to study transposon dynamics and evolution.
Affiliation(s)
- M Buti
- Department of Crop Plant Biology, University of Pisa, Pisa, Italy
|
1643
|
Renfree MB, Papenfuss AT, Deakin JE, Lindsay J, Heider T, Belov K, Rens W, Waters PD, Pharo EA, Shaw G, Wong ESW, Lefèvre CM, Nicholas KR, Kuroki Y, Wakefield MJ, Zenger KR, Wang C, Ferguson-Smith M, Nicholas FW, Hickford D, Yu H, Short KR, Siddle HV, Frankenberg SR, Chew KY, Menzies BR, Stringer JM, Suzuki S, Hore TA, Delbridge ML, Mohammadi A, Schneider NY, Hu Y, O'Hara W, Al Nadaf S, Wu C, Feng ZP, Cocks BG, Wang J, Flicek P, Searle SMJ, Fairley S, Beal K, Herrero J, Carone DM, Suzuki Y, Sugano S, Toyoda A, Sakaki Y, Kondo S, Nishida Y, Tatsumoto S, Mandiou I, Hsu A, McColl KA, Lansdell B, Weinstock G, Kuczek E, McGrath A, Wilson P, Men A, Hazar-Rethinam M, Hall A, Davis J, Wood D, Williams S, Sundaravadanam Y, Muzny DM, Jhangiani SN, Lewis LR, Morgan MB, Okwuonu GO, Ruiz SJ, Santibanez J, Nazareth L, Cree A, Fowler G, Kovar CL, Dinh HH, Joshi V, Jing C, Lara F, Thornton R, Chen L, Deng J, Liu Y, Shen JY, Song XZ, Edson J, Troon C, Thomas D, Stephens A, Yapa L, Levchenko T, Gibbs RA, Cooper DW, Speed TP, Fujiyama A, M Graves JA, O'Neill RJ, Pask AJ, Forrest SM, Worley KC. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development. Genome Biol 2011; 12:R81. [PMID: 21854559 PMCID: PMC3277949 DOI: 10.1186/gb-2011-12-8-r81] [Citation(s) in RCA: 147] [Impact Index Per Article: 10.5] [Received: 05/23/2011] [Revised: 07/22/2011] [Accepted: 08/19/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. RESULTS The genome has been sequenced to 2× coverage using Sanger sequencing, enhanced with additional next-generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. CONCLUSIONS Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution.
Affiliation(s)
- Marilyn B Renfree
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Anthony T Papenfuss
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
- Department of Mathematics and Statistics, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Janine E Deakin
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia
| | - James Lindsay
- Department of Molecular and Cell Biology, Center for Applied Genetics and Technology, University of Connecticut, Storrs, CT 06269, USA
| | - Thomas Heider
- Department of Molecular and Cell Biology, Center for Applied Genetics and Technology, University of Connecticut, Storrs, CT 06269, USA
| | - Katherine Belov
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Willem Rens
- Department of Veterinary Medicine, University of Cambridge, Madingley Rd, Cambridge, CB3 0ES, UK
| | - Paul D Waters
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia
| | - Elizabeth A Pharo
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Geoff Shaw
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Emily SW Wong
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Christophe M Lefèvre
- Institute for Technology Research and Innovation, Deakin University, Geelong, Victoria, 3214, Australia
| | - Kevin R Nicholas
- Institute for Technology Research and Innovation, Deakin University, Geelong, Victoria, 3214, Australia
| | - Yoko Kuroki
- RIKEN Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Matthew J Wakefield
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
| | - Kyall R Zenger
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
- School of Marine and Tropical Biology, James Cook University, Townsville, Queensland 4811, Australia
| | - Chenwei Wang
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Malcolm Ferguson-Smith
- Department of Veterinary Medicine, University of Cambridge, Madingley Rd, Cambridge, CB3 0ES, UK
| | - Frank W Nicholas
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Danielle Hickford
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Hongshi Yu
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Kirsty R Short
- Department of Microbiology and Immunology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Hannah V Siddle
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Stephen R Frankenberg
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Keng Yih Chew
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Brandon R Menzies
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
- Leibniz Institute for Zoo and Wildlife Research, Alfred-Kowalke-Str. 17, Berlin 10315, Germany
| | - Jessica M Stringer
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Shunsuke Suzuki
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Timothy A Hore
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Laboratory of Developmental Genetics and Imprinting, The Babraham Institute, Cambridge, CB22 3AT, UK
| | - Margaret L Delbridge
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia
| | - Amir Mohammadi
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia
| | - Nanette Y Schneider
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
- Department of Molecular Genetics, German Institute of Human Nutrition, Potsdam-Rehbruecke, Arthur-Scheunert-Allee 114-116, 14558 Nuthetal, Germany
| | - Yanqiu Hu
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - William O'Hara
- Department of Molecular and Cell Biology, Center for Applied Genetics and Technology, University of Connecticut, Storrs, CT 06269, USA
| | - Shafagh Al Nadaf
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia
| | - Chen Wu
- Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Zhi-Ping Feng
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Benjamin G Cocks
- Biosciences Research Division, Department of Primary Industries, Victoria, 1 Park Drive, Bundoora 3083, Australia
| | - Jianghui Wang
- Biosciences Research Division, Department of Primary Industries, Victoria, 1 Park Drive, Bundoora 3083, Australia
| | - Paul Flicek
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Stephen MJ Searle
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Susan Fairley
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Kathryn Beal
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Javier Herrero
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Dawn M Carone
- Department of Molecular and Cell Biology, Center for Applied Genetics and Technology, University of Connecticut, Storrs, CT 06269, USA
- Department of Cell Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
| | - Yutaka Suzuki
- Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8560, Japan
| | - Sumio Sugano
- Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8560, Japan
| | - Atsushi Toyoda
- National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
| | - Yoshiyuki Sakaki
- RIKEN Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Shinji Kondo
- RIKEN Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Yuichiro Nishida
- RIKEN Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Shoji Tatsumoto
- RIKEN Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Ion Mandiou
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Arthur Hsu
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
- Department of Medical Biology, The University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Kaighin A McColl
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
| | - Benjamin Lansdell
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
| | - George Weinstock
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Elizabeth Kuczek
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
- Westmead Institute for Cancer Research, University of Sydney, Westmead, New South Wales 2145, Australia
| | - Annette McGrath
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Peter Wilson
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Artem Men
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Mehlika Hazar-Rethinam
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Allison Hall
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - John Davis
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - David Wood
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Sarah Williams
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Yogi Sundaravadanam
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Donna M Muzny
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Shalini N Jhangiani
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Lora R Lewis
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Margaret B Morgan
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Geoffrey O Okwuonu
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - San Juana Ruiz
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Jireh Santibanez
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Lynne Nazareth
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Andrew Cree
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Gerald Fowler
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Christie L Kovar
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Huyen H Dinh
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Vandita Joshi
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Chyn Jing
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Fremiet Lara
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Rebecca Thornton
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Lei Chen
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Jixin Deng
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Yue Liu
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Joshua Y Shen
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Xing-Zhi Song
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Janette Edson
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Carmen Troon
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Daniel Thomas
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Amber Stephens
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Lankesha Yapa
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Tanya Levchenko
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Richard A Gibbs
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
| | - Desmond W Cooper
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Biological, Earth and Environmental Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Terence P Speed
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
| | - Asao Fujiyama
- National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
| | - Jennifer A M Graves
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Research School of Biology, The Australian National University, Canberra, ACT 0200, Australia
| | - Rachel J O'Neill
- Department of Molecular and Cell Biology, Center for Applied Genetics and Technology, University of Connecticut, Storrs, CT 06269, USA
| | - Andrew J Pask
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Department of Zoology, The University of Melbourne, Melbourne, Victoria 3010, Australia
- Department of Molecular and Cell Biology, Center for Applied Genetics and Technology, University of Connecticut, Storrs, CT 06269, USA
| | - Susan M Forrest
- The Australian Research Council Centre of Excellence in Kangaroo Genomics, Australia
- Australian Genome Research Facility, Melbourne, Victoria, 3052 and the University of Queensland, St Lucia, Queensland 4072, Australia
| | - Kim C Worley
- Human Genome Sequencing Center, Department of Molecular and Human Genetics Baylor College of Medicine, Houston, TX 77030, USA
1644
Abstract
SCAN is a protein domain frequently found at the N termini of proteins encoded by mammalian tandem zinc finger (ZF) genes, whose structure is known to be similar to that of retroviral gag capsid domains and whose multimerization has been proposed as a model for retroviral assembly. We report that the SCAN domain is derived from the C-terminal portion of the gag capsid (CA) protein from the Gmr1-like family of Gypsy/Ty3-like retrotransposons. On the basis of sequence alignments and phylogenetic distributions, we show that the ancestral host SCAN domain (ESCAN for extended SCAN) was exapted from a full-length CA gene from a Gmr1-like retrotransposon at or near the root of the tetrapod animal branch. A truncated variant of ESCAN that corresponds to the annotated SCAN domain arose shortly thereafter and appears to be the only form extant in mammals. The Anolis lizard has a large number of tandem ZF genes with N-terminal ESCAN or SCAN domains. We predict DNA binding sites for all Anolis ESCAN-ZF and SCAN-ZF proteins and demonstrate several highly significant matches to Anolis Gmr1-like sequences, suggesting that at least some of these proteins target retroelements. SCAN is known to mediate protein dimerization, and the CA protein multimerizes to form the core retroviral and retrotransposon capsid structure. We speculate that the SCAN domain originally functioned to target host ZF proteins to retroelement capsids.
1645
Kemen E, Gardiner A, Schultz-Larsen T, Kemen AC, Balmuth AL, Robert-Seilaniantz A, Bailey K, Holub E, Studholme DJ, MacLean D, Jones JDG. Gene gain and loss during evolution of obligate parasitism in the white rust pathogen of Arabidopsis thaliana. PLoS Biol 2011; 9:e1001094. [PMID: 21750662 PMCID: PMC3130010 DOI: 10.1371/journal.pbio.1001094] [Citation(s) in RCA: 188] [Impact Index Per Article: 13.4] [Received: 10/27/2010] [Accepted: 05/10/2011] [Indexed: 01/21/2023] Open
Abstract
Biotrophic eukaryotic plant pathogens require a living host for their growth and form an intimate haustorial interface with parasitized cells. Evolution to biotrophy occurred independently in fungal rusts and powdery mildews, and in oomycete white rusts and downy mildews. Biotroph evolution and molecular mechanisms of biotrophy are poorly understood. It has been proposed, but not shown, that obligate biotrophy results from (i) reduced selection for maintenance of biosynthetic pathways and (ii) gain of mechanisms to evade host recognition or suppress host defence. Here we use Illumina sequencing to define the genome, transcriptome, and gene models for the obligate biotroph oomycete and Arabidopsis parasite, Albugo laibachii. A. laibachii is a member of the Chromalveolata, which incorporates Heterokonts (containing the oomycetes), Apicomplexa (which includes human parasites like Plasmodium falciparum and Toxoplasma gondii), and four other taxa. From comparisons with other oomycete plant pathogens and other chromalveolates, we reveal independent loss of molybdenum-cofactor-requiring enzymes in downy mildews, white rusts, and the malaria parasite P. falciparum. Biotrophy also requires "effectors" to suppress host defence; we reveal RXLR and Crinkler effectors shared with other oomycetes, and also discover and verify a novel class of effectors, the "CHXCs", by showing effector delivery and effector functionality. Our findings suggest that evolution to progressively more intimate association between host and parasite results in reduced selection for retention of certain biosynthetic pathways, and particularly reduced selection for retention of molybdopterin-requiring biosynthetic pathways. These mechanisms are not only relevant to plant pathogenic oomycetes but also to human pathogens within the Chromalveolata.
Affiliation(s)
- Eric Kemen
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
| | - Anastasia Gardiner
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
| | | | - Ariane C. Kemen
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
| | - Alexi L. Balmuth
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
- The GenePool, The University of Edinburgh, Edinburgh, United Kingdom
| | | | - Kate Bailey
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
| | - Eric Holub
- School of Life Sciences, University of Warwick, Wellesbourne Campus, United Kingdom
| | | | - Dan MacLean
- The Sainsbury Laboratory, Norwich Research Park, Norwich, United Kingdom
1646
De novo genome sequencing and comparative genomics of date palm (Phoenix dactylifera). Nat Biotechnol 2011; 29:521-7. [PMID: 21623354 DOI: 10.1038/nbt.1860] [Citation(s) in RCA: 198] [Impact Index Per Article: 14.1] [Received: 02/17/2011] [Accepted: 03/29/2011] [Indexed: 02/06/2023]
Abstract
Date palm is one of the most economically important woody crops cultivated in the Middle East and North Africa and is a good candidate for improving agricultural yields in arid environments. Nonetheless, long generation times (5-8 years) and dioecy (separate male and female trees) have complicated its cultivation and genetic analysis. To address these issues, we assembled a draft genome for a Khalas variety female date palm, the first publicly available resource of its type for a member of the order Arecales. The ∼380 Mb sequence, spanning mainly gene-rich regions, includes >25,000 gene models and is predicted to cover ∼90% of genes and ∼60% of the genome. Sequencing of eight other cultivars, including females of the Deglet Noor and Medjool varieties and their backcrossed males, identified >3.5 million polymorphic sites, including >10,000 genic copy number variations. A small subset of these polymorphisms can distinguish multiple varieties. We identified a region of the genome linked to gender and found evidence that date palm employs an XY system of gender inheritance.
1647
Banks JA, Nishiyama T, Hasebe M, Bowman JL, Gribskov M, dePamphilis C, Albert VA, Aono N, Aoyama T, Ambrose BA, Ashton NW, Axtell MJ, Barker E, Barker MS, Bennetzen JL, Bonawitz ND, Chapple C, Cheng C, Correa LGG, Dacre M, DeBarry J, Dreyer I, Elias M, Engstrom EM, Estelle M, Feng L, Finet C, Floyd SK, Frommer WB, Fujita T, Gramzow L, Gutensohn M, Harholt J, Hattori M, Heyl A, Hirai T, Hiwatashi Y, Ishikawa M, Iwata M, Karol KG, Koehler B, Kolukisaoglu U, Kubo M, Kurata T, Lalonde S, Li K, Li Y, Litt A, Lyons E, Manning G, Maruyama T, Michael TP, Mikami K, Miyazaki S, Morinaga SI, Murata T, Mueller-Roeber B, Nelson DR, Obara M, Oguri Y, Olmstead RG, Onodera N, Petersen BL, Pils B, Prigge M, Rensing SA, Riaño-Pachón DM, Roberts AW, Sato Y, Scheller HV, Schulz B, Schulz C, Shakirov EV, Shibagaki N, Shinohara N, Shippen DE, Sørensen I, Sotooka R, Sugimoto N, Sugita M, Sumikawa N, Tanurdzic M, Theissen G, Ulvskov P, Wakazuki S, Weng JK, Willats WWGT, Wipf D, Wolf PG, Yang L, Zimmer AD, Zhu Q, Mitros T, Hellsten U, Loqué D, Otillar R, Salamov A, Schmutz J, Shapiro H, Lindquist E, Lucas S, Rokhsar D, Grigoriev IV. The Selaginella genome identifies genetic changes associated with the evolution of vascular plants. Science 2011; 332:960-3. [PMID: 21551031 PMCID: PMC3166216 DOI: 10.1126/science.1203810] [Citation(s) in RCA: 613] [Impact Index Per Article: 43.8] [Indexed: 12/24/2022]
Abstract
Vascular plants appeared ~410 million years ago, then diverged into several lineages of which only two survive: the euphyllophytes (ferns and seed plants) and the lycophytes. We report here the genome sequence of the lycophyte Selaginella moellendorffii (Selaginella), the first nonseed vascular plant genome reported. By comparing gene content in evolutionarily diverse taxa, we found that the transition from a gametophyte- to a sporophyte-dominated life cycle required far fewer new genes than the transition from a nonseed vascular to a flowering plant, whereas secondary metabolic genes expanded extensively and in parallel in the lycophyte and angiosperm lineages. Selaginella differs in posttranscriptional gene regulation, including small RNA regulation of repetitive elements, an absence of the trans-acting small interfering RNA pathway, and extensive RNA editing of organellar genes.
Affiliation(s)
- Jo Ann Banks
- Department of Botany and Plant Pathology, Purdue University, West Lafayette, IN 47907, USA.
1648
Spliceosomal intron size expansion in domesticated grapevine (Vitis vinifera). BMC Res Notes 2011; 4:52. [PMID: 21385391 PMCID: PMC3058033 DOI: 10.1186/1756-0500-4-52] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 01/13/2011] [Accepted: 03/08/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Spliceosomal introns are important components of eukaryotic genes, as their structure, sizes and contents reflect the architecture of genes and genomes. Intron size, determined by neutral evolution, repetitive element activity and potential functional constraints, varies significantly in eukaryotes, suggesting unique dynamics and evolution in different lineages of eukaryotic organisms. However, the evolution of intron size is rarely studied. To investigate intron size dynamics in flowering plants, in particular domesticated grapevines, a survey of intron size and content in wine grape (Vitis vinifera Pinot Noir) genes was conducted by assembling and mapping the transcriptome of V. vinifera genes from ESTs to characterize and analyze spliceosomal introns. RESULTS Uncommonly large spliceosomal introns were observed in the V. vinifera genome, a pattern inconsistent with overall genome size dynamics when comparing Arabidopsis, Populus and Vitis. In domesticated grapevine, intron size is generally not related to gene function. The composition of enlarged introns in grapevines indicated extensive transposable element (TE) activity within intronic regions. TEs comprise about 80% of the expanded intron space and, in particular, recent LTR retrotransposon insertions are enriched in these intronic regions, suggesting an intron size expansion in the lineage leading to domesticated grapevine rather than size contractions in Arabidopsis and Populus. Comparative analysis of selected intronic regions in V. vinifera cultivars and wild grapevine species revealed that accelerated TE activity was associated with grapevine domestication, and in some cases with the development of specific cultivars.
CONCLUSIONS In this study, we showed intron size expansion driven by TE activity in domesticated grapevines, likely a result of long-term vegetative propagation and intensive human care, which simultaneously promote TE proliferation and repress TE removal mechanisms such as recombination. The intron size expansion observed in domesticated grapevines provides an example of rapid plant genome evolution in response to artificial selection and propagation, and may shed light on the important genomic changes during domestication. In addition, the transcriptome approach used to gather intron size data significantly improved annotations of the V. vinifera genome.
|
1649
|
Plant centromeric retrotransposons: a structural and cytogenetic perspective. Mob DNA 2011; 2:4. [PMID: 21371312 PMCID: PMC3059260 DOI: 10.1186/1759-8753-2-4] [Citation(s) in RCA: 153] [Impact Index Per Article: 10.9] [Received: 10/08/2010] [Accepted: 03/03/2011] [Indexed: 12/12/2022] Open
Abstract
Background The centromeric and pericentromeric regions of plant chromosomes are colonized by Ty3/gypsy retrotransposons, which, on the basis of their reverse transcriptase sequences, form the chromovirus CRM clade. Despite their potential importance for centromere evolution and function, they have remained poorly characterized. In this work, we aimed to carry out a comprehensive survey of CRM clade elements with an emphasis on their diversity, structure, chromosomal distribution and transcriptional activity. Results We have surveyed a set of 190 CRM elements belonging to 81 different retrotransposon families, derived from 33 host species and falling into 12 plant families. The sequences at the C-terminus of their integrases were unexpectedly heterogeneous, despite the understanding that they are responsible for targeting to the centromere. This variation allowed the division of the CRM clade into the three groups A, B and C, and the members of each differed considerably with respect to their chromosomal distribution. The differences in chromosomal distribution coincided with variation in the integrase C-terminus sequences possessing a putative targeting domain (PTD). A majority of the group A elements possess the CR motif and are concentrated in the centromeric region, while members of group C have the type II chromodomain and are dispersed throughout the genome. Although representatives of the group B lack a PTD of any type, they appeared to be localized preferentially in the centromeres of tested species. All tested elements were found to be transcriptionally active. Conclusions Comprehensive analysis of the CRM clade elements showed that genuinely centromeric retrotransposons represent only a fraction of the CRM clade (group A). These centromeric retrotransposons represent an active component of centromeres of a wide range of angiosperm species, implying that they play an important role in plant centromere evolution. In addition, their transcriptional activity is consistent with the notion that the transcription of centromeric retrotransposons has a role in normal centromere function.
|
1650
|
Finkers-Tomczak A, Bakker E, de Boer J, van der Vossen E, Achenbach U, Golas T, Suryaningrat S, Smant G, Bakker J, Goverse A. Comparative sequence analysis of the potato cyst nematode resistance locus H1 reveals a major lack of co-linearity between three haplotypes in potato (Solanum tuberosum ssp.). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2011; 122:595-608. [PMID: 21049265 PMCID: PMC3026667 DOI: 10.1007/s00122-010-1472-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 06/15/2010] [Accepted: 09/30/2010] [Indexed: 05/04/2023]
Abstract
The H1 locus confers resistance to the potato cyst nematode Globodera rostochiensis pathotypes 1 and 4. It is positioned at the distal end of chromosome V of the diploid Solanum tuberosum genotype SH83-92-488 (SH) on an introgression segment derived from S. tuberosum ssp. andigena. Markers from a high-resolution genetic map of the H1 locus (Bakker et al. in Theor Appl Genet 109:146-152, 2004) were used to screen a BAC library to construct a physical map covering a 341-kb region of the resistant haplotype coming from SH. For comparison, physical maps were also generated of the two haplotypes from the diploid susceptible genotype RH89-039-16 (S. tuberosum ssp. tuberosum/S. phureja), spanning syntenic regions of 700 and 319 kb. Gene predictions on the genomic segments resulted in the identification of a large cluster consisting of variable numbers of the CC-NB-LRR type of R genes for each haplotype. Furthermore, the regions were interspersed with numerous transposable elements and genes coding for an extensin-like protein and an amino acid transporter. Comparative analysis revealed a major lack of gene order conservation in the sequences of the three closely related haplotypes. Our data provide insight into the evolutionary mechanisms shaping the H1 locus and will facilitate the map-based cloning of the H1 resistance gene.
|