1
|
Assembly and characterization of the genome of chard (Beta vulgaris ssp. vulgaris var. cicla). J Biotechnol 2021; 333:67-76. [PMID: 33932500 DOI: 10.1016/j.jbiotec.2021.04.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/11/2021] [Revised: 04/09/2021] [Accepted: 04/25/2021] [Indexed: 10/21/2022]
Abstract
Chard (Beta vulgaris ssp. vulgaris var. cicla) is a member of one of four different cultigroups of beets. While the genome of sugar beet, the most prominent beet crop, has been studied extensively, molecular data on other beet cultivars is scant. Here, we present a genome assembly of chard, a vegetable crop grown for its fleshy leaves. We report a de novo genome assembly of 604 Mbp, slightly larger than sugar beet assemblies presented so far. About 57 % of the assembly was annotated as repetitive sequence, of which LTR retrotransposons were the most abundant. Based on the presence of conserved genes, the chard assembly was estimated to be at least 96 % complete regarding its gene space. We predicted 34,521 genes of which 27,582 genes were supported by evidence from transcriptomic sequencing reads, and 5503 of the evidence-supported genes had multiple isoforms. We compared the chard gene set with gene sets from sugar beet and two wild beets (i.e. Beta vulgaris ssp. maritima and Beta patula) to find orthology relationships and identified genome-wide syntenic regions between chard and sugar beet. Lastly, we determined genomic variants that distinguish sugar beet and chard. Assessing the variation distribution along the chard chromosomes, we found extensive haplotype sharing between the two cultivars. In summary, our work provides a foundation for the molecular analysis of Beta vulgaris cultigroups as a basis for chard genomics and to unravel the domestication history of beet crops.
|
2
|
The genome of Ectocarpus subulatus - A highly stress-tolerant brown alga. Mar Genomics 2020; 52:100740. [PMID: 31937506 DOI: 10.1016/j.margen.2020.100740] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 12/11/2019] [Accepted: 01/01/2020] [Indexed: 11/20/2022]
Abstract
Brown algae are multicellular photosynthetic stramenopiles that colonize marine rocky shores worldwide. Ectocarpus sp. Ec32 has been established as a genomic model for brown algae. Here we present the genome and metabolic network of the closely related species, Ectocarpus subulatus Kützing, which is characterized by high abiotic stress tolerance. Since their separation, both strains show new traces of viral sequences and the activity of large retrotransposons, which may also be related to the expansion of a family of chlorophyll-binding proteins. Further features suspected to contribute to stress tolerance include an expanded family of heat shock proteins, the reduction of genes involved in the production of halogenated defence compounds, and the presence of fewer cell wall polysaccharide-modifying enzymes. Overall, E. subulatus has mainly lost members of gene families down-regulated in low salinities, and conserved those that were up-regulated in the same condition. However, 96% of genes that differed between the two examined Ectocarpus species, as well as all genes under positive selection, were found to encode proteins of unknown function. This underlines the uniqueness of brown algal stress tolerance mechanisms as well as the significance of establishing E. subulatus as a comparative model for future functional studies.
|
3
|
Fatal perinatal mitochondrial cardiac failure caused by recurrent de novo duplications in the ATAD3 locus. MED 2020; 2:49-73. [PMID: 33575671 DOI: 10.1016/j.medj.2020.06.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/12/2022]
Abstract
Background In about half of all patients with a suspected monogenic disease, genomic investigations fail to identify the diagnosis. A contributing factor is the difficulty with repetitive regions of the genome, such as those generated by segmental duplications. The ATAD3 locus is one such region, in which recessive deletions and dominant duplications have recently been reported to cause lethal perinatal mitochondrial diseases characterized by pontocerebellar hypoplasia or cardiomyopathy, respectively. Methods Whole exome, whole genome and long-read DNA sequencing techniques combined with studies of RNA and quantitative proteomics were used to investigate 17 subjects from 16 unrelated families with suspected mitochondrial disease. Findings We report six different de novo duplications in the ATAD3 gene locus causing a distinctive presentation including lethal perinatal cardiomyopathy, persistent hyperlactacidemia, and frequently corneal clouding or cataracts and encephalopathy. The recurrent 68 Kb ATAD3 duplications are identifiable from genome and exome sequencing but usually missed by microarrays. The ATAD3 duplications result in the formation of identical chimeric ATAD3A/ATAD3C proteins, altered ATAD3 complexes and a striking reduction in mitochondrial oxidative phosphorylation complex I and its activity in heart tissue. Conclusions ATAD3 duplications appear to act in a dominant-negative manner and the de novo inheritance implies a low recurrence risk for families, unlike most pediatric mitochondrial diseases. More than 350 genes underlie mitochondrial diseases. In our experience the ATAD3 locus is now one of the five most common causes of nuclear-encoded pediatric mitochondrial disease but the repetitive nature of the locus means ATAD3 diagnoses may be frequently missed by current genomic strategies. Funding Australian NHMRC, US Department of Defense, Japanese AMED and JSPS agencies, Australian Genomics Health Alliance and Australian Mito Foundation.
|
4
|
Genomes of the wild beets Beta patula and Beta vulgaris ssp. maritima. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 99:1242-1253. [PMID: 31104348 PMCID: PMC9546096 DOI: 10.1111/tpj.14413] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/01/2019] [Revised: 04/23/2019] [Accepted: 05/02/2019] [Indexed: 05/04/2023]
Abstract
We present draft genome assemblies of Beta patula, a critically endangered wild beet endemic to the Madeira archipelago, and of the closely related Beta vulgaris ssp. maritima (sea beet). Evidence-based reference gene sets for B. patula and sea beet were generated, consisting of 25 127 and 27 662 genes, respectively. The genomes and gene sets of the two wild beets were compared with their cultivated sister taxon B. vulgaris ssp. vulgaris (sugar beet). Large syntenic regions were identified, and a display tool for automatic genome-wide synteny image generation was developed. Phylogenetic analysis based on 9861 genes showing 1:1:1 orthology supported the close relationship of B. patula to sea beet and sugar beet. A comparative analysis of the Rz2 locus, responsible for rhizomania resistance, suggested that the sequenced B. patula accession was rhizomania susceptible. Reference karyotypes for the two wild beets were established, and genomic rearrangements were detected. We consider our data as highly valuable and comprehensive resources for wild beet studies, B. patula conservation management, and sugar beet breeding research.
|
5
|
Whole Genome Sequencing Improves Outcomes of Genetic Testing in Patients With Hypertrophic Cardiomyopathy. J Am Coll Cardiol 2018; 72:419-429. [DOI: 10.1016/j.jacc.2018.04.078] [Citation(s) in RCA: 109] [Impact Index Per Article: 18.2] [Received: 12/20/2017] [Revised: 03/18/2018] [Accepted: 04/24/2018] [Indexed: 11/24/2022]
|
6
|
Defining the genetic basis of early onset hereditary spastic paraplegia using whole genome sequencing. Neurogenetics 2016; 17:265-270. [PMID: 27679996 PMCID: PMC5061846 DOI: 10.1007/s10048-016-0495-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 07/04/2016] [Accepted: 09/12/2016] [Indexed: 12/20/2022]
Abstract
We performed whole genome sequencing (WGS) in nine families from India with early-onset hereditary spastic paraplegia (HSP). We obtained a genetic diagnosis in 4/9 (44 %) families within known HSP genes (DDHD2 and CYP2U1), as well as peroxisomal biogenesis disorders (PEX16) and GM1 gangliosidosis (GLB1). In the remaining patients, no candidate structural variants, copy number variants or predicted splice variants affecting an extended candidate gene list were identified. Our findings demonstrate the efficacy of using WGS for diagnosing early-onset HSP, particularly in consanguineous families (4/6 diagnosed), highlighting that two of the diagnoses would not have been made using a targeted approach.
|
7
|
Genome and transcriptome analysis of the Mesoamerican common bean and the role of gene duplications in establishing tissue and temporal specialization of genes. Genome Biol 2016; 17:32. [PMID: 26911872 PMCID: PMC4766624 DOI: 10.1186/s13059-016-0883-6] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 08/22/2015] [Accepted: 01/22/2016] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Legumes are the third largest family of angiosperms and the second most important crop class. Legume genomes have been shaped by extensive large-scale gene duplications, including an approximately 58 million year old whole genome duplication shared by most crop legumes. RESULTS We report the genome and the transcription atlas of coding and non-coding genes of a Mesoamerican genotype of common bean (Phaseolus vulgaris L., BAT93). Using a comprehensive phylogenomics analysis, we assessed the past and recent evolution of common bean, and traced the diversification of patterns of gene expression following duplication. We find that successive rounds of gene duplications in legumes have shaped tissue and developmental expression, leading to increased levels of specialization in larger gene families. We also find that many long non-coding RNAs are preferentially expressed in germ-line-related tissues (pods and seeds), suggesting that they play a significant role in fruit development. Our results also suggest that most bean-specific gene family expansions, including resistance gene clusters, predate the split of the Mesoamerican and Andean gene pools. CONCLUSIONS The genome and transcriptome data herein generated for a Mesoamerican genotype represent a counterpart to the genomic resources already available for the Andean gene pool. Altogether, this information will allow the genetic dissection of the characters involved in the domestication and adaptation of the crop, and their further implementation in breeding strategies for this important crop.
|
8
|
Exploiting single-molecule transcript sequencing for eukaryotic gene prediction. Genome Biol 2015; 16:184. [PMID: 26328666 PMCID: PMC4556409 DOI: 10.1186/s13059-015-0729-7] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Received: 03/01/2015] [Accepted: 07/22/2015] [Indexed: 12/20/2022] Open
Abstract
We develop a method to predict and validate gene models using PacBio single-molecule, real-time (SMRT) cDNA reads. Ninety-eight percent of full-insert SMRT reads span complete open reading frames. Gene model validation using SMRT reads is developed as an automated process. Optimized training and prediction settings and mRNA-seq noise reduction of assisting Illumina reads result in increased gene prediction sensitivity and precision. Additionally, we present an improved gene set for sugar beet (Beta vulgaris) and the first genome-wide gene set for spinach (Spinacia oleracea). The workflow and guidelines are a valuable resource to obtain comprehensive gene sets for newly sequenced genomes of non-model eukaryotes.
|
9
|
Abstract
Background Anaplastic lymphoma kinase (ALK) genomic alterations have emerged as a potent predictor of benefit from treatment with ALK inhibitors in several cancers. Currently, there is no information about ALK gene alterations in urothelial carcinoma (UC) and its correlation with clinical or pathologic features and outcome. Methods Samples from patients with advanced UC and correlative clinical data were collected. Genomic imbalances were investigated by array comparative genomic hybridization (aCGH). ALK gene status was evaluated by fluorescence in situ hybridization (FISH). ALK expression was assessed by immunohistochemistry (IHC) and high-throughput mutation analysis with Oncomap 3 platform. Next generation sequencing was performed using Illumina Genome Analyzer IIx, and Illumina HiSeq 2000 in the FISH positive case. Results 70 of 96 patients had tissue available for all the tests performed. Arm level copy number gains at chromosome 2 were identified in 17 (24%) patients. Minor copy number alterations (CNAs) in the proximity of ALK locus were found in 3 patients by aCGH. By FISH analysis, one of these samples had a deletion of the 5′ALK. Whole genome next generation sequencing was inconclusive to confirm the deletion at the level of the ALK gene at the coverage level used. We did not observe an association between ALK CNA and overall survival, ECOG PS, or development of visceral disease. Conclusions ALK genomic alterations are rare and probably without prognostic implications in UC. The potential for testing ALK inhibitors in UC merits further investigation but might be restricted to the identification of an enriched population.
|
10
|
Profiling of extensively diversified plant LINEs reveals distinct plant-specific subclades. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2014; 79:385-97. [PMID: 24862340 DOI: 10.1111/tpj.12565] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/27/2014] [Revised: 05/12/2014] [Accepted: 05/15/2014] [Indexed: 05/03/2023]
Abstract
A large fraction of eukaryotic genomes is made up of long interspersed nuclear elements (LINEs). Due to their capability to create novel copies via error-prone reverse transcription, they generate multiple families and reach high copy numbers. Although mammalian LINEs have been well described, plant LINEs have been only poorly investigated. Here, we present a systematic cross-species survey of LINEs in higher plant genomes shedding light on plant LINE evolution as well as diversity, and facilitating their annotation in genome projects. Applying a Hidden Markov Model (HMM)-based analysis, 59 390 intact LINE reverse transcriptases (RTs) were extracted from 23 plant genomes. These fall in only two out of 28 LINE clades (L1 and RTE) known in eukaryotes. While plant RTE LINEs are highly homogenous and mostly constitute only a single family per genome, plant L1 LINEs are extremely diverse and form numerous families. Despite their heterogeneity, all members across the 23 species fall into only seven L1 subclades, some of them defined here. Exemplarily focusing on the L1 LINEs of a basal reference plant genome (Beta vulgaris), we show that the subclade classification level does not only reflect RT sequence similarity, but also mirrors structural aspects of complete LINE retrotransposons, like element size, position and type of encoded enzymatic domains. Our comprehensive catalogue of plant LINE RTs serves the classification of highly diverse plant LINEs, while the provided subclade-specific HMMs facilitate their annotation.
|
11
|
Cytosine methylation of an ancient satellite family in the wild beet Beta procumbens. Cytogenet Genome Res 2014; 143:157-67. [PMID: 24994030 DOI: 10.1159/000363485] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022] Open
Abstract
DNA methylation is an essential epigenetic feature for the regulation and maintenance of heterochromatin. Satellite DNA is a repetitive sequence component that often occurs in large arrays in heterochromatin of subtelomeric, intercalary and centromeric regions. Knowledge about the methylation status of satellite DNA is important for understanding the role of repetitive DNA in heterochromatization. In this study, we investigated the cytosine methylation of the ancient satellite family pEV in the wild beet Beta procumbens. The pEV satellite is widespread in species-specific pEV subfamilies in the genus Beta and most likely originated before the radiation of the Betoideae and Chenopodioideae. In B. procumbens, the pEV subfamily occurs abundantly and spans intercalary and centromeric regions. To uncover its cytosine methylation, we performed chromosome-wide immunostaining and bisulfite sequencing of pEV satellite repeats. We found that CG and CHG sites are highly methylated while CHH sites show only low levels of methylation. As a consequence of the low frequency of CG and CHG sites and the preferential occurrence of most cytosines in the CHH motif in pEV monomers, this satellite family displays only low levels of total cytosine methylation.
|
12
|
The CHH motif in sugar beet satellite DNA: a modulator for cytosine methylation. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2014; 78:937-50. [PMID: 24661787 DOI: 10.1111/tpj.12519] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/22/2013] [Revised: 03/17/2014] [Accepted: 03/18/2014] [Indexed: 05/03/2023]
Abstract
Methylation of DNA is important for the epigenetic silencing of repetitive DNA in plant genomes. Knowledge about the cytosine methylation status of satellite DNAs, a major class of repetitive DNA, is scarce. One reason for this is that arrays of tandemly arranged sequences are usually collapsed in next-generation sequencing assemblies. We applied strategies to overcome this limitation and quantified the level of cytosine methylation and its pattern in three satellite families of sugar beet (Beta vulgaris) which differ in their abundance, chromosomal localization and monomer size. We visualized methylation levels along pachytene chromosomes with respect to small satellite loci at maximum resolution using chromosome-wide fluorescent in situ hybridization complemented with immunostaining and super-resolution microscopy. Only reduced methylation of many satellite arrays was obtained. To investigate methylation at the nucleotide level we performed bisulfite sequencing of 1569 satellite sequences. We found that the level of methylation of cytosine strongly depends on the sequence context: cytosines in the CHH motif show lower methylation (44-52%), while CG and CHG motifs are more strongly methylated. This affects the overall methylation of satellite sequences because CHH occurs frequently while CG and CHG are rare or even absent in the satellite arrays investigated. Evidently, CHH is the major target for modulation of the cytosine methylation level of adjacent monomers within individual arrays and contributes to their epigenetic function. This strongly indicates that asymmetric cytosine methylation plays a role in the epigenetic modification of satellite repeats in plant genomes.
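The sequence contexts named in this abstract (CG, CHG and CHH, where H stands for A, C or T) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names and the example sequence are hypothetical, and only the strand passed in is scanned.

```python
# Illustrative sketch only: classify each cytosine on one strand by its
# sequence context. H means any base other than G (A, C or T).
H_BASES = set("ACT")


def classify_cytosine(seq, i):
    """Return 'CG', 'CHG', 'CHH', or None if the context cannot be determined."""
    if seq[i] != "C":
        return None
    first = seq[i + 1:i + 2]   # base directly after the cytosine ('' at the end)
    second = seq[i + 2:i + 3]  # base two positions after the cytosine
    if first == "G":
        return "CG"
    if first in H_BASES and second == "G":
        return "CHG"
    if first in H_BASES and second in H_BASES:
        return "CHH"
    return None  # too close to the sequence end, or an ambiguous base such as N


def count_contexts(seq):
    """Count cytosines per context in an uppercase-normalized DNA string."""
    seq = seq.upper()
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for i in range(len(seq)):
        context = classify_cytosine(seq, i)
        if context is not None:
            counts[context] += 1
    return counts


example_counts = count_contexts("CGCAGCTTC")
```

In the hypothetical sequence `CGCAGCTTC`, the first cytosine is in CG context, the second in CHG, the third in CHH, and the final cytosine is skipped because its context runs off the end of the sequence. A satellite monomer rich in CHH and poor in CG and CHG would show the skew the abstract describes.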
|
13
|
Differential expression patterns of non-symbiotic hemoglobins in sugar beet (Beta vulgaris ssp. vulgaris). PLANT & CELL PHYSIOLOGY 2014; 55:834-44. [PMID: 24486763 DOI: 10.1093/pcp/pcu027] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 05/08/2023]
Abstract
Biennial sugar beet (Beta vulgaris ssp. vulgaris) is a Caryophyllidae that has adapted its growth cycle to the seasonal temperature and daylength variation of temperate regions. This is the first time a holistic study of the expression pattern of non-symbiotic hemoglobins (nsHbs) is being carried out in a member of this group and under two essential environmental conditions for flowering, namely vernalization and length of photoperiod. BvHb genes were identified by sequence homology searches against the latest draft of the sugar beet genome. Three nsHb genes (BvHb1.1, BvHb1.2 and BvHb2) and one truncated Hb gene (BvHb3) were found in the genome of sugar beet. Gene expression profiling of the nsHb genes was carried out by quantitative PCR in different organs and developmental stages, as well as during vernalization and under different photoperiods. BvHb1.1 and BvHb2 showed differential expression during vernalization as well as during long and short days. The high expression of BvHb2 indicates that it has an active role in the cell, maybe even taking over some BvHb1.2 functions, except during germination where BvHb1.2 together with BvHb1.1 (both Class 1 nsHbs) are highly expressed. The unprecedented finding of a leader peptide at the N-terminus of BvHb1.1, for the first time in an nsHb from higher plants, together with its observed expression indicates that it may have a very specific role due to its suggested location in chloroplasts. Our findings open up new possibilities for research, breeding and engineering since Hbs could be more involved in plant development than was previously anticipated.
|
14
|
The genome of the recently domesticated crop plant sugar beet (Beta vulgaris). Nature 2013; 505:546-9. [PMID: 24352233 DOI: 10.1038/nature12817] [Citation(s) in RCA: 326] [Impact Index Per Article: 29.6] [Received: 03/03/2013] [Accepted: 10/29/2013] [Indexed: 01/25/2023]
Abstract
Sugar beet (Beta vulgaris ssp. vulgaris) is an important crop of temperate climates which provides nearly 30% of the world's annual sugar production and is a source for bioethanol and animal feed. The species belongs to the order of Caryophyllales, is diploid with 2n = 18 chromosomes, has an estimated genome size of 714-758 megabases and shares an ancient genome triplication with other eudicot plants. Leafy beets have been cultivated since Roman times, but sugar beet is one of the most recently domesticated crops. It arose in the late eighteenth century when lines accumulating sugar in the storage root were selected from crosses made with chard and fodder beet. Here we present a reference genome sequence for sugar beet as the first non-rosid, non-asterid eudicot genome, advancing comparative genomics and phylogenetic reconstructions. The genome sequence comprises 567 megabases, of which 85% could be assigned to chromosomes. The assembly covers a large proportion of the repetitive sequence content that was estimated to be 63%. We predicted 27,421 protein-coding genes supported by transcript data and annotated them on the basis of sequence homology. Phylogenetic analyses provided evidence for the separation of Caryophyllales before the split of asterids and rosids, and revealed lineage-specific gene family expansions and losses. We sequenced spinach (Spinacia oleracea), another Caryophyllales species, and validated features that separate this clade from rosids and asterids. Intraspecific genomic variation was analysed based on the genome sequences of sea beet (Beta vulgaris ssp. maritima; progenitor of all beet crops) and four additional sugar beet accessions. We identified seven million variant positions in the reference genome, and also large regions of low variability, indicating artificial selection. The sugar beet genome sequence enables the identification of genes affecting agronomically relevant traits, supports molecular breeding and maximizes the plant's potential in energy biotechnology.
|
15
|
Highly diverse chromoviruses of Beta vulgaris are classified by chromodomains and chromosomal integration. Mob DNA 2013; 4:8. [PMID: 23448600 PMCID: PMC3605345 DOI: 10.1186/1759-8753-4-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 09/11/2012] [Accepted: 01/22/2013] [Indexed: 12/25/2022] Open
Abstract
Background Chromoviruses are one of the three genera of Ty3-gypsy long terminal repeat (LTR) retrotransposons, and are present in high copy numbers in plant genomes. They are widely distributed within the plant kingdom, with representatives even in lower plants such as green and red algae. Their hallmark is the presence of a chromodomain at the C-terminus of the integrase. The chromodomain exhibits structural characteristics similar to proteins of the heterochromatin protein 1 (HP1) family, which mediate the binding of each chromovirus type to specific histone variants. A specific integration via the chromodomain has been shown for only a few chromoviruses. However, a detailed study of different chromoviral clades populating a single plant genome has not yet been carried out. Results We conducted a comprehensive survey of chromoviruses within the Beta vulgaris (sugar beet) genome, and found a highly diverse chromovirus population, with significant differences in element size, primarily caused by their flanking LTRs. In total, we identified and annotated full-length members of 16 families belonging to the four plant chromoviral clades: CRM, Tekay, Reina, and Galadriel. The families within each clade are structurally highly conserved; in particular, the position of the chromodomain coding region relative to the polypurine tract is clade-specific. Two distinct groups of chromodomains were identified. The group II chromodomain was present in three chromoviral clades, whereas families of the CRM clade contained a more divergent motif. Physical mapping using representatives of all four clades identified a clade-specific integration pattern. For some chromoviral families, we detected the presence of expressed sequence tags, indicating transcriptional activity. Conclusions We present a detailed study of chromoviruses, belonging to the four major clades, which populate a single plant genome. Our results illustrate the diversity and family structure of B. vulgaris chromoviruses, and emphasize the role of chromodomains in the targeted integration of these viruses. We suggest that the diverse sets of plant chromoviruses with their different localization patterns might help to facilitate plant-genome organization in a structural and functional manner.
|
16
|
Evolutionary reshuffling in the Errantivirus lineage Elbe within the Beta vulgaris genome. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2012; 72:636-51. [PMID: 22804913 DOI: 10.1111/j.1365-313x.2012.05107.x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 05/03/2023]
Abstract
LTR retrotransposons and retroviruses are closely related. Although a viral envelope gene is found in some LTR retrotransposons and all retroviruses, only the latter show infectivity. The identification of Ty3-gypsy-like retrotransposons possessing putative envelope-like open reading frames blurred the taxonomical borders and led to the establishment of the Errantivirus, Metavirus and Chromovirus genera within the Metaviridae. Only a few plant Errantiviruses have been described, and their evolutionary history is not well understood. In this study, we investigated 27 retroelements of four abundant Elbe retrotransposon families belonging to the Errantiviruses in Beta vulgaris (sugar beet). Retroelements of the Elbe lineage integrated between 0.02 and 5.59 million years ago, and show family-specific variations in autonomy and degree of rearrangements: while Elbe3 members are highly fragmented, often truncated and present in a high number of solo LTRs, Elbe2 members are mainly autonomous. We observed extensive reshuffling of structural motifs across families, leading to the formation of new retrotransposon families. Elbe retrotransposons harbor a typical envelope-like gene, often encoding transmembrane domains. During the course of Elbe evolution, the additional open reading frames have been strongly modified or independently acquired. Taken together, the Elbe lineage serves as retrotransposon model reflecting the various stages in Errantivirus evolution, and allows a detailed analysis of retrotransposon family formation.
|
17
|
Survey of sugar beet (Beta vulgaris L.) hAT transposons and MITE-like hATpin derivatives. PLANT MOLECULAR BIOLOGY 2012; 78:393-405. [PMID: 22246381 DOI: 10.1007/s11103-011-9872-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 08/31/2011] [Accepted: 12/20/2011] [Indexed: 05/03/2023]
Abstract
Genome-wide analyses of repetitive DNA suggest a significant impact particularly of transposable elements on genome size and evolution of virtually all eukaryotic organisms. In this study, we analyzed the abundance and diversity of the hAT transposon superfamily of the sugar beet (B. vulgaris) genome, using molecular, bioinformatic and cytogenetic approaches. We identified 81 transposase-coding sequences, three of which are part of structurally intact but nonfunctional hAT transposons (BvhAT), in a B. vulgaris BAC library as well as in whole genome sequencing-derived data sets. Additionally, 116 complete and 497 truncated non-autonomous BvhAT derivatives lacking the transposase gene were in silico-detected. The 116 complete derivatives were subdivided into four BvhATpin groups each characterized by a distinct terminal inverted repeat motif. Both BvhAT and BvhATpin transposons are specific for species of the genus Beta and closely related species, showing a localization on B. vulgaris chromosomes predominantly in euchromatic regions. The lack of any BvhAT transposase function together with the high degree of degeneration observed for the BvhAT and the BvhATpin genomic fraction contrasts with the abundance and activity of autonomous and non-autonomous hAT transposons revealed in other plant species. This indicates a possible genus-specific structural and functional repression of the hAT transposon superfamily during Beta diversification and evolution.
|
18
|
Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and genome analyzer systems. Genome Biol 2011; 12:R112. [PMID: 22067484 PMCID: PMC3334598 DOI: 10.1186/gb-2011-12-11-r112] [Citation(s) in RCA: 385] [Impact Index Per Article: 29.6] [Received: 07/16/2011] [Revised: 10/21/2011] [Accepted: 11/08/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The generation and analysis of high-throughput sequencing data are becoming a major component of many studies in molecular biology and medical research. Illumina's Genome Analyzer (GA) and HiSeq instruments are currently the most widely used sequencing devices. Here, we comprehensively evaluate properties of genomic HiSeq and GAIIx data derived from two plant genomes and one virus, with read lengths of 95 to 150 bases. RESULTS We provide quantifications and evidence for GC bias, error rates, error sequence context, effects of quality filtering, and the reliability of quality values. By combining different filtering criteria we reduced error rates 7-fold at the expense of discarding 12.5% of alignable bases. While overall error rates are low in HiSeq data we observed regions of accumulated wrong base calls. Only 3% of all error positions accounted for 24.7% of all substitution errors. Analyzing the forward and reverse strands separately revealed error rates of up to 18.7%. Insertions and deletions occurred at very low rates on average but increased to up to 2% in homopolymers. A positive correlation between read coverage and GC content was found depending on the GC content range. CONCLUSIONS The errors and biases we report have implications for the use and the interpretation of Illumina sequencing data. GAIIx and HiSeq data sets show slightly different error profiles. Quality filtering is essential to minimize downstream analysis artifacts. Supporting previous recommendations, the strand-specificity provides a criterion to distinguish sequencing errors from low abundance polymorphisms.
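The quality filtering discussed in this abstract rests on Phred quality values, where a score Q corresponds to an estimated error probability of 10^(-Q/10). The sketch below shows one simple way such a filter could look. It is an assumption-laden illustration, not the filtering criteria used in the study: the Phred+33 encoding, the threshold of Q20, the 10% tolerance and all function names are chosen here for demonstration.

```python
# Illustrative sketch only: a simple per-read quality filter based on Phred
# scores. The thresholds and the Phred+33 offset are assumptions, not the
# criteria reported in the paper.


def phred_to_error_prob(q):
    """Convert a Phred quality score to the estimated base-call error probability."""
    return 10 ** (-q / 10)


def decode_quality(qual, offset=33):
    """Decode an ASCII-encoded FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in qual]


def passes_filter(qual, min_q=20, max_low_fraction=0.1):
    """Keep a read if at most max_low_fraction of its bases fall below min_q."""
    scores = decode_quality(qual)
    if not scores:
        return False  # an empty read carries no usable bases
    low = sum(1 for q in scores if q < min_q)
    return low / len(scores) <= max_low_fraction


good_read = "IIIIIIIIII"   # ten bases at Q40 under Phred+33
poor_read = "!!!!!IIIII"   # five bases at Q0, five at Q40
kept = [passes_filter(good_read), passes_filter(poor_read)]
```

With these assumed settings the first read is kept and the second is discarded, because half of its bases fall below Q20. Stricter thresholds lower the residual error rate at the cost of discarding more alignable bases, which is the trade-off the abstract quantifies.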
|