1
Pellicer J, Fernández P, Fay MF, Michálková E, Leitch IJ. Genome Size Doubling Arises From the Differential Repetitive DNA Dynamics in the Genus Heloniopsis (Melanthiaceae). Front Genet 2021; 12:726211. [PMID: 34552621 PMCID: PMC8450539 DOI: 10.3389/fgene.2021.726211] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/16/2021] [Accepted: 08/19/2021] [Indexed: 12/23/2022]
Abstract
Plant genomes are highly diverse in size and repetitive DNA composition. In the absence of polyploidy, the dynamics of repetitive elements, which make up the bulk of the genome in many species, are the main drivers underpinning changes in genome size and the overall evolution of the genomic landscape. The advent of high-throughput sequencing technologies has enabled investigation of genome evolutionary dynamics beyond model plants to provide exciting new insights in species across the biodiversity of life. Here we analyze the evolution of repetitive DNA in two closely related species of Heloniopsis (Melanthiaceae), which despite having the same chromosome number differ nearly twofold in genome size [i.e., H. umbellata (1C = 4,680 Mb), and H. koreana (1C = 2,480 Mb)]. Low-coverage genome skimming and the RepeatExplorer2 pipeline were used to identify the main repeat families responsible for the significant differences in genome sizes. Patterns of repeat evolution were found to correlate with genome size with the main classes of transposable elements identified being twice as abundant in the larger genome of H. umbellata compared with H. koreana. In addition, among the satellite DNA families recovered, a single shared satellite (HeloSAT) was shown to have contributed significantly to the genome expansion of H. umbellata. Evolutionary changes in repetitive DNA composition and genome size indicate that the differences in genome size between these species have been underpinned by the activity of several distinct repeat lineages.
Affiliation(s)
- Jaume Pellicer: Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain; Royal Botanic Gardens, Kew, Richmond, United Kingdom
- Pol Fernández: Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona), Barcelona, Spain
- Michael F Fay: Royal Botanic Gardens, Kew, Richmond, United Kingdom; School of Plant Biology, University of Western Australia, Crawley, WA, Australia
- Ilia J Leitch: Royal Botanic Gardens, Kew, Richmond, United Kingdom
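The "nearly twofold" difference reported in the Heloniopsis abstract can be checked directly from the two 1C values it quotes. The snippet below is a minimal sketch of that arithmetic only; the variable names are illustrative and not taken from the paper.

```python
# 1C genome sizes quoted in the abstract, in megabases (Mb).
H_UMBELLATA_1C_MB = 4680
H_KOREANA_1C_MB = 2480

# Fold difference between the larger and the smaller genome.
ratio = H_UMBELLATA_1C_MB / H_KOREANA_1C_MB

# Absolute amount of additional DNA in the larger genome.
extra_mb = H_UMBELLATA_1C_MB - H_KOREANA_1C_MB

print(f"{ratio:.2f}-fold difference, {extra_mb} Mb of additional DNA")
```

A ratio just under 2 with an unchanged chromosome number is consistent with the abstract's framing of repeat dynamics, rather than polyploidy, as the explanation.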
2
Garrido-Ramos MA. The Genomics of Plant Satellite DNA. Prog Mol Subcell Biol 2021; 60:103-143. [PMID: 34386874 DOI: 10.1007/978-3-030-74889-0_5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 02/09/2023]
Abstract
The twenty-first century began with a certain indifference to the research of satellite DNA (satDNA). Neither genome sequencing projects were able to accurately encompass the study of satDNA nor classic methodologies were able to go further in undertaking a better comprehensive study of the whole set of satDNA sequences of a genome. Nonetheless, knowledge of satDNA has progressively advanced during this century with the advent of new analytical techniques. The enormous advantages that genome-wide approaches have brought to its analysis have now stimulated a renewed interest in the study of satDNA. At this point, we can look back and try to assess more accurately many of the key questions that were left unsolved in the past about this enigmatic and important component of the genome. I review here the understanding gathered on plant satDNAs over the last few decades with an eye on the near future.
3
Jiang W, Jiang C, Yuan W, Zhang M, Fang Z, Li Y, Li G, Jia J, Yang Z. A universal karyotypic system for hexaploid and diploid Avena species brings oat cytogenetics into the genomics era. BMC Plant Biol 2021; 21:213. [PMID: 33980176 PMCID: PMC8114715 DOI: 10.1186/s12870-021-02999-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/11/2021] [Accepted: 04/28/2021] [Indexed: 05/04/2023]
Abstract
BACKGROUND The identification of chromosomes among Avena species have been studied by C-banding and in situ hybridization. However, the complicated results from several cytogenetic nomenclatures for identifying oat chromosomes are often contradictory. A universal karyotyping nomenclature system for precise chromosome identification and comparative evolutionary studies would be essential for genus Avena based on the recently released genome sequences of hexaploid and diploid Avena species. RESULTS Tandem repetitive sequences were predicted and physically located on chromosomal regions of the released Avena sativa OT3098 genome assembly v1. Eight new oligonucleotide (oligo) probes for sequential fluorescence in situ hybridization (FISH) were designed and then applied for chromosome karyotyping on mitotic metaphase spreads of A. brevis, A. nuda, A. wiestii, A. ventricosa, A. fatua, and A. sativa species. We established a high-resolution standard karyotype of A. sativa based on the distinct FISH signals of multiple oligo probes. FISH painting with bulked oligos, based on wheat-barley collinear regions, was used to validate the linkage group assignment for individual A. sativa chromosomes. We integrated our new Oligo-FISH based karyotype system with earlier karyotype nomenclatures through sequential C-banding and FISH methods, then subsequently determined the precise breakage points of some chromosome translocations in A. sativa. CONCLUSIONS This new universal chromosome identification system will be a powerful tool for describing the genetic diversity, chromosomal rearrangements and evolutionary relationships among Avena species by comparative cytogenetic and genomic approaches.
Affiliation(s)
- Wenxi Jiang: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Chengzhi Jiang: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Weiguang Yuan: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Meijun Zhang: College of Agronomy, Shanxi Agricultural University, 030801, Taigu, China
- Zijie Fang: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Yang Li: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Guangrong Li: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Juqing Jia: College of Agronomy, Shanxi Agricultural University, 030801, Taigu, China
- Zujun Yang: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, 611731, Chengdu, China
4
Satellitome Analysis in the Ladybird Beetle Hippodamia variegata (Coleoptera, Coccinellidae). Genes (Basel) 2020; 11:783. [PMID: 32668664 PMCID: PMC7397073 DOI: 10.3390/genes11070783] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/18/2020] [Revised: 07/09/2020] [Accepted: 07/09/2020] [Indexed: 12/29/2022]
Abstract
Hippodamia variegata is one of the most commercialized ladybirds used for the biological control of aphid pest species in many economically important crops. This species is the first Coccinellidae whose satellitome has been studied by applying new sequencing technologies and bioinformatics tools. We found that 47% of the H. variegata genome is composed of repeated sequences. We identified 30 satellite DNA (satDNA) families with a median intragenomic divergence of 5.75% and A+T content between 45.6% and 74.7%. This species shows satDNA families with highly variable sizes although the most common size is 100–200 bp. However, we highlight the existence of a satDNA family with a repeat unit of 2 kb, the largest repeat unit described in Coleoptera. PCR amplifications for fluorescence in situ hybridization (FISH) probe generation were performed for the four most abundant satDNA families. FISH with the most abundant satDNA family as a probe shows its pericentromeric location on all chromosomes. This location is coincident with the heterochromatin revealed by C-banding and DAPI staining, also analyzed in this work. Hybridization signals for other satDNA families were located only on certain bivalents and the X chromosome. These satDNAs could be very useful as chromosomal markers due to their reduced location.
5
McCann J, Macas J, Novák P, Stuessy TF, Villaseñor JL, Weiss-Schneeweiss H. Differential Genome Size and Repetitive DNA Evolution in Diploid Species of Melampodium sect. Melampodium (Asteraceae). Front Plant Sci 2020; 11:362. [PMID: 32296454 PMCID: PMC7136903 DOI: 10.3389/fpls.2020.00362] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/31/2020] [Accepted: 03/12/2020] [Indexed: 05/18/2023]
Abstract
Plant genomes vary greatly in composition and size mainly due to the diversity of repetitive DNAs and the inherent propensity for their amplification and removal from the host genome. Most studies addressing repeatome dynamics focus on model organisms, whereas few provide comprehensive investigations across the genomes of related taxa. Herein, we analyze the evolution of repeats of the 13 species in Melampodium sect. Melampodium, representing all but two of its diploid taxa, in a phylogenetic context. The investigated genomes range in size from 0.49 to 2.27 pg/1C (ca. 4.5-fold variation), despite having the same base chromosome number (x = 10) and very strong phylogenetic affinities. Phylogenetic analysis performed in BEAST and ancestral genome size reconstruction revealed mixed patterns of genome size increases and decreases across the group. High-throughput genome skimming and the RepeatExplorer pipeline were utilized to determine the repeat families responsible for the differences in observed genome sizes. Patterns of repeat evolution were found to be highly correlated with phylogenetic position, namely taxonomic series circumscription. Major differences found were in the abundances of the SIRE (Ty1-copia), Athila (Ty3-gypsy), and CACTA (DNA transposon) lineages. Additionally, several satellite DNA families were found to be highly group-specific, although their overall contribution to genome size variation was relatively small. Evolutionary changes in repetitive DNA composition and genome size were complex, with independent patterns of genome up- and downsizing throughout the evolution of the analyzed diploids. A model-based analysis of genome size and repetitive DNA composition revealed evidence for strong phylogenetic signal and differential evolutionary rates of major lineages of repeats in the diploid genomes.
Affiliation(s)
- Jamie McCann: Department of Botany and Biodiversity Research, University of Vienna, Vienna, Austria
- Jiří Macas: Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, České Budějovice, Czechia
- Petr Novák: Biology Centre, Czech Academy of Sciences, Institute of Plant Molecular Biology, České Budějovice, Czechia
- Tod F. Stuessy: Herbarium and Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, OH, United States
- Jose L. Villaseñor: Department of Botany, National Autonomous University of Mexico, Mexico City, Mexico
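The entries in this list report genome sizes in two units: picograms per 1C (Melampodium) and megabases (Heloniopsis, Avena). A minimal sketch of the conversion follows, assuming the commonly used factor of 1 pg ≈ 978 Mbp; that factor is not stated in the abstract itself, and the function name is illustrative.

```python
# Assumed conversion factor: 1 pg of DNA corresponds to roughly 978 Mbp.
MBP_PER_PG = 978.0

def pg_to_mbp(picograms: float) -> float:
    """Convert a 1C DNA amount in picograms to megabase pairs."""
    return picograms * MBP_PER_PG

# Range of 1C values quoted for Melampodium sect. Melampodium.
smallest_pg, largest_pg = 0.49, 2.27

smallest_mbp = pg_to_mbp(smallest_pg)
largest_mbp = pg_to_mbp(largest_pg)
fold = largest_pg / smallest_pg

print(f"{smallest_mbp:.0f} Mbp to {largest_mbp:.0f} Mbp, {fold:.2f}-fold range")
```

Under this assumption the quoted range corresponds to roughly 0.48 to 2.2 Gbp, in line with the "ca. 4.5-fold" variation given in the abstract.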
6
Discovery of 33mer in chromosome 21 - the largest alpha satellite higher order repeat unit among all human somatic chromosomes. Sci Rep 2019; 9:12629. [PMID: 31477765 PMCID: PMC6718397 DOI: 10.1038/s41598-019-49022-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/25/2019] [Accepted: 08/13/2019] [Indexed: 11/10/2022]
Abstract
The centromere is important for segregation of chromosomes during cell division in eukaryotes. Its destabilization results in chromosomal missegregation, aneuploidy, hallmarks of cancers and birth defects. In primate genomes centromeres contain tandem repeats of ~171 bp alpha satellite DNA, commonly organized into higher order repeats (HORs). In spite of crucial importance, satellites have been understudied because of gaps in sequencing - genomic “black holes”. Bioinformatical studies of genomic sequences open possibilities to revolutionize understanding of repetitive DNA datasets. Here, using robust (Global Repeat Map) algorithm we identified in hg38 sequence of human chromosome 21 complete ensemble of alpha satellite HORs with six long repeat units (≥20 mers), five of them novel. Novel 33mer HOR has the longest HOR unit identified so far among all somatic chromosomes and novel 23mer reverse HOR is distant far from the centromere. Also, we discovered that for hg38 assembly the 33mer sequences in chromosomes 21, 13, 14, and 22 are 100% identical but nearby gaps are present; that seems to require an additional more precise sequencing. Chromosome 21 is of significant interest for deciphering the molecular base of Down syndrome and of aneuploidies in general. Since the chromosome identifier probes are largely based on the detection of higher order alpha satellite repeats, distinctions between alpha satellite HORs in chromosomes 21 and 13 here identified might lead to a unique chromosome 21 probe in molecular cytogenetics, which would find utility in diagnostics. It is expected that its complete sequence analysis will have profound implications for understanding pathogenesis of diseases and development of new therapeutic approaches.
7
Liu Q, Li X, Zhou X, Li M, Zhang F, Schwarzacher T, Heslop-Harrison JS. The repetitive DNA landscape in Avena (Poaceae): chromosome and genome evolution defined by major repeat classes in whole-genome sequence reads. BMC Plant Biol 2019; 19:226. [PMID: 31146681 PMCID: PMC6543597 DOI: 10.1186/s12870-019-1769-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/26/2018] [Accepted: 04/09/2019] [Indexed: 05/18/2023]
Abstract
BACKGROUND Repetitive DNA motifs - not coding genetic information and repeated millions to hundreds of times - make up the majority of many genomes. Here, we identify the nature, abundance and organization of all the repetitive DNA families in oats (Avena sativa, 2n = 6x = 42, AACCDD), a recognized health-food, and its wild relatives. RESULTS Whole-genome sequencing followed by k-mer and RepeatExplorer graph-based clustering analyses enabled assessment of repetitive DNA composition in common oat and its wild relatives' genomes. Fluorescence in situ hybridization (FISH)-based karyotypes are developed to understand chromosome and repetitive sequence evolution of common oat. We show that some 200 repeated DNA motifs make up 70% of the Avena genome, with less than 20 families making up 20% of the total. Retroelements represent the major component, with Ty3/Gypsy elements representing more than 40% of all the DNA, nearly three times more abundant than Ty1/Copia elements. DNA transposons are about 5% of the total, while tandemly repeated, satellite DNA sequences fit into 55 families and represent about 2% of the genome. The Avena species are monophyletic, but both bioinformatic comparisons of repeats in the different genomes, and in situ hybridization to metaphase chromosomes from the hexaploid species, shows that some repeat families are specific to individual genomes, or the A and D genomes together. Notably, there are terminal regions of many chromosomes showing different repeat families from the rest of the chromosome, suggesting presence of translocations between the genomes. CONCLUSIONS The relatively small number of repeat families shows there are evolutionary constraints on their nature and amplification, with mechanisms leading to homogenization, while repeat characterization is useful in providing genome markers and to assist with future assemblies of this large genome (c. 4100 Mb in the diploid). 
The frequency of inter-genomic translocations suggests optimum strategies to exploit genetic variation from diploid oats for improvement of the hexaploid may differ from those used widely in bread wheat.
Affiliation(s)
- Qing Liu: Key Laboratory of Plant Resources Conservation and Sustainable Utilization / Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
- Xiaoyu Li: Key Laboratory of Plant Resources Conservation and Sustainable Utilization / Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; University of Chinese Academy of Sciences, Beijing, China
- Xiangying Zhou: Key Laboratory of Plant Resources Conservation and Sustainable Utilization / Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; University of Chinese Academy of Sciences, Beijing, China
- Mingzhi Li: Genepioneer Biotechnologies Co. Ltd., Nanjing, China
- Fengjiao Zhang: Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Trude Schwarzacher: Key Laboratory of Plant Resources Conservation and Sustainable Utilization / Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Department of Genetics and Genome Biology, University of Leicester, Leicester, LE1 7RH, UK
- John Seymour Heslop-Harrison: Key Laboratory of Plant Resources Conservation and Sustainable Utilization / Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Department of Genetics and Genome Biology, University of Leicester, Leicester, LE1 7RH, UK
8
Wang GX, He QY, Zhao H, Cai ZX, Guo N, Zong M, Han S, Liu F, Jin WW. ChIP-cloning analysis uncovers centromere-specific retrotransposons in Brassica nigra and reveals their rapid diversification in Brassica allotetraploids. Chromosoma 2019; 128:119-131. [PMID: 30993455 DOI: 10.1007/s00412-019-00701-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/24/2018] [Revised: 03/14/2019] [Accepted: 03/20/2019] [Indexed: 01/12/2023]
Abstract
Centromeres are indispensable functional units of chromosomes. The evolutionary mechanisms underlying the rapid evolution of centromeric repeats, especially those following polyploidy, remain unknown. In this study, we isolated centromeric sequences of Brassica nigra, a model diploid progenitor (B genome) of the allopolyploid species B. juncea (AB genome) and B. carinata (BC genome) by chromatin immunoprecipitation of nucleosomes containing the centromere-specific histone CENH3. Sequence analysis detected no centromeric satellite DNAs, and most B. nigra centromeric repeats were found to originate from Ty1/copia-class retrotransposons. In cytological analyses, six of the seven analyzed repeat clusters had no FISH signals in A or C genomes of the related diploid species B. rapa and B. oleracea. Notably, five repeat clusters had FISH signals in both A and B subgenomes in the tetraploid B. juncea. In the tetraploid B. carinata, only CL23 displayed three pairs of signals in terminal or interstitial regions of the C-derived chromosome, and no evidence of colonization of CLs onto C-subgenome centromeres was found in B. carinata. This observation suggests that centromeric repeats spread and proliferated between genomes after polyploidization. CL3 and CRB are likely ancient centromeric sequences arising prior to the divergence of diploid Brassica which have detected signals across the genus. And in allotetraploids B. juncea and B. carinata, the FISH signal intensity of CL3 and CRB differed among subgenomes. We discussed possible mechanisms for centromeric repeat divergence during Brassica speciation and polyploid evolution, thus providing insights into centromeric repeat establishment and targeting.
Affiliation(s)
- Gui-Xiang Wang: Beijing Vegetable Research Center, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture, Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Qun-Yan He: National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193, China
- Hong Zhao: Beijing Vegetable Research Center, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture, Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Ze-Xi Cai: National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193, China
- Ning Guo: Beijing Vegetable Research Center, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture, Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Mei Zong: Beijing Vegetable Research Center, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture, Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Shuo Han: Beijing Vegetable Research Center, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture, Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Fan Liu: Beijing Vegetable Research Center, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Ministry of Agriculture, Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Wei-Wei Jin: National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193, China
9
Bolsheva NL, Melnikova NV, Kirov IV, Dmitriev AA, Krasnov GS, Amosova AV, Samatadze TE, Yurkevich OY, Zoshchuk SA, Kudryavtseva AV, Muravenko OV. Characterization of repeated DNA sequences in genomes of blue-flowered flax. BMC Evol Biol 2019; 19:49. [PMID: 30813893 PMCID: PMC6391757 DOI: 10.1186/s12862-019-1375-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 01/23/2023]
Abstract
BACKGROUND Members of different sections of the genus Linum are characterized by wide variability in size, morphology and number of chromosomes in karyotypes. Since such variability is determined mainly by the amount and composition of repeated sequences, we conducted a comparative study of the repeatomes of species from four sections forming a clade of blue-flowered flax. Based on the results of high-throughput genome sequencing performed in this study as well as available WGS data, bioinformatic analyses of repeated sequences from 12 flax samples were carried out using a graph-based clustering method. RESULTS It was found that the genomes of closely related species, which have a similar karyotype structure, are also similar in the repeatome composition. In contrast, the repeatomes of karyologically distinct species differed significantly, and no similar tandem-organized repeats have been identified in their genomes. At the same time, many common mobile element families have been identified in genomes of all species, among them, Athila Ty3/gypsy LTR retrotransposon was the most abundant. The 30-chromosome members of the sect. Linum (including the cultivated species L. usitatissimum) differed significantly from other studied species by a great number of satellite DNA families as well as their relative content in genomes. CONCLUSIONS The evolution of studied flax species was accompanied by waves of amplification of satellite DNAs and LTR retrotransposons. The observed inverse correlation between the total contents of dispersed repeats and satellite DNAs allowed to suggest a relationship between both classes of repeating sequences. Significant interspecific differences in satellite DNA sets indicated a high rate of evolution of this genomic fraction. The phylogenetic relationships between the investigated flax species, obtained by comparison of the repeatomes, agreed with the results of previous molecular phylogenetic studies.
Affiliation(s)
- Nadezhda L. Bolsheva: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Nataliya V. Melnikova: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Ilya V. Kirov: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
- Alexey A. Dmitriev: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- George S. Krasnov: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Alexandra V. Amosova: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Tatiana E. Samatadze: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Olga Yu. Yurkevich: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Anna V. Kudryavtseva: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Olga V. Muravenko: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
10
Dluhošová J, Ištvánek J, Nedělník J, Řepková J. Red Clover (Trifolium pratense) and Zigzag Clover (T. medium) - A Picture of Genomic Similarities and Differences. Front Plant Sci 2018; 9:724. [PMID: 29922311 PMCID: PMC5996420 DOI: 10.3389/fpls.2018.00724] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 02/22/2018] [Accepted: 05/14/2018] [Indexed: 05/29/2023]
Abstract
The genus clover (Trifolium sp.) is one of the most economically important genera in the Fabaceae family. More than 10 species are grown as manure plants or forage legumes. Red clover's (T. pratense) genome size is one of the smallest in the Trifolium genus, while many clovers with potential breeding value have much larger genomes. Zigzag clover (T. medium) is closely related to the sequenced red clover; however, its genome is approximately 7.5x larger. Currently, almost nothing is known about the architecture of this large genome and differences between these two clover species. We sequenced the T. medium genome (2n = 8x = 64) with ∼23× coverage and managed to partially assemble 492.7 Mbp of its genomic sequence. A thorough comparison between red clover and zigzag clover sequencing reads resulted in the successful validation of 7 T. pratense- and 45 T. medium-specific repetitive elements. The newly discovered repeats led to the set-up of the first partial T. medium karyotype. Newly discovered red clover and zigzag clover tandem repeats were summarized. The structure of centromere-specific satellite repeat resembling that of T. repens was inferred in T. pratense. Two repeats, TrM300 and TrM378, showed a specific localization into centromeres of a half of all zigzag clover chromosomes; TrM300 on eight chromosomes and TrM378 on 24 chromosomes. A comparison with the red clover draft sequence was also used to mine more than 105,000 simple sequence repeats (SSRs) and 1,170,000 single nucleotide variants (SNVs). The presented data obtained from the sequencing of zigzag clover represent the first glimpse on the genomic sequence of this species. Centromeric repeats indicated its allopolyploid origin and naturally occurring homogenization of the centromeric repeat motif was somehow prevented. Using various repeats, highly uniform 64 chromosomes were separated into eight types of chromosomes. 
Zigzag clover genome underwent substantial chromosome rearrangements and cannot be counted as a true octoploid. The resulting data, especially the large number of predicted SSRs and SNVs, may have great potential for further research of the legume family and for rapid advancements in clover breeding.
Affiliation(s)
- Jana Dluhošová: Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czechia
- Jan Ištvánek: Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czechia
- Jana Řepková: Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czechia
12
|
Novák P, Ávila Robledillo L, Koblížková A, Vrbová I, Neumann P, Macas J. TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads. Nucleic Acids Res 2017; 45:e111. [PMID: 28402514 PMCID: PMC5499541 DOI: 10.1093/nar/gkx257] [Citation(s) in RCA: 174] [Impact Index Per Article: 24.9] [Received: 01/25/2017] [Revised: 03/23/2017] [Accepted: 04/04/2017] [Indexed: 12/21/2022] Open
Abstract
Satellite DNA is one of the major classes of repetitive DNA, characterized by tandemly arranged repeat copies that form contiguous arrays up to megabases in length. This type of genomic organization makes satellite DNA difficult to assemble, which hampers characterization of satellite sequences by computational analysis of genomic contigs. Here, we present tandem repeat analyzer (TAREAN), a novel computational pipeline that circumvents this problem by detecting satellite repeats directly from unassembled short reads. The pipeline first employs graph-based sequence clustering to identify groups of reads that represent repetitive elements. Putative satellite repeats are subsequently detected by the presence of circular structures in their cluster graphs. Consensus sequences of repeat monomers are then reconstructed from the most frequent k-mers obtained by decomposing read sequences from corresponding clusters. The pipeline performance was successfully validated by analyzing low-pass genome sequencing data from five plant species where satellite DNA was previously experimentally characterized. Moreover, novel satellite repeats were predicted for the genome of Vicia faba and three of these repeats were verified by detecting their sequences on metaphase chromosomes using fluorescence in situ hybridization.
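The consensus step described in this abstract (rebuilding a repeat monomer from the most frequent k-mers) can be illustrated with a toy sketch. This is not TAREAN's implementation: the function names `count_kmers` and `greedy_monomer`, the greedy one-base extension, and the restriction to error-free, forward-strand reads are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer in the reads (forward strand only, an assumption)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def greedy_monomer(reads, k):
    """Hypothetical helper: start at the most frequent k-mer and repeatedly
    take the most frequent one-base extension until the walk closes into a
    circle. Returns one rotation of the monomer, or None if no circle forms."""
    counts = count_kmers(reads, k)
    if not counts:
        return None
    start = counts.most_common(1)[0][0]
    path = start
    current = start
    visited = {start}
    while True:
        # Successor k-mers overlap the current one by its last k-1 bases.
        best_count, base = max((counts[current[1:] + b], b) for b in "ACGT")
        if best_count == 0:
            return None  # dead end: no circular path from this k-mer
        current = current[1:] + base
        path += base
        if current == start:
            return path[:-k]  # drop the k bases that repeat the start
        if current in visited:
            return None  # looped without returning to the start
        visited.add(current)


# Toy data: reads sampled from a perfect tandem array of a 12 bp monomer.
monomer = "ATGCCGTAAGCT"
array = monomer * 10
reads = [array[i:i + 30] for i in range(0, 90, 7)]
print(greedy_monomer(reads, 5))
```

The walk returns some rotation of the monomer rather than a fixed start point, because a tandem array has no natural origin. Real reads contain sequencing errors and monomer variants, so a practical tool needs considerably more than this greedy path.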
Affiliation(s)
- Petr Novák
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Laura Ávila Robledillo
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Andrea Koblížková
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Iva Vrbová
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Pavel Neumann
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Jiří Macas
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
|
13
|
Miga KH. The Promises and Challenges of Genomic Studies of Human Centromeres. Prog Mol Subcell Biol 2017; 56:285-304. [PMID: 28840242 DOI: 10.1007/978-3-319-58592-5_12] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 12/22/2022]
Abstract
Human centromeres are genomic regions that act as sites of kinetochore assembly to ensure proper chromosome segregation during mitosis and meiosis. Although the biological importance of centromeres in genome stability and, ultimately, cell viability is well understood, the complete sequence content and organization of these multi-megabase-sized regions remain unknown. The lack of a high-resolution reference assembly inhibits standard bioinformatics protocols, and as a result, sequence-based studies involving human centromeres lag far behind the advances made for the non-repetitive sequences in the human genome. In this chapter, I introduce what is known about the genomic organization in the highly repetitive regions spanning human centromeres, and discuss the challenges these sequences pose for assembly, alignment, and data interpretation. Overcoming these obstacles is expected to usher in a new era for centromere genomics, which will offer new discoveries in basic cell biology and human biomedical research.
Affiliation(s)
- Karen H Miga
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA, USA
|
14
|
Ribeiro T, Marques A, Novák P, Schubert V, Vanzela ALL, Macas J, Houben A, Pedrosa-Harand A. Centromeric and non-centromeric satellite DNA organisation differs in holocentric Rhynchospora species. Chromosoma 2016; 126:325-335. [DOI: 10.1007/s00412-016-0616-3] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 06/17/2016] [Revised: 08/30/2016] [Accepted: 09/01/2016] [Indexed: 12/15/2022]
|
15
|
Sevim V, Bashir A, Chin CS, Miga KH. Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing. Bioinformatics 2016; 32:1921-1924. [PMID: 27153570 PMCID: PMC4920115 DOI: 10.1093/bioinformatics/btw101] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 10/05/2015] [Accepted: 02/17/2016] [Indexed: 11/13/2022] Open
Abstract
Motivation: Long arrays of near-identical tandem repeats are a common feature of centromeric and subtelomeric regions in complex genomes. These sequences present a source of repeat structure diversity that is commonly ignored by standard genomic tools. Unlike reads shorter than the underlying repeat structure that rely on indirect inference methods, e.g. assembly, long reads allow direct inference of satellite higher order repeat structure. To automate characterization of local centromeric tandem repeat sequence variation we have designed Alpha-CENTAURI (ALPHA satellite CENTromeric AUtomated Repeat Identification), that takes advantage of Pacific Bioscience long-reads from whole-genome sequencing datasets. By operating on reads prior to assembly, our approach provides a more comprehensive set of repeat-structure variants and is not impacted by rearrangements or sequence underrepresentation due to misassembly. Results: We demonstrate the utility of Alpha-CENTAURI in characterizing repeat structure for alpha satellite containing reads in the hydatidiform mole (CHM1, haploid-like) genome. The pipeline is designed to report local repeat organization summaries for each read, thereby monitoring rearrangements in repeat units, shifts in repeat orientation and sites of array transition into non-satellite DNA, typically defined by transposable element insertion. We validate the method by showing consistency with existing centromere high order repeat references. Alpha-CENTAURI can, in principle, run on any sequence data, offering a method to generate a sequence repeat resolution that could be readily performed using consensus sequences available for other satellite families in genomes without high-quality reference assemblies. Availability and implementation: Documentation and source code for Alpha-CENTAURI are freely available at http://github.com/volkansevim/alpha-CENTAURI. 
Contact: ali.bashir@mssm.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Volkan Sevim
  - Pacific Biosciences, Inc., Menlo Park, CA 94025, USA
- Ali Bashir
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1425 Madison Avenue, New York, NY 10029, USA
- Karen H Miga
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
|
16
|
Schrumpfová PP, Vychodilová I, Hapala J, Schořová Š, Dvořáček V, Fajkus J. Telomere binding protein TRB1 is associated with promoters of translation machinery genes in vivo. Plant Mol Biol 2016; 90:189-206. [PMID: 26597966 DOI: 10.1007/s11103-015-0409-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 07/29/2015] [Accepted: 11/16/2015] [Indexed: 05/24/2023]
Abstract
Recently we characterised TRB1, a protein from a single-myb-histone family, as a structural and functional component of telomeres in Arabidopsis thaliana. TRB proteins, besides their ability to bind specifically to telomeric DNA using their N-terminally positioned myb-like domain of the same type as in human shelterin proteins TRF1 or TRF2, also possess a histone-like domain which is involved in protein-protein interactions e.g., with POT1b. Here we set out to investigate the genome-wide localization pattern of TRB1 to reveal its preferential sites of binding to chromatin in vivo and its potential functional roles in the genome-wide context. Our results demonstrate that TRB1 is preferentially associated with promoter regions of genes involved in ribosome biogenesis, in addition to its roles at telomeres. This preference coincides with the frequent occurrence of telobox motifs in the upstream regions of genes in this category, but it is not restricted to the presence of a telobox. We conclude that TRB1 shows a specific genome-wide distribution pattern which suggests its role in regulation of genes involved in biogenesis of the translational machinery, in addition to its preferential telomeric localization.
Affiliation(s)
- Petra Procházková Schrumpfová
  - Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Ivona Vychodilová
  - Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Jan Hapala
  - Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Šárka Schořová
  - Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
- Vojtěch Dvořáček
  - Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, 61265, Brno, Czech Republic
- Jiří Fajkus
  - Mendel Centre for Plant Genomics and Proteomics, CEITEC - Central European Institute of Technology, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Laboratory of Functional Genomics and Proteomics, National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 5, 625 00, Brno, Czech Republic
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, 61265, Brno, Czech Republic
|
17
|
Pavlek M, Gelfand Y, Plohl M, Meštrović N. Genome-wide analysis of tandem repeats in Tribolium castaneum genome reveals abundant and highly dynamic tandem repeat families with satellite DNA features in euchromatic chromosomal arms. DNA Res 2015; 22:387-401. [PMID: 26428853 PMCID: PMC4675708 DOI: 10.1093/dnares/dsv021] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 06/30/2015] [Accepted: 08/26/2015] [Indexed: 12/31/2022] Open
Abstract
Although satellite DNAs are well-explored components of heterochromatin and centromeres, little is known about emergence, dispersal and possible impact of comparably structured tandem repeats (TRs) on the genome-wide scale. Our bioinformatics analysis of assembled Tribolium castaneum genome disclosed significant contribution of TRs in euchromatic chromosomal arms and clear predominance of satellite DNA-typical 170 bp monomers in arrays of ≥5 repeats. By applying different experimental approaches, we revealed that the nine most prominent TR families Cast1-Cast9 extracted from the assembly comprise ∼4.3% of the entire genome and reside almost exclusively in euchromatic regions. Among them, seven families that build ∼3.9% of the genome are based on ∼170 and ∼340 bp long monomers. Results of phylogenetic analyses of 2500 monomers originating from these families show high-sequence dynamics, evident by extensive exchanges between arrays on non-homologous chromosomes. In addition, our analysis shows that concerted evolution acts more efficiently on longer than on shorter arrays. Efficient genome-wide distribution of nine TR families implies the role of transposition only in expansion of the most dispersed family, and involvement of other mechanisms is anticipated. Despite similarities in sequence features, FISH experiments indicate high-level compartmentalization of centromeric and euchromatic tandem repeats.
Affiliation(s)
- Martina Pavlek
  - Ruđer Bošković Institute, Bijenička 54, Zagreb HR-10002, Croatia
- Yevgeniy Gelfand
  - Laboratory for Biocomputing and Informatics, Boston University, Boston, MA 02215, USA
- Miroslav Plohl
  - Ruđer Bošković Institute, Bijenička 54, Zagreb HR-10002, Croatia
|
18
|
Garrido-Ramos MA. Satellite DNA in Plants: More than Just Rubbish. Cytogenet Genome Res 2015; 146:153-170. [PMID: 26202574 DOI: 10.1159/000437008] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Accepted: 05/20/2015] [Indexed: 11/19/2022] Open
Abstract
For decades, satellite DNAs have been the hidden part of genomes. Initially considered as junk DNA, there is currently an increasing appreciation of the functional significance of satellite DNA repeats and of their sequences. Satellite DNA families accumulate in the heterochromatin in different parts of the eukaryotic chromosomes, mainly in pericentromeric and subtelomeric regions, but they also span the functional centromere. Tandem repeat sequences may spread from subtelomeric to interstitial loci, leading to the formation of chromosome-specific loci or to the accumulation in equilocal sites in different chromosomes. They also appear as the main components of the heterochromatin in the sex-specific region of sex chromosomes. Satellite DNA, required for chromosome organization, also plays a role in pairing and segregation. Some satellite repeats are transcribed and can participate in the formation and maintenance of heterochromatin structure and in the modulation of gene expression. In addition to the identification of the different satellite DNA families, their characteristics and location, we are interested in determining their impact on the genomes, by identifying the mechanisms leading to their appearance and amplification as well as in understanding how they change over time, the factors affecting these changes, and the influence exerted by the evolutionary history of the organisms. On the other hand, satellite DNA sequences are rapidly evolving sequences that may cause reproductive barriers between organisms and promote speciation. The accumulation of experimental data collected in recent years and the emergence of new approaches based on next-generation sequencing and high-throughput genome analysis are opening new perspectives that are changing our understanding of satellite DNA. 
This review examines recent data to provide a timely update on the overall information gathered about this part of the genome, focusing on the advances in the knowledge of its origin, its evolution, and its potential functional roles.
|
19
|
Peška V, Fajkus P, Fojtová M, Dvořáčková M, Hapala J, Dvořáček V, Polanská P, Leitch AR, Sýkorová E, Fajkus J. Characterisation of an unusual telomere motif (TTTTTTAGGG)n in the plant Cestrum elegans (Solanaceae), a species with a large genome. Plant J 2015; 82:644-54. [PMID: 25828846 DOI: 10.1111/tpj.12839] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 01/26/2015] [Revised: 03/20/2015] [Accepted: 03/23/2015] [Indexed: 05/26/2023]
Abstract
The characterization of unusual telomere sequences sheds light on patterns of telomere evolution, maintenance and function. Plant species from the closely related genera Cestrum, Vestia and Sessea (family Solanaceae) lack known plant telomeric sequences. Here we characterize the telomere of Cestrum elegans, work that was a challenge because of its large genome size and few chromosomes (1C 9.76 pg; n = 8). We developed an approach that combines BAL31 digestion, which digests DNA from the ends and chromosome breaks, with next-generation sequencing (NGS), to generate data analysed in RepeatExplorer, designed for de novo repeat identification and quantification. We identify a unique repeat motif (TTTTTTAGGG)n in C. elegans, occurring in ca. 30 400 copies per haploid genome, averaging ca. 1900 copies per telomere. We demonstrate that the motif is synthesized by telomerase. The occurrence of an unusual eukaryote (TTTTTTAGGG)n telomeric motif in C. elegans represents a switch in motif from the 'typical' angiosperm telomere (TTTAGGG)n. That switch may have happened with the divergence of Cestrum, Sessea and Vestia. The shift in motif when it arose would have had profound effects on telomere activity. Thus our finding provides a unique handle to study how telomerase and telomeres responded to genetic change, studies that will shed more light on telomere function.
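The two copy-number figures in this abstract appear to be related by simple arithmetic. The sketch below assumes two telomeres per chromosome and an even spread of copies across telomeres; neither assumption is stated in the abstract, and the function name `copies_per_telomere` is hypothetical.

```python
def copies_per_telomere(copies_per_haploid_genome, haploid_chromosome_number):
    """Average motif copies per telomere, assuming two telomeres per
    chromosome and an even distribution of copies (both assumptions)."""
    telomere_count = 2 * haploid_chromosome_number
    return copies_per_haploid_genome / telomere_count


# Figures quoted in the abstract: ca. 30 400 copies per haploid genome, n = 8.
print(copies_per_telomere(30_400, 8))  # 1900.0
```

With n = 8 there are 16 telomeres per haploid set, so 30 400 copies give 1900 per telomere, matching the quoted average.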
Affiliation(s)
- Vratislav Peška
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Petr Fajkus
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Miloslava Fojtová
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Martina Dvořáčková
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Jan Hapala
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Vojtěch Dvořáček
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
- Pavla Polanská
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Andrew R Leitch
  - School of Biological and Chemical Sciences, Queen Mary University of London, London, E1 4NS, UK
- Eva Sýkorová
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
- Jiří Fajkus
  - Institute of Biophysics, Academy of Sciences of the Czech Republic, v.v.i., Královopolská 135, CZ-61265, Brno, Czech Republic
  - Faculty of Science, and CEITEC-Central European Institute of Technology, Masaryk University, Kamenice 5, CZ-62500, Brno, Czech Republic
|
20
|
Gilchrist AS, Shearman DCA, Frommer M, Raphael KA, Deshpande NP, Wilkins MR, Sherwin WB, Sved JA. The draft genome of the pest tephritid fruit fly Bactrocera tryoni: resources for the genomic analysis of hybridising species. BMC Genomics 2014; 15:1153. [PMID: 25527032 PMCID: PMC4367827 DOI: 10.1186/1471-2164-15-1153] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 08/04/2014] [Accepted: 12/12/2014] [Indexed: 01/08/2023] Open
Abstract
Background: The tephritid fruit flies include a number of economically important pests of horticulture, with a large accumulated body of research on their biology and control. Amongst the Tephritidae, the genus Bactrocera, containing over 400 species, presents various species groups of potential utility for genetic studies of speciation, behaviour or pest control. In Australia, there exists a triad of closely-related, sympatric Bactrocera species which do not mate in the wild but which, despite distinct morphologies and behaviours, can be force-mated in the laboratory to produce fertile hybrid offspring. To exploit the opportunities offered by genomics, such as the efficient identification of genetic loci central to pest behaviour and to the earliest stages of speciation, investigators require genomic resources for future investigations. Results: We produced a draft de novo genome assembly of Australia’s major tephritid pest species, Bactrocera tryoni. The male genome (650–700 Mbp) includes approximately 150 Mb of interspersed repetitive DNA sequences and 60 Mb of satellite DNA. Assessment using conserved core eukaryotic sequences indicated 98% completeness. Over 16,000 MAKER-derived gene models showed a large degree of overlap with other Dipteran reference genomes. The sequence of the ribosomal RNA transcribed unit was also determined. Unscaffolded assemblies of B. neohumeralis and B. jarvisi were then produced; comparison with B. tryoni showed that the species are more closely related than any Drosophila species pair. The similarity of the genomes was exploited to identify 4924 potentially diagnostic indels between the species, all of which occur in non-coding regions. Conclusions: This first draft B. tryoni genome resembles other dipteran genomes in terms of size and putative coding sequences.
For all three species included in this study, we have identified a comprehensive set of non-redundant repetitive sequences, including the ribosomal RNA unit, and have quantified the major satellite DNA families. These genetic resources will facilitate the further investigations of genetic mechanisms responsible for the behavioural and morphological differences between these three species and other tephritids. We have also shown how whole genome sequence data can be used to generate simple diagnostic tests between very closely-related species where only one of the species is scaffolded. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-1153) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Anthony Stuart Gilchrist
  - Evolution and Ecology Research Centre, School of Biological, Earth and Environmental Sciences, The University of New South Wales, Sydney, NSW 2052 Australia
|
21
|
Emadzade K, Jang TS, Macas J, Kovařík A, Novák P, Parker J, Weiss-Schneeweiss H. Differential amplification of satellite PaB6 in chromosomally hypervariable Prospero autumnale complex (Hyacinthaceae). Ann Bot 2014; 114:1597-608. [PMID: 25169019 PMCID: PMC4273535 DOI: 10.1093/aob/mcu178] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Indexed: 05/09/2023]
Abstract
BACKGROUND AND AIMS Chromosomal evolution, including numerical and structural changes, is a major force in plant diversification and speciation. This study addresses genomic changes associated with the extensive chromosomal variation of the Mediterranean Prospero autumnale complex (Hyacinthaceae), which includes four diploid cytotypes each with a unique combination of chromosome number (x = 5, 6, 7), rDNA loci and genome size. METHODS A new satellite repeat PaB6 has previously been identified, and monomers were reconstructed from next-generation sequencing (NGS) data of P. autumnale cytotype B⁶B⁶ (2n = 12). Monomers of all other Prospero cytotypes and species were sequenced to check for lineage-specific mutations. Copy number, restriction patterns and methylation levels of PaB6 were analysed using Southern blotting. PaB6 was localized on chromosomes using fluorescence in situ hybridization (FISH). KEY RESULTS The monomer of PaB6 is 249 bp long, contains several intact and truncated vertebrate-type telomeric repeats and is highly methylated. PaB6 is exceptional because of its high copy number and unprecedented variation among diploid cytotypes, ranging from 10⁴ to 10⁶ copies per 1C. PaB6 is always located in pericentromeric regions of several to all chromosomes. Additionally, two lineages of cytotype B⁷B⁷ (x = 7), possessing either a single or duplicated 5S rDNA locus, differ in PaB6 copy number; the ancestral condition of a single locus is associated with higher PaB6 copy numbers. CONCLUSIONS Although present in all Prospero species, PaB6 has undergone differential amplification only in chromosomally variable P. autumnale, particularly in cytotypes B⁶B⁶ and B⁵B⁵. These arose via independent chromosomal fusions from x = 7 to x = 6 and 5, respectively, accompanied by genome size increases.
The copy numbers of satellite DNA PaB6 are among the highest in angiosperms, and changes of PaB6 are exceptionally dynamic in this group of closely related cytotypes of a single species. The evolution of the PaB6 copy numbers is discussed, and it is suggested that PaB6 represents a recent and highly dynamic system originating from a small pool of ancestral repeats.
Affiliation(s)
- Khatere Emadzade
  - Department of Botany and Biodiversity Research, University of Vienna, Rennweg 14, A-1030 Vienna, Austria
- Tae-Soo Jang
  - Department of Botany and Biodiversity Research, University of Vienna, Rennweg 14, A-1030 Vienna, Austria
- Jiří Macas
  - Czech Academy of Sciences, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
- Aleš Kovařík
  - Czech Academy of Sciences, Institute of Biophysics, Brno, Czech Republic
- Petr Novák
  - Czech Academy of Sciences, Institute of Plant Molecular Biology, České Budějovice, Czech Republic
- John Parker
  - Cambridge University Botanic Garden, Cambridge CB2 1JF, UK
- Hanna Weiss-Schneeweiss
  - Department of Botany and Biodiversity Research, University of Vienna, Rennweg 14, A-1030 Vienna, Austria
|
22
|
Abstract
Power-law distributions are the main functional form for the distribution of repeat size and repeat copy number in the human genome. When the genome is broken into fragments for sequencing, the limited size of fragments and reads may prevent a unique alignment of repeat sequences to the reference sequence. Repeats in the human genome can be as long as 10⁴ bases, or 10⁵–10⁶ bases when allowing for mismatches between repeat units. Sequence reads from these regions are therefore unmappable when the read length is in the range of 10³ bases. With a read length of 1000 bases, slightly more than 1% of the assembled genome, and slightly less than 1% of the 1 kb reads, are unmappable, excluding the unassembled portion of the human genome (8% in GRCh37/hg19). The slow decay (long tail) of the power-law function implies a diminishing return in converting unmappable regions/reads to become mappable with the increase of the read length, with the understanding that increasing read length will always move toward the direction of 100% mappability.
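The relationship between read length and mappability described in this abstract can be illustrated on a toy sequence. The sketch below is not the authors' method: it assumes exact matching on the forward strand only, treats a read as unmappable when its sequence occurs more than once, and uses the hypothetical function name `unmappable_fraction`.

```python
from collections import Counter


def unmappable_fraction(genome, read_length):
    """Fraction of read start positions whose read sequence occurs more than
    once in the genome (exact, forward-strand matches only, an assumption)."""
    n_reads = len(genome) - read_length + 1
    if n_reads <= 0:
        raise ValueError("read_length exceeds genome length")
    counts = Counter(genome[i:i + read_length] for i in range(n_reads))
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / n_reads


# Toy genome with one 6-base repeat ("ACGTTG") separated by unique sequence.
toy = "ACGTTG" + "CATGAC" + "ACGTTG" + "TTAGGC"
for length in (3, 6, 8):
    print(length, unmappable_fraction(toy, length))
```

In this toy case, reads longer than the repeat become uniquely placeable, which mirrors the abstract's point that longer reads move toward full mappability while long repeats keep a tail of unmappable positions.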
Affiliation(s)
- Wentian Li
  - The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System Manhasset, NY, USA
- Jan Freudenberg
  - The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System Manhasset, NY, USA
|
23
|
Bilinski P, Distor K, Gutierrez-Lopez J, Mendoza GM, Shi J, Dawe RK, Ross-Ibarra J. Diversity and evolution of centromere repeats in the maize genome. Chromosoma 2014; 124:57-65. [PMID: 25190528 DOI: 10.1007/s00412-014-0483-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 05/12/2014] [Revised: 07/21/2014] [Accepted: 08/11/2014] [Indexed: 10/24/2022]
Abstract
Centromere repeats are found in most eukaryotes and play a critical role in kinetochore formation. Though centromere repeats exhibit considerable diversity both within and among species, little is understood about the mechanisms that drive centromere repeat evolution. Here, we use maize as a model to investigate how a complex history involving polyploidy, fractionation, and recent domestication has impacted the diversity of the maize centromeric repeat CentC. We first validate the existence of long tandem arrays of repeats in maize and other taxa in the genus Zea. Although we find considerable sequence diversity among CentC copies genome-wide, genetic similarity among repeats is highest within these arrays, suggesting that tandem duplications are the primary mechanism for the generation of new copies. Nonetheless, clustering analyses identify similar sequences among distant repeats, and simulations suggest that this pattern may be due to homoplasious mutation. Although the two ancestral subgenomes of maize have contributed nearly equal numbers of centromeres, our analysis shows that the majority of all CentC repeats derive from one of the parental genomes, with an even stronger bias when examining the largest assembled contiguous clusters. Finally, by comparing maize with its wild progenitor teosinte, we find that the abundance of CentC likely decreased after domestication, while the pericentromeric repeat Cent4 has drastically increased.
Affiliation(s)
- Paul Bilinski
- Department of Plant Sciences, University of California Davis, Davis, CA, 95616, USA
|
24
|
Altemose N, Miga KH, Maggioni M, Willard HF. Genomic characterization of large heterochromatic gaps in the human genome assembly. PLoS Comput Biol 2014; 10:e1003628. [PMID: 24831296 PMCID: PMC4022460 DOI: 10.1371/journal.pcbi.1003628] [Citation(s) in RCA: 81] [Impact Index Per Article: 8.1] [Received: 08/02/2013] [Accepted: 03/26/2014] [Indexed: 01/24/2023] Open
Abstract
The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.
At least 5–10% of the human genome remains unassembled, unmapped, and poorly characterized. The reference assembly annotates these missing regions as multi-megabase heterochromatic gaps, found primarily near centromeres and on the short arms of the acrocentric chromosomes. This missing fraction of the genome consists predominantly of long arrays of near-identical tandem repeats called satellite DNA. Due to the repetitive nature of satellite DNA, sequence assembly algorithms cannot uniquely align overlapping sequence reads, and thus satellite-rich domains have been omitted from the reference assembly and from most genome-wide studies of variation and function. Existing methods for analyzing some satellite DNAs cannot be easily extended to a large portion of satellites whose repeat structures are complex and largely uncharacterized, such as Human Satellites 2 and 3 (HSat2,3). Here we characterize HSat2,3 using a novel approach that does not depend on having a well-defined repeat structure. By classifying genome-wide HSat2,3 sequences into subfamilies and localizing them to chromosomes, we have generated an initial HSat2,3 genomic reference, which serves as a critical foundation for future studies of variation and function in these regions. This approach should be generally applicable to other classes of satellite DNA, in both the human genome and other complex genomes.
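As a rough illustration of what "alignment-free" comparison of reads can look like, the sketch below profiles each read by its k-mer counts and compares profiles by cosine similarity. It is not the method of the cited paper: the function names, the choice of k = 5, the similarity measure, and the toy reads are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=5):
    """Count every overlapping k-mer in a read; no alignment is involved."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(p, q):
    """Cosine similarity between two k-mer count profiles (0.0 if either is empty)."""
    dot = sum(p[kmer] * q[kmer] for kmer in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical toy reads: two tandem-repeat-like reads and one unrelated read.
read_a = "CATTC" * 8
read_b = "CATTC" * 6 + "CATAC" + "CATTC"
read_c = "ACGTTGCAAGCTTAGGCTAACGTAGCTAGGATCCGATTACG"

sim_ab = cosine_similarity(kmer_profile(read_a), kmer_profile(read_b))
sim_ac = cosine_similarity(kmer_profile(read_a), kmer_profile(read_c))
print(f"repeat vs. variant repeat: {sim_ab:.3f}")
print(f"repeat vs. unrelated read: {sim_ac:.3f}")
```

Reads sharing a repeat unit score close to each other without any reference sequence, which is the property that makes this family of approaches usable where no consistent repeat unit exists.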
Affiliation(s)
- Nicolas Altemose
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America
- Karen H. Miga
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America
- Mauro Maggioni
- Department of Mathematics, Duke University, Durham, North Carolina, United States of America
- Huntington F. Willard
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America
|
25
|
Centromere identity from the DNA point of view. Chromosoma 2014; 123:313-25. [PMID: 24763964 PMCID: PMC4107277 DOI: 10.1007/s00412-014-0462-0] [Citation(s) in RCA: 129] [Impact Index Per Article: 12.9] [Received: 11/27/2013] [Revised: 03/28/2014] [Accepted: 04/01/2014] [Indexed: 02/05/2023]
Abstract
The centromere is a chromosomal locus responsible for the faithful segregation of genetic material during cell division. It has become evident that centromeres can be established literally on any DNA sequence, and the possible synergy between DNA sequences and the most prominent centromere identifiers, protein components, and epigenetic marks remains uncertain. However, some evolutionary preferences seem to exist, and long-term established centromeres are frequently formed on long arrays of satellite DNAs and/or transposable elements. Recent progress in understanding functional centromere sequences is based largely on the high-resolution DNA mapping of sequences that interact with the centromere-specific histone H3 variant, the most reliable marker of active centromeres. In addition, sequence assembly and mapping of large repetitive centromeric regions, as well as comparative genome analyses offer insight into their complex organization and evolution. The rapidly advancing field of transcription in centromere regions highlights the functional importance of centromeric transcripts. Here, we comprehensively review the current state of knowledge on the composition and functionality of DNA sequences underlying active centromeres and discuss their contribution to the functioning of different centromere types in higher eukaryotes.
|
26
|
Iwata A, Tek AL, Richard MMS, Abernathy B, Fonsêca A, Schmutz J, Chen NWG, Thareau V, Magdelenat G, Li Y, Murata M, Pedrosa-Harand A, Geffroy V, Nagaki K, Jackson SA. Identification and characterization of functional centromeres of the common bean. The Plant Journal: for Cell and Molecular Biology 2013; 76:47-60. [PMID: 23795942 DOI: 10.1111/tpj.12269] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 03/02/2013] [Revised: 06/15/2013] [Accepted: 06/20/2013] [Indexed: 05/07/2023]
Abstract
In higher eukaryotes, centromeres are typically composed of megabase-sized arrays of satellite repeats that evolve rapidly and homogenize within a species' genome. Despite the importance of centromeres, our knowledge is limited to a few model species. We conducted a comprehensive analysis of common bean (Phaseolus vulgaris) centromeric satellite DNA using genomic data, fluorescence in situ hybridization (FISH), immunofluorescence and chromatin immunoprecipitation (ChIP). Two unrelated centromere-specific satellite repeats, CentPv1 and CentPv2, and the common bean centromere-specific histone H3 (PvCENH3) were identified. FISH showed that CentPv1 and CentPv2 are predominantly located at subsets of eight and three centromeres, respectively. Immunofluorescence- and ChIP-based assays demonstrated the functional significance of CentPv1 and CentPv2 at centromeres. Genomic analysis revealed several interesting features of CentPv1 and CentPv2: (i) CentPv1 is organized into a higher-order repeat structure, named Nazca, of 528 bp, whereas CentPv2 is composed of tandemly organized monomers; (ii) CentPv1 and CentPv2 have undergone chromosome-specific homogenization; and (iii) CentPv1 and CentPv2 are not likely to be commingled in the genome. These findings suggest that two distinct sets of centromere sequences have evolved independently within the common bean genome, and provide insight into centromere satellite evolution.
Affiliation(s)
- Aiko Iwata
- Center for Applied Genetic Technologies and Institute for Plant Breeding Genetics, and Genomics, University of Georgia, Athens, GA, 30602, USA
|
27
|
Heckmann S, Macas J, Kumke K, Fuchs J, Schubert V, Ma L, Novák P, Neumann P, Taudien S, Platzer M, Houben A. The holocentric species Luzula elegans shows interplay between centromere and large-scale genome organization. The Plant Journal: for Cell and Molecular Biology 2013; 73:555-65. [PMID: 23078243 DOI: 10.1111/tpj.12054] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 08/28/2012] [Revised: 10/11/2012] [Accepted: 10/16/2012] [Indexed: 05/18/2023]
Abstract
In higher plants, the large-scale structure of monocentric chromosomes consists of distinguishable eu- and heterochromatic regions, the proportions and organization of which depend on a species' genome size. To determine whether the same interplay is maintained for holocentric chromosomes, we investigated the distribution of repetitive sequences and epigenetic marks in the woodrush Luzula elegans (3.81 Gbp/1C). Sixty-one per cent of the L. elegans genome is characterized by highly repetitive DNA, with over 30 distinct sequence families encoding an exceptionally high diversity of satellite repeats. Over 33% of the genome is composed of the Angela clade of Ty1/copia LTR retrotransposons, which are uniformly dispersed along the chromosomes, while the satellite repeats occur as bands whose distribution appears to be biased towards the chromosome termini. No satellite showed an almost chromosome-wide distribution pattern as expected for a holocentric chromosome and no typical centromere-associated LTR retrotransposons were found either. No distinguishable large-scale patterns of eu- and heterochromatin-typical epigenetic marks or early/late DNA replicating domains were found along mitotic chromosomes, although super-high-resolution light microscopy revealed distinguishable interspersed units of various chromatin types. Our data suggest a correlation between the centromere and overall genome organization in species with holocentric chromosomes.
Affiliation(s)
- Stefan Heckmann
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstrasse 3, 06466, Gatersleben, Germany
|
28
|
Čížková J, Hřibová E, Humplíková L, Christelová P, Suchánková P, Doležel J. Molecular analysis and genomic organization of major DNA satellites in banana (Musa spp.). PLoS One 2013; 8:e54808. [PMID: 23372772 PMCID: PMC3553004 DOI: 10.1371/journal.pone.0054808] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 07/20/2012] [Accepted: 12/17/2012] [Indexed: 02/03/2023] Open
Abstract
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands of nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extrachromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed-sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity were analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between, Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana, which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes that can be identified in Musa.
Affiliation(s)
- Jana Čížková
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Eva Hřibová
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Lenka Humplíková
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Pavla Christelová
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Pavla Suchánková
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Jaroslav Doležel
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
|
29
|
Steflova P, Tokan V, Vogel I, Lexa M, Macas J, Novak P, Hobza R, Vyskot B, Kejnovsky E. Contrasting patterns of transposable element and satellite distribution on sex chromosomes (XY1Y2) in the dioecious plant Rumex acetosa. Genome Biol Evol 2013; 5:769-82. [PMID: 23542206 PMCID: PMC3641822 DOI: 10.1093/gbe/evt049] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Accepted: 03/25/2013] [Indexed: 12/23/2022] Open
Abstract
Rumex acetosa is a dioecious plant with the XY1Y2 sex chromosome system. Both Y chromosomes are heterochromatic and are thought to be degenerated. We performed low-pass 454 sequencing and similarity-based clustering of male and female genomic 454 reads to identify and characterize major groups of R. acetosa repetitive DNA. We found that Copia and Gypsy retrotransposons dominated, followed by DNA transposons and non-long terminal repeat retrotransposons. CRM and Tat/Ogre retrotransposons dominated the Gypsy superfamily, whereas Maximus/Sireviruses were most abundant among Copia retrotransposons. Only one Gypsy subfamily had accumulated on Y1 and Y2 chromosomes, whereas many retrotransposons were ubiquitous on autosomes and the X chromosome, but absent on Y1 and Y2 chromosomes, and others were depleted from the X chromosome. One group of CRM Gypsy was specifically localized to centromeres. We also found that the majority of previously described satellites (RAYSI, RAYSII, RAYSIII, and RAE180) are accumulated on the Y chromosomes, where we identified a Y chromosome-specific variant of RAE180. We discovered two novel satellites: RA160, dominating on the X chromosome, and RA690, localized mostly on the Y1 chromosome. The expression pattern obtained from Illumina RNA sequencing showed that the expression of transposable elements is similar in leaves of both sexes and that satellites are also expressed. Contrasting patterns of transposable elements (TEs) and satellite localization on sex chromosomes in R. acetosa, where not only accumulation but also depletion of repetitive DNA was observed, suggest that a plethora of evolutionary processes can shape sex chromosomes.
Affiliation(s)
- Pavlina Steflova
- Department of Plant Developmental Genetics, Institute of Biophysics ASCR, Brno, Czech Republic
- Laboratory of Genome Dynamics, CEITEC—Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Viktor Tokan
- Department of Plant Developmental Genetics, Institute of Biophysics ASCR, Brno, Czech Republic
- Ivan Vogel
- Department of Plant Developmental Genetics, Institute of Biophysics ASCR, Brno, Czech Republic
- Laboratory of Genome Dynamics, CEITEC—Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Matej Lexa
- Laboratory of Genome Dynamics, CEITEC—Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- Jiri Macas
- Biology Centre ASCR, Institute of Plant Molecular Biology, Ceske Budejovice, Czech Republic
- Petr Novak
- Biology Centre ASCR, Institute of Plant Molecular Biology, Ceske Budejovice, Czech Republic
- Roman Hobza
- Department of Plant Developmental Genetics, Institute of Biophysics ASCR, Brno, Czech Republic
- Laboratory of Molecular Cytogenetics and Cytometry, Centre of the Region Haná for Biotechnological and Agricultural Research, Institute of Experimental Botany, Olomouc, Czech Republic
- Boris Vyskot
- Department of Plant Developmental Genetics, Institute of Biophysics ASCR, Brno, Czech Republic
- Eduard Kejnovsky
- Department of Plant Developmental Genetics, Institute of Biophysics ASCR, Brno, Czech Republic
- Laboratory of Genome Dynamics, CEITEC—Central European Institute of Technology, Masaryk University, Brno, Czech Republic
|
30
|
Hayden KE, Willard HF. Composition and organization of active centromere sequences in complex genomes. BMC Genomics 2012; 13:324. [PMID: 22817545 PMCID: PMC3422206 DOI: 10.1186/1471-2164-13-324] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 02/21/2012] [Accepted: 07/20/2012] [Indexed: 01/13/2023] Open
Abstract
Background: Centromeres are sites of chromosomal spindle attachment during mitosis and meiosis. While the sequence basis for centromere identity remains a subject of considerable debate, one approach is to examine the genomic organization at these active sites that are correlated with epigenetic marks of centromere function. Results: We have developed an approach to characterize both satellite and non-satellite centromeric sequences that are missing from current assemblies in complex genomes, using the dog genome as an example. Combining this genomic reference with an epigenetic dataset corresponding to sequences associated with the histone H3 variant centromere protein A (CENP-A), we identify active satellite sequence domains that appear to be both functionally and spatially distinct within the overall definition of satellite families. Conclusions: These findings establish a genomic and epigenetic foundation for exploring the functional role of centromeric sequences in the previously sequenced dog genome and provide a model for similar studies within the context of less-characterized genomes.
Affiliation(s)
- Karen E Hayden
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, NC, USA.
|
31
|
Macas J, Kejnovský E, Neumann P, Novák P, Koblížková A, Vyskot B. Next generation sequencing-based analysis of repetitive DNA in the model dioecious [corrected] plant Silene latifolia. PLoS One 2011; 6:e27335. [PMID: 22096552 PMCID: PMC3212565 DOI: 10.1371/journal.pone.0027335] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.3] [Received: 09/20/2011] [Accepted: 10/14/2011] [Indexed: 01/04/2023] Open
Abstract
Background: Silene latifolia is a dioecious plant with well-distinguished X and Y chromosomes that is used as a model to study sex determination and sex chromosome evolution in plants. However, efficient utilization of this species has been hampered by the lack of large-scale sequencing resources and detailed analysis of its genome composition, especially with respect to repetitive DNA, which makes up the majority of the genome. Methodology/Principal Findings: We performed low-pass 454 sequencing followed by similarity-based clustering of 454 reads in order to identify and characterize sequences of all major groups of S. latifolia repeats. Illumina sequencing data from male and female genomes were also generated and employed to quantify the genomic proportions of individual repeat families. The majority of identified repeats belonged to LTR-retrotransposons, constituting about 50% of genomic DNA, with Ty3/gypsy elements being more frequent than Ty1/copia. While there were differences between the male and female genome in the abundance of several repeat families, their overall repeat composition was highly similar. Specific localization patterns on sex chromosomes were found for several satellite repeats using in situ hybridization with probes based on k-mer frequency analysis of Illumina sequencing data. Conclusions/Significance: This study provides comprehensive information about the sequence composition and abundance of repeats representing over 60% of the S. latifolia genome. The results revealed generally low divergence in repeat composition between the sex chromosomes, which is consistent with their relatively recent origin. In addition, the study generated various data resources that are available for future exploration of the S. latifolia genome.
Affiliation(s)
- Jiří Macas
- Biology Centre of the Academy of Sciences of the Czech Republic, Institute of Plant Molecular Biology, České Budějovice, Czech Republic.
|
32
|
Organization and evolution of subtelomeric satellite repeats in the potato genome. G3: Genes, Genomes, Genetics 2011; 1:85-92. [PMID: 22384321 PMCID: PMC3276127 DOI: 10.1534/g3.111.000125] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 03/15/2011] [Accepted: 05/03/2011] [Indexed: 12/30/2022]
Abstract
Subtelomeric domains immediately adjacent to telomeres represent one of the most dynamic and rapidly evolving regions in eukaryotic genomes. A common feature associated with subtelomeric regions in different eukaryotes is the presence of long arrays of tandemly repeated satellite sequences. However, studies on molecular organization and evolution of subtelomeric repeats are rare. We isolated two subtelomeric repeats, CL14 and CL34, from potato (Solanum tuberosum). The CL14 and CL34 repeats are organized as independent long arrays, up to 1-3 Mb, of 182 bp and 339 bp monomers, respectively. The CL14 and CL34 repeat arrays are directly connected with the telomeric repeats at some chromosomal ends. The CL14 repeat was detected at the subtelomeric regions among highly diverged Solanum species, including tomato (Solanum lycopersicum). In contrast, CL34 was only found in potato and its closely related species. Interestingly, the CL34 repeat array was always proximal to the telomeres when both CL14 and CL34 were found at the same chromosomal end. In addition, the CL34 repeat family showed more sequence variability among monomers compared with the CL14 repeat family. We conclude that the CL34 repeat family emerged recently from the subtelomeric regions of potato chromosomes and is rapidly evolving. These results provide further evidence that subtelomeric domains are among the most dynamic regions in eukaryotic genomes.
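To give a sense of scale for the array sizes quoted above, the following back-of-the-envelope sketch converts array length into an approximate monomer copy number. It assumes 1 Mb = 1,000,000 bp and an array made purely of uninterrupted, full-length monomers; neither assumption comes from the abstract, and the helper name is hypothetical.

```python
def monomers_per_array(array_bp, monomer_bp):
    """Whole monomers that fit in an array, assuming no interruptions or variants."""
    return array_bp // monomer_bp


MB = 1_000_000  # assumption: 1 Mb taken as 10^6 bp

# Monomer lengths as stated in the abstract; array sizes span the quoted 1-3 Mb range.
for name, monomer_bp in (("CL14", 182), ("CL34", 339)):
    low = monomers_per_array(1 * MB, monomer_bp)
    high = monomers_per_array(3 * MB, monomer_bp)
    print(f"{name}: roughly {low} to {high} monomers per 1-3 Mb array")
```

Under these assumptions a 1-3 Mb array corresponds to roughly 5,500 to 16,500 copies of the 182 bp CL14 monomer, or roughly 2,900 to 8,800 copies of the 339 bp CL34 monomer.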
|