1
Gálvez-Galván A, Garrido-Ramos MA, Prieto P. Bread wheat satellitome: a complex scenario in a huge genome. PLANT MOLECULAR BIOLOGY 2024; 114:8. [PMID: 38291213 PMCID: PMC10827815 DOI: 10.1007/s11103-023-01404-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 11/01/2023] [Indexed: 02/01/2024]
Abstract
In bread wheat (Triticum aestivum L.), chromosome associations during meiosis are tightly regulated and initiate at the telomeres and subtelomeres, which are enriched in satellite DNA (satDNA). We present the study and characterization of the bread wheat satellitome to shed light on the molecular organization of wheat subtelomeres. Our results revealed that 2.53% of the bread wheat genome is composed of satDNA and that subtelomeres are particularly enriched in such sequences. Thirty-four satDNAs (21 described for the first time in this work) have been identified, analyzed and cytogenetically validated. Many of the satDNAs were found specifically at particular subtelomeric chromosome regions, revealing asymmetry in subtelomere organisation among the wheat subgenomes, which might play a role in proper homologous recognition and pairing during meiosis. An integrated physical map of the wheat satellitome was also constructed. To the best of our knowledge, this combination of cytogenetics and genome research provides the first comprehensive analysis of the wheat satellitome, shedding light on the complex organization of the wheat genome, especially on the polymorphic nature of subtelomeres and their putative implication in chromosome recognition and pairing during meiosis.
Affiliation(s)
- Ana Gálvez-Galván
  - Plant Breeding Department, Institute for Sustainable Agriculture, Agencia Estatal Consejo Superior de Investigaciones Científicas (CSIC), Avda. Menéndez Pidal, Campus Alameda del Obispo S/N, 14004, Córdoba, Spain
- Manuel A Garrido-Ramos
  - Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Avda. Fuentenueva S/N, 18071, Granada, Spain
- Pilar Prieto
  - Plant Breeding Department, Institute for Sustainable Agriculture, Agencia Estatal Consejo Superior de Investigaciones Científicas (CSIC), Avda. Menéndez Pidal, Campus Alameda del Obispo S/N, 14004, Córdoba, Spain
2
Kirov I, Kolganova E, Dudnikov M, Yurkevich OY, Amosova AV, Muravenko OV. A Pipeline NanoTRF as a New Tool for De Novo Satellite DNA Identification in the Raw Nanopore Sequencing Reads of Plant Genomes. PLANTS (BASEL, SWITZERLAND) 2022; 11:2103. [PMID: 36015406 PMCID: PMC9413040 DOI: 10.3390/plants11162103] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 08/08/2022] [Accepted: 08/11/2022] [Indexed: 06/15/2023]
Abstract
High-copy tandemly organized repeats (TRs), or satellite DNA, are an important but still enigmatic component of eukaryotic genomes. TRs comprise arrays of multi-copy and highly similar tandem repeats, which makes the elucidation of TRs a very challenging task. Oxford Nanopore sequencing data provide a valuable source of information on TR organization at the single-molecule level. However, bioinformatics tools for de novo identification of TRs in raw Nanopore data have not been reported so far. We developed NanoTRF, a new Python pipeline for TR identification, characterization and consensus monomer sequence assembly. This new pipeline requires only a raw Nanopore read file from low-depth (<1×) genome sequencing. The program generates an informative HTML report and figures on TR genome abundance, monomer sequence and monomer length. In addition, NanoTRF annotates transposable element (TE) sequences within or near satDNA arrays, and this information can be used to elucidate how TRs and TEs co-evolve in the genome. Moreover, we validated by FISH that the NanoTRF report is useful for evaluating whether TR chromosome organization is clustered or dispersed. Our findings showed that NanoTRF is a robust method for the de novo identification of satellite repeats in raw Nanopore data without prior read assembly. The obtained sequences can be used in many downstream analyses, including genome assembly assistance and gap estimation, chromosome mapping and cytogenetic marker development.
Affiliation(s)
- Ilya Kirov
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
- Elizaveta Kolganova
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia
- Maxim Dudnikov
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, Moscow 127550, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny 141701, Russia
- Olga Yu. Yurkevich
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
- Alexandra V. Amosova
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
- Olga V. Muravenko
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow 119991, Russia
3
Garrido-Ramos MA. The Genomics of Plant Satellite DNA. PROGRESS IN MOLECULAR AND SUBCELLULAR BIOLOGY 2021; 60:103-143. [PMID: 34386874 DOI: 10.1007/978-3-030-74889-0_5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 02/09/2023]
Abstract
The twenty-first century began with a certain indifference to research on satellite DNA (satDNA). Genome sequencing projects were unable to accurately encompass the study of satDNA, and classic methodologies could not go further toward a comprehensive study of the whole set of satDNA sequences of a genome. Nonetheless, knowledge of satDNA has progressively advanced during this century with the advent of new analytical techniques. The enormous advantages that genome-wide approaches have brought to its analysis have now stimulated a renewed interest in the study of satDNA. At this point, we can look back and try to assess more accurately many of the key questions that were left unsolved in the past about this enigmatic and important component of the genome. I review here the understanding gathered on plant satDNAs over the last few decades with an eye on the near future.
4
Kuo YT, Ishii T, Fuchs J, Hsieh WH, Houben A, Lin YR. The Evolutionary Dynamics of Repetitive DNA and Its Impact on the Genome Diversification in the Genus Sorghum. FRONTIERS IN PLANT SCIENCE 2021; 12:729734. [PMID: 34475879 PMCID: PMC8407070 DOI: 10.3389/fpls.2021.729734] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/23/2021] [Accepted: 07/23/2021] [Indexed: 05/11/2023]
Abstract
Polyploidization is an evolutionary event leading to structural changes of the genome(s), particularly allopolyploidization, which combines the genomes of distinct species. The tetraploid species Sorghum halepense is assumed to be an allopolyploid formed by hybridization between diploid S. bicolor and S. propinquum. The repeat profiles of S. bicolor, S. halepense, and their relatives were compared to elucidate the repeats' role in shaping their genomes. The repeat frequencies and profiles of the three diploid accessions (S. bicolor, S. bicolor ssp. verticilliflorum, and S. bicolor var. technicum) and two tetraploid accessions (S. halepense) are similar. However, the polymorphic distribution of the subtelomeric satellites preferentially enriched in the tetraploid S. halepense indicates drastic genome rearrangements after the allopolyploidization event. As verified by CENH3 chromatin immunoprecipitation (ChIP)-sequencing and fluorescence in situ hybridization (FISH) analysis, the centromeres of S. bicolor are mainly composed of the abundant satellite SorSat137 (CEN38) and diverse CRMs, Athila of Ty3_gypsy and Ty1_copia-SIRE long terminal repeat (LTR) retroelements. A similar centromere composition was found in S. halepense. The potential contribution of S. bicolor to the formation of tetraploid S. halepense is discussed.
Affiliation(s)
- Yi-Tzu Kuo
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
  - Department of Agronomy, National Taiwan University, Taipei, Taiwan
- Takayoshi Ishii
  - Arid Land Research Center, Tottori University, Tottori, Japan
- Jörg Fuchs
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
- Wei-Hsun Hsieh
  - Department of Agronomy, National Taiwan University, Taipei, Taiwan
- Andreas Houben
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
  - *Correspondence: Andreas Houben
- Yann-Rong Lin
  - Department of Agronomy, National Taiwan University, Taipei, Taiwan
  - World Vegetable Center, Tainan, Taiwan
  - *Correspondence: Yann-Rong Lin
5
Sultana N, Menzel G, Heitkam T, Kojima KK, Bao W, Serçe S. Bioinformatic and Molecular Analysis of Satellite Repeat Diversity in Vaccinium Genomes. Genes (Basel) 2020; 11:E527. [PMID: 32397417 PMCID: PMC7290377 DOI: 10.3390/genes11050527] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/01/2020] [Revised: 05/06/2020] [Accepted: 05/06/2020] [Indexed: 12/11/2022]
Abstract
Bioinformatic and molecular characterization of satellite repeats was performed to understand the impact of their diversification on Vaccinium genome evolution. Satellite repeat diversity was evaluated in four cultivated and wild species, including the diploid species Vaccinium myrtillus and Vaccinium uliginosum, as well as the tetraploid species Vaccinium corymbosum and Vaccinium arctostaphylos. We comparatively characterized six satellite repeat families using a total of 76 clones with 180 monomers. We observed that the monomer units of VaccSat1, VaccSat2, VaccSat5, and VaccSat6 showed a higher order repeat (HOR) structure, likely originating from the organization of two adjacent subunits with differing similarity, length and size. Moreover, VaccSat1, VaccSat3, VaccSat6, and VaccSat7 were found to have sequence similarity to parts of transposable elements. We detected satellite-typical tandem organization for VaccSat1 and VaccSat2 in long arrays, while VaccSat5 and VaccSat6 were distributed at multiple sites over all chromosomes of tetraploid V. corymbosum, presumably in long arrays. In contrast, very short arrays of VaccSat3 and VaccSat7 are dispersed over all chromosomes in the same species, likely as internal parts of transposable elements. We provide a comprehensive overview of the species specificity of satellites in Vaccinium, which are potentially useful as molecular markers to address the taxonomic complexity of the genus, and provide information for genome studies of this genus.
Affiliation(s)
- Nusrat Sultana
  - Faculty of Life and Earth Sciences, Jagannath University, Dhaka 1100, Bangladesh
  - Faculty of Biology, Technische Universität Dresden, D-01062 Dresden, Germany
- Gerhard Menzel
  - Faculty of Biology, Technische Universität Dresden, D-01062 Dresden, Germany
- Tony Heitkam
  - Faculty of Biology, Technische Universität Dresden, D-01062 Dresden, Germany
- Kenji K. Kojima
  - Genetic Information Research Institute, Cupertino, CA 95014, USA
- Weidong Bao
  - Genetic Information Research Institute, Cupertino, CA 95014, USA
- Sedat Serçe
  - Department of Agricultural Genetic Engineering, Ayhan Şahenk Faculty of Agricultural Sciences and Technologies, Niğde Ömer Halisdemir University, 51240 Niğde, Turkey
6
Vondrak T, Ávila Robledillo L, Novák P, Koblížková A, Neumann P, Macas J. Characterization of repeat arrays in ultra-long nanopore reads reveals frequent origin of satellite DNA from retrotransposon-derived tandem repeats. THE PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY 2020; 101:484-500. [PMID: 31559657 PMCID: PMC7004042 DOI: 10.1111/tpj.14546] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 07/03/2019] [Revised: 09/09/2019] [Accepted: 09/12/2019] [Indexed: 05/21/2023]
Abstract
Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing 11 major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73× genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favourable for satellite DNA accumulation.
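The notion of a "prevailing sequence periodicity" in a long read can be illustrated with a toy sketch. The Python function below is not the authors' method; it is a minimal example, assuming an error-free read, that estimates a monomer length as the most frequent distance between consecutive occurrences of the same k-mer. The function name `estimate_period` and the parameters `k` and `max_period` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Optional


def estimate_period(read: str, k: int = 8, max_period: int = 2000) -> Optional[int]:
    """Estimate the dominant tandem-repeat period of a read.

    Illustrative only: assumes an error-free sequence. For every k-mer we
    record the distance to its previous occurrence; in a tandem array the
    most frequent such distance equals the monomer length.
    """
    last_seen = {}          # k-mer -> position of its most recent occurrence
    distances = Counter()   # distance -> number of times it was observed

    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        prev = last_seen.get(kmer)
        if prev is not None:
            d = i - prev
            if d <= max_period:
                distances[d] += 1
        last_seen[kmer] = i

    if not distances:
        return None  # no k-mer occurred twice, so no periodicity was found
    return distances.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 37 bp monomer repeated 20 times to mimic a satellite array.
    monomer = "GATTACAGGCTTCAGTCCATGAACGTTAGCCTAGATC"
    print(estimate_period(monomer * 20))
```

Real nanopore reads contain sequencing errors, so a practical analysis would likely need shorter k-mers, some tolerance around the peak distance, or an alignment-based approach; this sketch ignores all of that.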
Affiliation(s)
- Tihana Vondrak
  - Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
  - Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
- Laura Ávila Robledillo
  - Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
  - Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
- Petr Novák
  - Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Andrea Koblížková
  - Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Pavel Neumann
  - Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Jiří Macas
  - Biology Centre, Czech Academy of Sciences, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
7
Ruiz-Ruano FJ, Navarro-Domínguez B, Camacho JPM, Garrido-Ramos MA. Characterization of the satellitome in lower vascular plants: the case of the endangered fern Vandenboschia speciosa. ANNALS OF BOTANY 2019; 123:587-599. [PMID: 30357311 PMCID: PMC6417484 DOI: 10.1093/aob/mcy192] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 03/23/2018] [Accepted: 10/04/2018] [Indexed: 06/08/2023]
Abstract
BACKGROUND AND AIMS: Vandenboschia speciosa is a highly vulnerable fern species, with a large genome (10.5 Gb). Haploid gametophytes and diploid sporophytes are perennial, can reproduce vegetatively, and certain populations are composed only of independent gametophytes. These features make this fern a good model: (1) for high-throughput analysis of satellite DNA (satDNA) to investigate possible evolutionary trends in satDNA sequence features; (2) to determine the relative contribution of satDNA and other repetitive DNAs to its large genome; and (3) to analyse whether the reproduction mode or phase alternation between long-lasting haploid and diploid stages influences satDNA abundance or divergence.
METHODS: We analysed the repetitive fraction of the genome of this species in three different populations (one comprised only of independent gametophytes) using Illumina sequencing and bioinformatic analysis with RepeatExplorer and satMiner.
KEY RESULTS: The satellitome of V. speciosa is composed of 11 satDNA families, most of them showing a short repeat length and being A + T rich. Some satDNAs had complex repeats composed of sub-repeats, showing high similarity to shorter satDNAs. Three families had particular structural features and highly conserved motifs. SatDNA only amounts to approx. 0.4 % of its genome. Likewise, microsatellites do not represent more than 2 %, but transposable elements (TEs) represent approx. 50 % of the sporophytic genomes. We found high resemblance in satDNA abundance and divergence between both gametophyte and sporophyte samples from the same population and between populations.
CONCLUSIONS: (1) Longer (and older) satellites in V. speciosa have a higher A + T content and evolve from shorter ones and, in some cases, microsatellites were a source of new satDNAs; (2) the satellitome does not explain the huge genome size in this species while TEs are the major repetitive component of the V. speciosa genome and mostly contribute to its large genome; and (3) reproduction mode or phase alternation between gametophytes and sporophytes does not entail accumulation or divergence of satellites.
Affiliation(s)
- F J Ruiz-Ruano
  - Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
- B Navarro-Domínguez
  - Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
- J P M Camacho
  - Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
- M A Garrido-Ramos
  - Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Granada, Spain
8
Natural History of a Satellite DNA Family: From the Ancestral Genome Component to Species-Specific Sequences, Concerted and Non-Concerted Evolution. Int J Mol Sci 2019; 20:ijms20051201. [PMID: 30857296 PMCID: PMC6429384 DOI: 10.3390/ijms20051201] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 01/25/2019] [Revised: 03/04/2019] [Accepted: 03/06/2019] [Indexed: 12/20/2022]
Abstract
Satellite DNA (satDNA) is the most variable fraction of the eukaryotic genome. Related species share a common ancestral satDNA library, and a change in any library component in a particular lineage results in interspecific differences. Although the general developmental trend is clear, our knowledge of the origin and dynamics of satDNAs is still fragmentary. Here, we explore whole genome shotgun Illumina reads using the RepeatExplorer (RE) pipeline to infer satDNA family life stories in the genomes of Chenopodium species. The seven diploids studied represent separate lineages and provide an example of a species complex typical for angiosperms. Applying the RE pipeline together with similarity searches allowed us to identify a satDNA family with a basic monomer of ~40 bp and to trace its transformation from the reconstructed ancestral sequence to the species-specific sequences. As a result, three types of satDNA family evolutionary development were distinguished: (i) concerted evolution with mutation and recombination events; (ii) concerted evolution with a trend toward increased complexity and length of the satellite monomer; and (iii) non-concerted evolution, with low levels of homogenization and multidirectional trends. The third type is an example of entire repeatome transformation, thus producing a novel set of satDNA families, and genomes showing non-concerted evolution are proposed as a significant source of genomic diversity.
9
Yang S, Qin X, Cheng C, Li Z, Lou Q, Li J, Chen J. Organization and evolution of four differentially amplified tandem repeats in the Cucumis hystrix genome. PLANTA 2017; 246:749-761. [PMID: 28668977 DOI: 10.1007/s00425-017-2716-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/26/2017] [Accepted: 05/29/2017] [Indexed: 05/18/2023]
Abstract
Three subtelomeric satellites and one interstitial 5S rDNA were characterized in Cucumis hystrix, and the pericentromeric signals of two C. hystrix subtelomeric satellites along C. sativus chromosomes supported the hypothesis of chromosome fusion in Cucumis. Tandem repeats are chromosome structural fractions consisting of highly repetitive sequences organized in large tandem arrays in most eukaryotes. Differentiation of tandem repeats directly affects the chromosome structure, which contributes to species formation and evolution. Cucumis hystrix (2n = 2x = 24) is the only wild Cucumis species grouped into the same subgenus with C. sativus (2n = 2x = 14), hence its phylogenetic position confers a vital role for C. hystrix to understand the chromosome evolution in Cucumis. However, our knowledge of C. hystrix tandem repeats is insufficient for a detailed understanding of the chromosome evolution in Cucumis. Based on de novo tandem repeat characterization using bioinformatics and in situ hybridization (ISH), we identified and characterized four differentially amplified tandem repeats, Cucumis hystrix satellite 1-3 (CuhySat1-CuhySat3) located at the subtelomeric regions of all chromosomes, and Cucumis hystrix 5S (Cuhy5S) located at the interstitial regions of one single chromosome pair. Comparative ISH mapping using CuhySat1-3 and Cuhy5S revealed high homology of tandem repeats between C. hystrix and C. sativus. Intriguingly, we found signal distribution variations of CuhySat2 and CuhySat3 on C. sativus chromosomes. In comparison to their subtelomeric signal distribution on C. hystrix chromosomes, CuhySat3 showed a pericentromeric signal distribution and CuhySat2 showed both subtelomeric and pericentromeric signal distributions on C. sativus chromosomes. This detailed characterization of four C. hystrix tandem repeats significantly widens our knowledge of the C. hystrix chromosome structure, and the observed signal distribution variations will be helpful for understanding the chromosome evolution of Cucumis.
Affiliation(s)
- Shuqiong Yang
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Xiaodong Qin
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Chunyan Cheng
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Ziang Li
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Qunfeng Lou
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Ji Li
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Jinfeng Chen
  - State Key Lab of Crop Genetics and Germplasm Enhancement, College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
10
Garrido-Ramos MA. Satellite DNA: An Evolving Topic. Genes (Basel) 2017; 8:genes8090230. [PMID: 28926993 PMCID: PMC5615363 DOI: 10.3390/genes8090230] [Citation(s) in RCA: 222] [Impact Index Per Article: 31.7] [Received: 08/03/2017] [Revised: 09/12/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Satellite DNA represents one of the most fascinating parts of the repetitive fraction of the eukaryotic genome. Since the discovery of highly repetitive tandem DNA in the 1960s, a large body of literature has covered various topics related to the structure, organization, function, and evolution of such sequences. Today, with the advent of genomic tools, the study of satellite DNA has regained great interest. Thus, Next-Generation Sequencing (NGS), together with high-throughput in silico analysis of the information contained in NGS reads, has revolutionized the analysis of the repetitive fraction of eukaryotic genomes. Taken together, the historical and current approaches to the topic give us a broad view of the function and evolution of satellite DNA and its role in chromosomal evolution. Currently, we have extensive information on the molecular, chromosomal, biological, and population factors that affect the evolutionary fate of satellite DNA, knowledge that gives rise to a series of mutually compatible hypotheses about the origin, spreading, and evolution of satellite DNA. In this paper, I review these hypotheses from a methodological, conceptual, and historical perspective and frame them in the context of chromosomal organization and evolution.
Affiliation(s)
- Manuel A Garrido-Ramos
  - Departamento de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain
11
Novák P, Ávila Robledillo L, Koblížková A, Vrbová I, Neumann P, Macas J. TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads. Nucleic Acids Res 2017. [PMID: 28402514 DOI: 10.1093/nar/gkx257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
12
Novák P, Ávila Robledillo L, Koblížková A, Vrbová I, Neumann P, Macas J. TAREAN: a computational tool for identification and characterization of satellite DNA from unassembled short reads. Nucleic Acids Res 2017; 45:e111. [PMID: 28402514 PMCID: PMC5499541 DOI: 10.1093/nar/gkx257] [Citation(s) in RCA: 174] [Impact Index Per Article: 24.9] [Received: 01/25/2017] [Revised: 03/23/2017] [Accepted: 04/04/2017] [Indexed: 12/21/2022]
Abstract
Satellite DNA is one of the major classes of repetitive DNA, characterized by tandemly arranged repeat copies that form contiguous arrays up to megabases in length. This type of genomic organization makes satellite DNA difficult to assemble, which hampers characterization of satellite sequences by computational analysis of genomic contigs. Here, we present tandem repeat analyzer (TAREAN), a novel computational pipeline that circumvents this problem by detecting satellite repeats directly from unassembled short reads. The pipeline first employs graph-based sequence clustering to identify groups of reads that represent repetitive elements. Putative satellite repeats are subsequently detected by the presence of circular structures in their cluster graphs. Consensus sequences of repeat monomers are then reconstructed from the most frequent k-mers obtained by decomposing read sequences from corresponding clusters. The pipeline performance was successfully validated by analyzing low-pass genome sequencing data from five plant species where satellite DNA was previously experimentally characterized. Moreover, novel satellite repeats were predicted for the genome of Vicia faba and three of these repeats were verified by detecting their sequences on metaphase chromosomes using fluorescence in situ hybridization.
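The idea of rebuilding a monomer from frequent k-mers, with a tandem repeat closing into a circle, can be sketched in a few lines. The Python example below is a toy illustration and not the TAREAN implementation: it assumes error-free reads drawn from a single satellite and a monomer in which every (k-1)-mer is unique. The function name `consensus_monomer` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def consensus_monomer(reads: Iterable[str], k: int = 8, max_len: int = 1000) -> Optional[str]:
    """Greedy k-mer walk that returns one rotation of the repeat monomer.

    Illustrative only: starts at the most frequent k-mer and repeatedly
    extends by the most frequent overlapping k-mer. Because a tandem repeat
    is circular in k-mer space, the walk eventually returns to its start.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    if not counts:
        return None

    start = counts.most_common(1)[0][0]
    current = start
    monomer = []

    for _ in range(max_len):
        # Choose the most frequent k-mer that overlaps `current` by k-1 bases.
        best_base, best_count = None, 0
        for base in "ACGT":
            c = counts.get(current[1:] + base, 0)
            if c > best_count:
                best_base, best_count = base, c
        if best_base is None:
            return None  # dead end: no observed k-mer extends the walk
        monomer.append(best_base)
        current = current[1:] + best_base
        if current == start:
            # The circle is closed; the appended bases form one monomer.
            return "".join(monomer)
    return None  # no cycle found within max_len steps


if __name__ == "__main__":
    # Hypothetical monomer and simulated short reads sampled from its array.
    monomer = "GATTACAGGCTTCAGTCCATGAACGTTAGCCTAGATC"
    array = monomer * 10
    reads = [array[i:i + 30] for i in range(0, len(array) - 29, 7)]
    print(consensus_monomer(reads))
```

The result is a rotation of the monomer, since the starting point on a circular repeat is arbitrary. A realistic pipeline has to cope with sequencing errors, monomer variants and mixed repeat families, none of which this sketch attempts.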
Affiliation(s)
- Petr Novák
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Laura Ávila Robledillo
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Andrea Koblížková
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Iva Vrbová
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Pavel Neumann
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
- Jiří Macas
  - Institute of Plant Molecular Biology, Biology Centre CAS, České Budějovice CZ-37005, Czech Republic
13
Evolutionary dynamics of two satellite DNA families in rock lizards of the genus Iberolacerta (Squamata, Lacertidae): different histories but common traits. Chromosome Res 2016; 23:441-61. [PMID: 26384818 DOI: 10.1007/s10577-015-9489-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 11/27/2022]
Abstract
Satellite DNAs compose a large portion of all higher eukaryotic genomes. The turnover of these highly repetitive sequences is an important element in genome organization and evolution. However, information about the structure and dynamics of reptilian satellite DNA is still scarce. Two satellite DNA families, HindIII and TaqI, have been previously characterized in four species of the genus Iberolacerta. These families showed different chromosomal locations, abundances, and evolutionary rates. Here, we extend the study of both satellite DNAs (satDNAs) to the remaining Iberolacerta species, with the aim to investigate the patterns of variability and factors influencing the evolution of these repetitive sequences. Our results revealed disparate patterns but also common traits in the evolutionary histories of these satellite families: (i) each satellite DNA is made up of a library of monomer variants or subfamilies shared by related species; (ii) species-specific profiles of satellite repeats are shaped by expansions and/or contractions of different variants from the library; (iii) different turnover rates, even among closely related species, result in great differences in overall sequence homogeneity and in concerted or non-concerted evolution patterns, which may not reflect the phylogenetic relationships among taxa. Contrasting turnover rates are possibly related to genomic constraints such as karyotype architecture and the interspersed organization of diverging repeat variants in satellite arrays. Moreover, rapid changes in copy number, especially in the centromeric HindIII satDNA, may have been associated with chromosomal rearrangements and even contributed to speciation within Iberolacerta.
14
Pavlek M, Gelfand Y, Plohl M, Meštrović N. Genome-wide analysis of tandem repeats in Tribolium castaneum genome reveals abundant and highly dynamic tandem repeat families with satellite DNA features in euchromatic chromosomal arms. DNA Res 2015; 22:387-401. [PMID: 26428853 PMCID: PMC4675708 DOI: 10.1093/dnares/dsv021] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 06/30/2015] [Accepted: 08/26/2015] [Indexed: 12/31/2022]
Abstract
Although satellite DNAs are well-explored components of heterochromatin and centromeres, little is known about emergence, dispersal and possible impact of comparably structured tandem repeats (TRs) on the genome-wide scale. Our bioinformatics analysis of assembled Tribolium castaneum genome disclosed significant contribution of TRs in euchromatic chromosomal arms and clear predominance of satellite DNA-typical 170 bp monomers in arrays of ≥5 repeats. By applying different experimental approaches, we revealed that the nine most prominent TR families Cast1-Cast9 extracted from the assembly comprise ∼4.3% of the entire genome and reside almost exclusively in euchromatic regions. Among them, seven families that build ∼3.9% of the genome are based on ∼170 and ∼340 bp long monomers. Results of phylogenetic analyses of 2500 monomers originating from these families show high-sequence dynamics, evident by extensive exchanges between arrays on non-homologous chromosomes. In addition, our analysis shows that concerted evolution acts more efficiently on longer than on shorter arrays. Efficient genome-wide distribution of nine TR families implies the role of transposition only in expansion of the most dispersed family, and involvement of other mechanisms is anticipated. Despite similarities in sequence features, FISH experiments indicate high-level compartmentalization of centromeric and euchromatic tandem repeats.
Affiliation(s)
- Martina Pavlek
- Ruđer Bošković Institute, Bijenička 54, Zagreb HR-10002, Croatia
- Yevgeniy Gelfand
- Laboratory for Biocomputing and Informatics, Boston University, Boston, MA 02215, USA
- Miroslav Plohl
- Ruđer Bošković Institute, Bijenička 54, Zagreb HR-10002, Croatia
15
Garrido-Ramos MA. Satellite DNA in Plants: More than Just Rubbish. Cytogenet Genome Res 2015; 146:153-170. [PMID: 26202574 DOI: 10.1159/000437008] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Accepted: 05/20/2015] [Indexed: 11/19/2022]
Abstract
For decades, satellite DNAs have been the hidden part of genomes. Initially considered as junk DNA, there is currently an increasing appreciation of the functional significance of satellite DNA repeats and of their sequences. Satellite DNA families accumulate in the heterochromatin in different parts of the eukaryotic chromosomes, mainly in pericentromeric and subtelomeric regions, but they also span the functional centromere. Tandem repeat sequences may spread from subtelomeric to interstitial loci, leading to the formation of chromosome-specific loci or to the accumulation in equilocal sites in different chromosomes. They also appear as the main components of the heterochromatin in the sex-specific region of sex chromosomes. Satellite DNA, required for chromosome organization, also plays a role in pairing and segregation. Some satellite repeats are transcribed and can participate in the formation and maintenance of heterochromatin structure and in the modulation of gene expression. In addition to the identification of the different satellite DNA families, their characteristics and location, we are interested in determining their impact on the genomes, by identifying the mechanisms leading to their appearance and amplification as well as in understanding how they change over time, the factors affecting these changes, and the influence exerted by the evolutionary history of the organisms. On the other hand, satellite DNA sequences are rapidly evolving sequences that may cause reproductive barriers between organisms and promote speciation. The accumulation of experimental data collected in recent years and the emergence of new approaches based on next-generation sequencing and high-throughput genome analysis are opening new perspectives that are changing our understanding of satellite DNA. 
This review examines recent data to provide a timely update on the overall information gathered about this part of the genome, focusing on the advances in the knowledge of its origin, its evolution, and its potential functional roles.
16
Čížková J, Hřibová E, Humplíková L, Christelová P, Suchánková P, Doležel J. Molecular analysis and genomic organization of major DNA satellites in banana (Musa spp.). PLoS One 2013; 8:e54808. [PMID: 23372772 PMCID: PMC3553004 DOI: 10.1371/journal.pone.0054808] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 07/20/2012] [Accepted: 12/17/2012] [Indexed: 02/03/2023]
Abstract
Satellite DNA sequences consist of tandemly arranged repetitive units up to thousands of nucleotides long in head-to-tail orientation. The evolutionary processes by which satellites arise and evolve include unequal crossing over, gene conversion, transposition and extrachromosomal circular DNA formation. Large blocks of satellite DNA are often observed in heterochromatic regions of chromosomes and are a typical component of centromeric and telomeric regions. Satellite-rich loci may show specific banding patterns and facilitate chromosome identification and analysis of structural chromosome changes. Unlike many other genomes, nuclear genomes of banana (Musa spp.) are poor in satellite DNA and the information on this class of DNA remains limited. The banana cultivars are seed sterile clones originating mostly from natural intra-specific crosses within M. acuminata (A genome) and inter-specific crosses between M. acuminata and M. balbisiana (B genome). Previous studies revealed the closely related nature of the A and B genomes, including similarities in repetitive DNA. In this study we focused on two main banana DNA satellites, which were previously identified in silico. Their genomic organization and molecular diversity were analyzed in a set of nineteen Musa accessions, including representatives of A, B and S (M. schizocarpa) genomes and their inter-specific hybrids. The two DNA satellites showed a high level of sequence conservation within, and a high homology between Musa species. FISH with probes for the satellite DNA sequences, rRNA genes and a single-copy BAC clone 2G17 resulted in characteristic chromosome banding patterns in M. acuminata and M. balbisiana which may aid in determining genomic constitution in interspecific hybrids. In addition to improving the knowledge on Musa satellite DNA, our study increases the number of cytogenetic markers and the number of individual chromosomes that can be identified in Musa.
Affiliation(s)
- Jana Čížková
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Eva Hřibová
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Lenka Humplíková
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Pavla Christelová
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Pavla Suchánková
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
- Jaroslav Doležel
- Institute of Experimental Botany, Centre of the Region Haná for Biotechnological and Agricultural Research, Olomouc, Czech Republic
17
Rosato M, Galián JA, Rosselló JA. Amplification, contraction and genomic spread of a satellite DNA family (E180) in Medicago (Fabaceae) and allied genera. ANNALS OF BOTANY 2012; 109:773-82. [PMID: 22186276 PMCID: PMC3286279 DOI: 10.1093/aob/mcr309] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 05/10/2023]
Abstract
BACKGROUND AND AIMS Satellite DNA is a genomic component present in virtually all eukaryotic organisms. The turnover of highly repetitive satellite DNA is an important element in genome organization and evolution in plants. Here we assess the presence and physical distribution of the repetitive DNA E180 family in Medicago and allied genera. Our goals were to gain insight into the karyotype evolution of Medicago using satellite DNA markers, and to evaluate the taxonomic and phylogenetic signal of a satellite DNA family in a genus hypothesized to have a complex evolutionary history. METHODS Seventy accessions from Medicago, Trigonella, Melilotus and Trifolium were analysed by PCR to assess the presence of the repetitive E180 family, and fluorescence in situ hybridization (FISH) was used for physical mapping in somatic chromosomes. KEY RESULTS The E180 repeat unit was PCR-amplified in 37 of 40 taxa in Medicago, eight of 12 species of Trigonella, six of seven species of Melilotus and in two of 11 Trifolium species. Examination of the mitotic chromosomes revealed that only 13 Medicago and two Trigonella species showed FISH signals using the E180 probe. Stronger hybridization signals were observed in subtelomeric and interstitial loci than in the pericentromeric loci, suggesting this satellite family has a preferential genomic location. Not all 13 Medicago species that showed FISH localization of the E180 repeat were phylogenetically related. However, nine of these species belong to the phylogenetically derived clade including the M. sativa and M. arborea complexes. CONCLUSIONS The use of the E180 family as a phylogenetic marker in Medicago should be viewed with caution. Its amplification appears to have been produced through recurrent and independent evolutionary episodes in both annual and perennial Medicago species as well as in basal and derived clades.
Affiliation(s)
- Marcela Rosato
- Jardín Botánico, Universidad de Valencia, c/Quart 80, E-46008, Valencia, Spain
- José A. Galián
- Jardín Botánico, Universidad de Valencia, c/Quart 80, E-46008, Valencia, Spain
- Josep A. Rosselló
- Jardín Botánico, Universidad de Valencia, c/Quart 80, E-46008, Valencia, Spain
- Marimurtra Bot. Garden, Carl Faust Fdn., PO Box 112, E-17300 Blanes, Catalonia, Spain
18
Zhao X, Lu J, Zhang Z, Hu J, Huang S, Jin W. Comparison of the distribution of the repetitive DNA sequences in three variants of Cucumis sativus reveals their phylogenetic relationships. J Genet Genomics 2011; 38:39-45. [PMID: 21338951 DOI: 10.1016/j.jcg.2010.12.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 12/01/2010] [Revised: 12/20/2010] [Accepted: 12/24/2010] [Indexed: 01/08/2023]
Abstract
Repetitive DNA sequences with variability in copy number or/and sequence polymorphism can be employed as useful molecular markers to study phylogenetics and identify species/chromosomes when combined with fluorescence in situ hybridization (FISH). Cucumis sativus has three variants, Cucumis sativus L. var. sativus, Cucumis sativus L. var. hardwickii and Cucumis sativus L. var. xishuangbannesis. The phylogenetics among these three variants has not been well explored using cytological landmarks. Here, we concentrate on the organization and distribution of highly repetitive DNA sequences in cucumbers, with emphasis on the differences between cultivar and wild cucumber. The diversity of chromosomal karyotypes in cucumber and its relatives was detected in our study. Thereby, sequential FISH with three sets of multi-probe cocktails (combined repetitive DNA with chromosome-specific fosmid clones as probes) were conducted on the same metaphase cell, which helped us to simultaneously identify each of the 7 metaphase chromosomes of wild cucumber C. sativus var. hardwickii. A standardized karyotype of somatic metaphase chromosomes was constructed. Our data also indicated that the relationship between cultivar cucumber and C. s. var. xishuangbannesis was closer than that of C. s. var. xishuangbannesis and C. s. var. hardwickii.
Affiliation(s)
- Xin Zhao
- National Maize Improvement Center of China, Key Laboratory of Crop Genetic Improvement and Genome of Ministry of Agriculture, Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing 100193, China
19
Organization and evolution of subtelomeric satellite repeats in the potato genome. G3-GENES GENOMES GENETICS 2011; 1:85-92. [PMID: 22384321 PMCID: PMC3276127 DOI: 10.1534/g3.111.000125] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Received: 03/15/2011] [Accepted: 05/03/2011] [Indexed: 12/30/2022]
Abstract
Subtelomeric domains immediately adjacent to telomeres represent one of the most dynamic and rapidly evolving regions in eukaryotic genomes. A common feature associated with subtelomeric regions in different eukaryotes is the presence of long arrays of tandemly repeated satellite sequences. However, studies on molecular organization and evolution of subtelomeric repeats are rare. We isolated two subtelomeric repeats, CL14 and CL34, from potato (Solanum tuberosum). The CL14 and CL34 repeats are organized as independent long arrays, up to 1-3 Mb, of 182 bp and 339 bp monomers, respectively. The CL14 and CL34 repeat arrays are directly connected with the telomeric repeats at some chromosomal ends. The CL14 repeat was detected at the subtelomeric regions among highly diverged Solanum species, including tomato (Solanum lycopersicum). In contrast, CL34 was only found in potato and its closely related species. Interestingly, the CL34 repeat array was always proximal to the telomeres when both CL14 and CL34 were found at the same chromosomal end. In addition, the CL34 repeat family showed more sequence variability among monomers compared with the CL14 repeat family. We conclude that the CL34 repeat family emerged recently from the subtelomeric regions of potato chromosomes and is rapidly evolving. These results provide further evidence that subtelomeric domains are among the most dynamic regions in eukaryotic genomes.
20
Kuhn GCS, Schwarzacher T, Heslop-Harrison JS. The non-regular orbit: three satellite DNAs in Drosophila martensis (buzzatii complex, repleta group) followed three different evolutionary pathways. Mol Genet Genomics 2010; 284:251-62. [PMID: 20683615 DOI: 10.1007/s00438-010-0564-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/12/2010] [Accepted: 07/20/2010] [Indexed: 11/29/2022]
Abstract
The genomes of species from the buzzatii cluster (buzzatii complex, repleta group) host a number of satellite DNAs (satDNAs) showing contrasting structural characteristics, genomic organization and evolution, such as pBuM-alpha (~190 bp repeats), pBuM-alpha/beta (~370 bp repeats) and the DBC-150 (~150 bp repeats). In the present study, we aimed to investigate the evolution of these three satDNAs by looking for homologous sequences in the genome of the closest outgroup species: Drosophila martensis (buzzatii complex). After PCR, we isolated and sequenced 9 alpha, 8 alpha/beta and 11 DBC-150 sequences from this species. The results were compared to all pBuM and DBC-150 sequences available in literature. After D. martensis split from the buzzatii cluster some 6 Mya, the three satDNAs evolved differently in the genome of D. martensis by: (1) maintenance of a collection of major types of ancestral repeats in the genome (alpha); (2) fixation for a single major type of ancestral repeats (alpha/beta) or (3) fixation for new divergent species-specific repeat types (DBC-150). Curiously, D. seriema and D. martensis, although belonging to different and allopatric clusters, became independently fixed for the same major type of alpha/beta ancestral repeats, illustrating a rare case of parallelism in satDNA evolution. The contrasting pictures illustrate the diversity of evolutionary pathways a satDNA can follow, defining a "non-regular orbit" with outcomes difficult to predict.
Affiliation(s)
- Gustavo C S Kuhn
- Departamento de Genética e Evolução, Universidade Federal de São Carlos, Via Washington Luís, Km 235, São Carlos, SP 13565-905, Brazil.
21
Macas J, Neumann P, Novák P, Jiang J. Global sequence characterization of rice centromeric satellite based on oligomer frequency analysis in large-scale sequencing data. Bioinformatics 2010; 26:2101-8. [PMID: 20616383 DOI: 10.1093/bioinformatics/btq343] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 01/22/2023]
Abstract
MOTIVATION Satellite DNA makes up a significant portion of many eukaryotic genomes, yet it is relatively poorly characterized even in extensively sequenced species. This is, in part, due to methodological limitations of traditional methods of satellite repeat analysis, which are based on multiple alignments of monomer sequences. Therefore, we employed an alternative, alignment-free, approach utilizing k-mer frequency statistics, which is in principle more suitable for analyzing large sets of satellite repeat data, including sequence reads from next generation sequencing technologies. RESULTS k-mer frequency spectra were determined for two sets of rice centromeric satellite CentO sequences, including 454 reads from ChIP-sequencing of CENH3-bound DNA (7.6 Mb) and the whole genome Sanger sequencing reads (5.8 Mb). k-mer frequencies were used to identify the most conserved sequence regions and to reconstruct consensus sequences of complete monomers. Reconstructed consensus sequences as well as the assessment of overall divergence of k-mer spectra revealed high similarity of the two datasets, suggesting that CentO sequences associated with functional centromeres (CENH3-bound) do not significantly differ from the total population of CentO, which includes both centromeric and pericentromeric repeat arrays. On the other hand, considerable differences were revealed when these methods were used for comparison of CentO populations between individual chromosomes of the rice genome assembly, demonstrating preferential sequence homogenization of the clusters within the same chromosome. k-mer frequencies were also successfully used to identify and characterize smRNAs derived from CentO repeats.
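As a rough illustration of the alignment-free idea summarized in this abstract, the sketch below counts overlapping k-mers across a set of reads and converts the counts to relative frequencies. It is not the authors' pipeline: the function names, the toy reads, and the choice of k = 4 are assumptions made purely for illustration, and only the forward strand is counted.

```python
from collections import Counter


def kmer_spectrum(reads, k):
    """Count every overlapping k-mer across all reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw k-mer counts into fractions of all counted k-mers."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


# Hypothetical toy reads, not real CentO sequences.
toy_reads = ["ACGTACGTAC", "CGTACGTNAC"]
spectrum = kmer_spectrum(toy_reads, k=4)
frequencies = relative_frequencies(spectrum)

# List k-mers from most to least frequent.
for kmer, n in spectrum.most_common():
    print(kmer, n, round(frequencies[kmer], 3))
```

Two such spectra, for example one per read set, could then be compared k-mer by k-mer. The abstract does not say which divergence measure was used, so none is assumed here.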
Affiliation(s)
- Jirí Macas
- Institute of Plant Molecular Biology, Biology Centre ASCR, Branisovska 31, CZ-37005, Ceske Budejovice, Czech Republic.
22
Koukalova B, Moraes AP, Renny-Byfield S, Matyasek R, Leitch AR, Kovarik A. Fall and rise of satellite repeats in allopolyploids of Nicotiana over c. 5 million years. THE NEW PHYTOLOGIST 2010; 186:148-60. [PMID: 19968801 DOI: 10.1111/j.1469-8137.2009.03101.x] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Indexed: 05/10/2023]
Abstract
Allopolyploids represent natural experiments in which DNA sequences from different species are combined into a single nucleus and then coevolve, enabling us to follow the parental genomes, their interactions and evolution over time. Here, we examine the fate of satellite DNA over 5 million yr of divergence in plant genus Nicotiana (family Solanaceae). We isolated subtelomeric, tandemly repeated satellite DNA from Nicotiana diploid and allopolyploid species and analysed patterns of inheritance and divergence by sequence analysis, Southern blot hybridization and fluorescent in situ hybridization (FISH). We observed that parental satellite sequences redistribute around the genome in allopolyploids of Nicotiana section Polydicliae, formed c. 1 million yr ago (Mya), and that new satellite repeats evolved and amplified in section Repandae, which was formed c. 5 Mya. In some cases that process involved the complete replacement of parental satellite sequences. The rate of satellite repeat replacement is faster than theoretical predictions assuming the mechanism involved is unequal recombination and crossing-over. Instead we propose that this mechanism occurs with the deletion of large chromatin blocks and reamplification, perhaps via rolling circle replication.
Affiliation(s)
- Blazena Koukalova
- Institute of Biophysics, Academy of Sciences of the Czech Republic, CZ-612 65 Brno, Czech Republic
23
Macas J, Koblízková A, Navrátilová A, Neumann P. Hypervariable 3' UTR region of plant LTR-retrotransposons as a source of novel satellite repeats. Gene 2009; 448:198-206. [PMID: 19563868 DOI: 10.1016/j.gene.2009.06.014] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Received: 04/30/2009] [Revised: 06/17/2009] [Accepted: 06/19/2009] [Indexed: 11/15/2022]
Abstract
The repetitive sequence PisTR-A has an unusual organization in the pea (Pisum sativum) genome, being present both as short dispersed repeats as well as long arrays of tandemly arranged satellite DNA. Cloning, sequencing and FISH analysis of both PisTR-A variants revealed that the former occurs in the genome embedded within the sequence of Ty3/gypsy-like Ogre elements, whereas the latter forms homogenized arrays of satellite repeats at several genomic loci. The Ogre elements carry the PisTR-A sequences in their 3' untranslated region (UTR) separating the gag-pol region from the 3' LTR. This region was found to be highly variable among pea Ogre elements, and includes a number of other tandem repeats along with or instead of PisTR-A. Bioinformatic analysis of LTR-retrotransposons mined from available plant genomic sequence data revealed that the frequent occurrence of variable tandem repeats within 3' UTRs is a typical feature of the Tat lineage of plant retrotransposons. Comparison of these repeats to known plant satellite sequences uncovered two other instances of satellites with sequence similarity to Tat-like retrotransposon 3' UTR regions. These observations suggest that some retrotransposons may significantly contribute to satellite DNA evolution by generating a library of short repeat arrays that can subsequently be dispersed through the genome and eventually further amplified and homogenized into novel satellite repeats.
Affiliation(s)
- Jirí Macas
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branisovská 31, Ceské Budejovice, CZ-37005, Czech Republic.
24
Plohl M, Petrović V, Luchetti A, Ricci A, Satović E, Passamonti M, Mantovani B. Long-term conservation vs high sequence divergence: the case of an extraordinarily old satellite DNA in bivalve mollusks. Heredity (Edinb) 2009; 104:543-51. [PMID: 19844270 DOI: 10.1038/hdy.2009.141] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 12/27/2022]
Abstract
The ubiquity of satellite DNA (satDNA) sequences has raised much controversy over the abundance of divergent monomer variants and the long-time nucleotide sequence stability observed for many satDNA families. In this work, we describe the satDNA BIV160, characterized in nine species of the three main bivalve clades (Protobranchia, Pteriomorphia and Heteroconchia). BIV160 monomers are similar in repeat size and nucleotide sequence to satDNAs described earlier in oysters and in the clam Donax trunculus. The broad distribution of BIV160 satDNA indicates that similar variants existed in the ancestral bivalve species that lived about 540 million years ago; this makes BIV160 the most ancient satDNA described so far. In the species examined, monomer variants are distributed in quite a complex pattern. This pattern includes (i) species characterized by a specific group of variants, (ii) species that share distinct group(s) of variants and (iii) species with both specific and shared types. The evolutionary scenario suggested by these data reconciles sequence uniformity in homogenization-maintained satDNA arrays with the genomic richness of divergent monomer variants formed by diversification of the same ancestral satDNA sequence. Diversified repeats can continue to evolve in a non-concerted manner and behave as independent amplification-contraction units in the framework of a 'library of satDNA variants' representing a permanent source of monomers that can be amplified into novel homogeneous satDNA arrays. On the whole, diversification of satDNA monomers and copy number fluctuations provide a highly dynamic genomic environment able to form and displace satDNA sequence variants rapidly in evolution.
Affiliation(s)
- M Plohl
- Department of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia.
25
Navrátilová A, Koblížková A, Macas J. Survey of extrachromosomal circular DNA derived from plant satellite repeats. BMC PLANT BIOLOGY 2008; 8:90. [PMID: 18721471 PMCID: PMC2543021 DOI: 10.1186/1471-2229-8-90] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.3] [Received: 04/29/2008] [Accepted: 08/22/2008] [Indexed: 05/19/2023]
Abstract
BACKGROUND Satellite repeats represent one of the most dynamic components of higher plant genomes, undergoing rapid evolutionary changes of their nucleotide sequences and abundance in a genome. However, the exact molecular mechanisms driving these changes and their eventual regulation are mostly unknown. It has been proposed that amplification and homogenization of satellite DNA could be facilitated by extrachromosomal circular DNA (eccDNA) molecules originated by recombination-based excision from satellite repeat arrays. While the models including eccDNA are attractive for their potential to explain rapid turnover of satellite DNA, the existence of satellite repeat-derived eccDNA has not yet been systematically studied in a wider range of plant genomes. RESULTS We performed a survey of eccDNA corresponding to nine different families and three subfamilies of satellite repeats in ten species from various genera of higher plants (Arabidopsis, Oryza, Pisum, Secale, Triticum and Vicia). The repeats selected for this study differed in their monomer length, abundance, and chromosomal localization in individual species. Using two-dimensional agarose gel electrophoresis followed by Southern blotting, eccDNA molecules corresponding to all examined satellites were detected. EccDNA occurred in the form of nicked circles ranging from hundreds to over eight thousand nucleotides in size. Within this range the circular molecules occurred preferentially in discrete size intervals corresponding to multiples of monomer or higher-order repeat lengths. CONCLUSION This work demonstrated that satellite repeat-derived eccDNA is common in plant genomes and thus it can be seriously considered as a potential intermediate in processes driving satellite repeat evolution. The observed size distribution of circular molecules suggests that they are most likely generated by molecular mechanisms based on homologous recombination requiring long stretches of sequence similarity.
Affiliation(s)
- Alice Navrátilová
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Andrea Koblížková
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Jiří Macas
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
26
Sequence analysis, chromosomal distribution and long-range organization show that rapid turnover of new and old pBuM satellite DNA repeats leads to different patterns of variation in seven species of the Drosophila buzzatii cluster. Chromosome Res 2008; 16:307-24. [PMID: 18266060 DOI: 10.1007/s10577-007-1195-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 09/25/2007] [Revised: 12/07/2007] [Accepted: 12/07/2007] [Indexed: 10/22/2022]
Abstract
We aimed to study patterns of variation and factors influencing the evolutionary dynamics of a satellite DNA, pBuM, in all seven Drosophila species from the buzzatii cluster (repleta group). We analyzed 117 alpha pBuM-1 (monomer length 190 bp) and 119 composite alpha/beta (370 bp) pBuM-2 repeats and determined the chromosome location and long-range organization on DNA fibers of major sequence variants. Such combined methodologies in the study of satDNAs have been used in very few organisms. In most species, concerted evolution is linked to high copy number of pBuM repeats. Species presenting low-abundance, scattered pBuM repeats did not undergo concerted evolution and maintained part of the ancestral inter-repeat variability. The alpha and alpha/beta repeats colocalized in heterochromatic regions and were distributed on multiple chromosomes, with notable differences between species. High-resolution FISH revealed array sizes of a few kilobases to over 0.7 Mb and mutual arrangements of alpha and alpha/beta repeats along the same DNA fibers, but with considerable changes in the amount of each variant across species. From sequence, chromosomal and phylogenetic data, we could infer that homogenization and amplification events involved both new and ancestral pBuM variants. Altogether, the data on the structure and organization of the pBuM satDNA give insights into genome evolution including mechanisms that contribute to concerted evolution and diversification.
27
Macas J, Neumann P, Navrátilová A. Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula. BMC Genomics 2007; 8:427. [PMID: 18031571 PMCID: PMC2206039 DOI: 10.1186/1471-2164-8-427] [Citation(s) in RCA: 221] [Impact Index Per Article: 13.0] [Received: 08/13/2007] [Accepted: 11/21/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum).
RESULTS: Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula.
CONCLUSION: We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35-48% of the genome. These data provide a starting point for further investigations of legume plant genomes based on their global comparative analysis and for the development of more sophisticated approaches for data mining.
Affiliation(s)
- Jiří Macas
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Pavel Neumann
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Alice Navrátilová
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
|
28
|
Macas J, Neumann P, Navrátilová A. Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula. BMC Genomics 2007. [PMID: 18031571 DOI: 10.1186/1471-2164-8-427] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Affiliation(s)
- Jiří Macas
- Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
|
29
|
Lee HR, Neumann P, Macas J, Jiang J. Transcription and Evolutionary Dynamics of the Centromeric Satellite Repeat CentO in Rice. Mol Biol Evol 2006; 23:2505-20. [PMID: 16987952 DOI: 10.1093/molbev/msl127] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Indexed: 12/27/2022]
Abstract
Satellite DNA is a major component of centromeric heterochromatin in most multicellular eukaryotes, where it is typically organized into megabase-sized tandem arrays. It has recently been demonstrated that small interfering RNAs (siRNAs) processed from centromeric satellite repeats can be involved in epigenetic chromatin modifications which appear to underpin centromere function. However, the structural organization and evolution of the centromeric satellite DNA is still poorly understood. We analyzed the centromeric satellite repeat arrays from rice chromosomes 1 and 8 and identified higher order structures and local homogenization of the CentO repeats in these 2 centromeres. We also cloned the CentO repeats from the CENH3-associated nucleosomes by a chromatin immunoprecipitation (ChIP)-based method. Sequence variability analysis of the ChIPed CentO repeats revealed a single variable domain within the repeat. We detected transcripts derived from both strands of the CentO repeats. The CentO transcripts are processed into siRNA, suggesting a potential role of this satellite repeat family in epigenetic chromatin modification.
Affiliation(s)
- Hye-Ran Lee
- Department of Horticulture, University of Wisconsin-Madison, Madison, WI, USA
|