1
Scalabrin S, Magris G, Liva M, Vitulo N, Vidotto M, Scaglione D, Del Terra L, Ruosi MR, Navarini L, Pellegrino G, Berny Mier Y Teran JC, Toniutti L, Suggi Liverani F, Cerutti M, Di Gaspero G, Morgante M. A chromosome-scale assembly reveals chromosomal aberrations and exchanges generating genetic diversity in Coffea arabica germplasm. Nat Commun 2024; 15:463. [PMID: 38263403 PMCID: PMC10805892 DOI: 10.1038/s41467-023-44449-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/16/2023] [Accepted: 12/13/2023] [Indexed: 01/25/2024] Open
Abstract
In order to better understand the mechanisms generating genetic diversity in the recent allotetraploid species Coffea arabica, here we present a chromosome-level assembly obtained with long read technology. Two genomic compartments with different structural and functional properties are identified in the two homoeologous genomes. The resequencing data from a large set of accessions reveals low intraspecific diversity in the center of origin of the species. Across a limited number of genomic regions, diversity increases in some cultivated genotypes to levels similar to those observed within one of the progenitor species, Coffea canephora, presumably as a consequence of introgressions deriving from the so-called Timor hybrid. It also reveals that, in addition to few, early-occurring exchanges between homoeologous chromosomes, there are numerous recent chromosomal aberrations including aneuploidies, deletions, duplications and exchanges. These events are still polymorphic in the germplasm and could represent a fundamental source of genetic variation in such a lowly variable species.
Affiliation(s)
- Gabriele Magris
  - Istituto di Genomica Applicata, 33100, Udine, Italy
  - Department of Agricultural, Food, Environmental and Animal Sciences, University of Udine, 33100, Udine, Italy
- Mario Liva
  - IGA Technology Services, 33100, Udine, Italy
  - Istituto di Genomica Applicata, 33100, Udine, Italy
  - Department of Agricultural, Food, Environmental and Animal Sciences, University of Udine, 33100, Udine, Italy
- Nicola Vitulo
  - Department of Biotechnology, University of Verona, 37134, Verona, Italy
- Lucile Toniutti
  - World Coffee Research, Portland, 97225, OR, USA
  - CIRAD, UMR AGAP Institut, 97130, Capesterre-Belle-Eau, Guadeloupe, France
  - UMR AGAP Institut, University of Montpellier, CIRAD, INRAE, Institut Agro, 34060, Montpellier, France
- Michele Morgante
  - Istituto di Genomica Applicata, 33100, Udine, Italy.
  - Department of Agricultural, Food, Environmental and Animal Sciences, University of Udine, 33100, Udine, Italy.
2
Wang M, Meng G, Yang Y, Wang X, Xie R, Dong C. Telomere-to-Telomere Genome Assembly of Tibetan Medicinal Mushroom Ganoderma leucocontextum and the First Copia Centromeric Retrotransposon in Macro-Fungi Genome. J Fungi (Basel) 2023; 10:15. [PMID: 38248925 PMCID: PMC10817607 DOI: 10.3390/jof10010015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Revised: 12/18/2023] [Accepted: 12/25/2023] [Indexed: 01/23/2024] Open
Abstract
A complete telomere-to-telomere (T2T) genome has been a longstanding goal in the field of genomic research. By integrating high-coverage and precise long-read sequencing data using multiple assembly strategies, we present here the first T2T gap-free genome assembly of Ganoderma leucocontextum strain GL72, a Tibetan medicinal mushroom. The T2T genome, with a size of 46.69 Mb, consists of 13 complete nuclear chromosomes, and typical telomeric repeats (CCCTAA)n were detected at both ends of all 13 chromosomes. The high mapping rate, uniform genome coverage, a BUSCO completeness of 99.7%, and base accuracy exceeding 99.999% indicate that this assembly represents the highest level of completeness and quality. Regions characterized by distinct structural attributes, including highest Hi-C interaction intensity, high repeat content, decreased gene density, low GC content, and minimal or no transcription levels across all chromosomes may represent potential centromeres. Sequence analysis revealed the first Copia centromeric retrotransposon in a macro-fungal genome. Phylogenomic analysis identified that G. leucocontextum and G. tsugae diverged from the other Ganoderma species approximately 9.8-17.9 MYA. The prediction of secondary metabolic clusters confirmed the capability of this fungus to produce a substantial quantity of metabolites. This T2T gap-free genome will contribute to the elucidation of genomic 'dark matter' and serve as a great reference for genetics, genomics, and evolutionary studies of G. leucocontextum.
Affiliation(s)
- Miao Wang
  - State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Guoliang Meng
  - State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Ying Yang
  - State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaofang Wang
  - State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- Rong Xie
  - Institute of Vegetable Sciences, Tibet Academy of Agricultural and Animal Husbandry Sciences, Lhasa 850000, China
- Caihong Dong
  - State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
3
Ding W, Zhu Y, Han J, Zhang H, Xu Z, Khurshid H, Liu F, Hasterok R, Shen X, Wang K. Characterization of centromeric DNA of Gossypium anomalum reveals sequence-independent enrichment dynamics of centromeric repeats. Chromosome Res 2023; 31:12. [PMID: 36971835 DOI: 10.1007/s10577-023-09721-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 02/20/2023] [Accepted: 03/04/2023] [Indexed: 03/29/2023]
Abstract
Centromeres in eukaryotes are composed of highly repetitive DNAs, which evolve rapidly and are thought to achieve a favorable structure in mature centromeres. However, how the centromeric repeat evolves into an adaptive structure is largely unknown. We characterized the centromeric sequences of Gossypium anomalum through chromatin immunoprecipitation with antibodies against CENH3. We revealed that the G. anomalum centromeres contained only retrotransposon-like repeats but were depleted in long arrays of satellites. These retrotransposon-like centromeric repeats were present in the African-Asian and Australian lineage species, suggesting that they might have arisen in the common ancestor of these diploid species. Intriguingly, we observed a substantial increase and decrease in copy numbers among African-Asian and Australian lineages, respectively, for the retrotransposon-derived centromeric repeats without apparent structure or sequence variation in cotton. This result indicates that the sequence content is not a decisive aspect of the adaptive evolution of centromeric repeats or at least retrotransposon-like centromeric repeats. In addition, two active genes with potential roles in gametogenesis or flowering were identified in CENH3 nucleosome-binding regions. Our results provide new insights into the constitution of centromeric repetitive DNA and the adaptive evolution of centromeric repeats in plants.
Affiliation(s)
- Wenjie Ding
  - School of Life Sciences, Nantong University, Nantong, 226019, China
- Yuanbin Zhu
  - School of Life Sciences, Nantong University, Nantong, 226019, China
  - College of Agriculture, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Jinlei Han
  - School of Life Sciences, Nantong University, Nantong, 226019, China
- Hui Zhang
  - School of Life Sciences, Nantong University, Nantong, 226019, China
- Zhenzhen Xu
  - Key Laboratory of Cotton and Rapeseed (Nanjing), Ministry of Agriculture and Rural Affairs, the Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Haris Khurshid
  - Oilseeds Research Program, National Agricultural Research Centre, Islamabad, 44500, Pakistan
- Fang Liu
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang, 455000, China
- Robert Hasterok
  - Plant Cytogenetics and Molecular Biology Group, Institute of Biology, Biotechnology and Environmental Protection, Faculty of Natural Sciences, University of Silesia in Katowice, Katowice, 40-032, Poland.
- Xinlian Shen
  - Key Laboratory of Cotton and Rapeseed (Nanjing), Ministry of Agriculture and Rural Affairs, the Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China.
- Kai Wang
  - School of Life Sciences, Nantong University, Nantong, 226019, China.
4
Orozco-Arias S, Gaviria-Orrego S, Tabares-Soto R, Isaza G, Guyot R. InpactorDB: A Plant LTR Retrotransposon Reference Library. Methods Mol Biol 2023; 2703:31-44. [PMID: 37646935 DOI: 10.1007/978-1-0716-3389-2_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
LTR retrotransposons (LTR-RTs) are major components of plant genomes. These transposable elements participate in the structure and evolution of genes and genomes through their mobility and their copy number amplification. For example, they are commonly used as evolutionary markers in genetic, genomic, and cytogenetic approaches. However, the plant research community faces a near absence of freely available, full-length, curated, and lineage-level classified LTR retrotransposon reference sequences. In this chapter, we will introduce InpactorDB, an LTR retrotransposon sequence database of 181 plant species representing 98 plant families for a total of 67,241 non-redundant elements. We will introduce how to use newly sequenced genomes to identify and classify LTR-RTs in a similar way with a standardized procedure using the Inpactor tool. InpactorDB is freely available at https://inpactordb.github.io.
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
- Simon Gaviria-Orrego
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Reinel Tabares-Soto
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
- Gustavo Isaza
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
- Romain Guyot
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia.
  - Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France.
5
Kirov I, Merkulov P, Polkhovskaya E, Konstantinov Z, Kazancev M, Saenko K, Polkhovskiy A, Dudnikov M, Garibyan T, Demurin Y, Soloviev A. Epigenetic Stress and Long-Read cDNA Sequencing of Sunflower (Helianthus annuus L.) Revealed the Origin of the Plant Retrotranscriptome. Plants (Basel) 2022; 11:3579. [PMID: 36559691 PMCID: PMC9784723 DOI: 10.3390/plants11243579] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/17/2022] [Revised: 12/13/2022] [Accepted: 12/13/2022] [Indexed: 06/12/2023]
Abstract
Transposable elements (TEs) contribute not only to genome diversity but also to transcriptome diversity in plants. To unravel the sources of LTR retrotransposon (RTE) transcripts in sunflower, we exploited a recently developed transposon activation method ('TEgenesis') along with long-read cDNA Nanopore sequencing. This approach allowed the identification of 56 RTE transcripts from different genomic loci, including full-length and non-autonomous RTEs. Using mobilome analysis, we provided a new set of expressed and transpositionally active sunflower RTEs for future studies. Among them, a Ty3/Gypsy RTE called SUNTY3 exhibited ongoing transposition activity, as detected by eccDNA analysis. We showed that the sunflower genome contains a diverse set of non-autonomous RTEs encoding a single RTE protein, including the previously described TR-GAG (terminal repeat with the GAG domain) as well as new categories, TR-RT-RH, TR-RH, and TR-INT-RT. Our results demonstrate that 40% of the loci for RTE-related transcripts (nonLTR-RTEs) lack their LTR sequences and resemble conventional eukaryotic genes encoding RTE-related proteins with unknown functions. Phylogenetic analysis showed that three nonLTR-RTEs encode GAG (HadGAG1-3) fused to a host protein. These HadGAG proteins have homologs in other plant species, potentially indicating GAG domestication. Ultimately, we found that the sunflower retrotranscriptome originated from the transcription of active RTEs, non-autonomous RTEs, and gene-like RTE transcripts, including those encoding domesticated proteins.
Affiliation(s)
- Ilya Kirov
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- Pavel Merkulov
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- Ekaterina Polkhovskaya
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Zakhar Konstantinov
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Mikhail Kazancev
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- Ksenia Saenko
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
  - Federal Research Center of Biological Plant Protection, 350039 Krasnodar, Russia
- Alexander Polkhovskiy
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
  - Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
- Maxim Dudnikov
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
  - Moscow Institute of Physics and Technology, 141701 Dolgoprudny, Russia
- Tsovinar Garibyan
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Yakov Demurin
  - Pustovoit All-Russia Research Institute of Oilseed Crops, Filatova St. 17, 350038 Krasnodar, Russia
- Alexander Soloviev
  - All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
6
Papolu PK, Ramakrishnan M, Mullasseri S, Kalendar R, Wei Q, Zou L, Ahmad Z, Vinod KK, Yang P, Zhou M. Retrotransposons: How the continuous evolutionary front shapes plant genomes for response to heat stress. Front Plant Sci 2022; 13:1064847. [PMID: 36570931 PMCID: PMC9780303 DOI: 10.3389/fpls.2022.1064847] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/08/2022] [Accepted: 11/21/2022] [Indexed: 05/28/2023]
Abstract
Long terminal repeat retrotransposons (LTR retrotransposons) are the most abundant group of mobile genetic elements in eukaryotic genomes and are essential in organizing genomic architecture and phenotypic variations. The diverse families of retrotransposons are related to retroviruses. As retrotransposable elements are dispersed and ubiquitous, their "copy-out and paste-in" life cycle of replicative transposition leads to new genome insertions without the excision of the original element. The overall structure of retrotransposons and the domains responsible for the various phases of their replication are highly conserved in all eukaryotes. The two major superfamilies of LTR retrotransposons, Ty1/Copia and Ty3/Gypsy, are distinguished and dispersed across the chromosomes of higher plants. Members of these superfamilies can increase in copy number and are often activated by various biotic and abiotic stresses due to retrotransposition bursts. LTR retrotransposons are important drivers of species diversity and exhibit great variety in structure, size, and mechanisms of transposition, making them important putative actors in genome evolution. Additionally, LTR retrotransposons influence the gene expression patterns of adjacent genes by modulating potential small interfering RNA (siRNA) and RNA-directed DNA methylation (RdDM) pathways. Furthermore, comparative and evolutionary analysis of the most important crop genome sequences and advanced technologies have elucidated the epigenetics and structural and functional modifications driven by LTR retrotransposons during speciation. However, mechanistic insights into LTR retrotransposons remain obscure in plant development due to a lack of advancement in high-throughput technologies. In this review, we focus on the key role of the LTR retrotransposon response in plants during heat stress, the role of centromeric LTR retrotransposons, and the role of LTR retrotransposon markers in genome expression and evolution.
Affiliation(s)
- Pradeep K. Papolu
  - State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Muthusamy Ramakrishnan
  - State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
  - Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Sileesh Mullasseri
  - Department of Zoology, St. Albert’s College (Autonomous), Kochi, Kerala, India
- Ruslan Kalendar
  - Helsinki Institute of Life Science HiLIFE, Biocenter 3, University of Helsinki, Helsinki, Finland
  - National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan
- Qiang Wei
  - Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Long-Hai Zou
  - State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Zishan Ahmad
  - Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Ping Yang
  - State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
  - Zhejiang Provincial Collaborative Innovation Center for Bamboo Resources and High-Efficiency Utilization, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Mingbing Zhou
  - State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
  - Zhejiang Provincial Collaborative Innovation Center for Bamboo Resources and High-Efficiency Utilization, Zhejiang A&F University, Hangzhou, Zhejiang, China
7
Sattler MC, de Oliveira SC, Mendonça MAC, Clarindo WR. Coffea cytogenetics: from the first karyotypes to the meeting with genomics. Planta 2022; 255:112. [PMID: 35501619 DOI: 10.1007/s00425-022-03898-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Accepted: 04/11/2022] [Indexed: 06/14/2023]
Abstract
Coffea karyotype organization and evolution have been uncovered by classical cytogenetics and cytogenomics. We revisit these discoveries and present new karyotype data. Coffea possesses ~124 species, including C. arabica and C. canephora, which are responsible for commercial coffee production. We reviewed Coffea cytogenetics, from the first chromosome counting, encompassing the karyotype characterization, chromosome DNA content, and mapping of chromosome portions and DNA sequences, until the integration with genomics. We also showed new data about the Coffea karyotype. The 2n chromosome number evidenced the diploidy of almost all Coffea, and the C. arabica tetraploidy, as well as the polyploidy of other hybrids. Since then, other genomic similarities and divergences among the Coffea have been shown by karyotype morphology, nuclear and chromosomal C-value, AT and GC rich chromosome portions, and repetitive sequence and gene mapping. These cytogenomic data allowed us to know and understand the phylogenetic relations in Coffea, as well as their ploidy level and genomic origin, highlighting the relatively recent allopolyploidy. In addition to the euploidy, the role of the mobile elements in Coffea diversification is increasingly evident, and the comparative analysis of their structure and distribution on the genome of different species is in the spotlight for future research. An integrative look at all these data is fundamental for a deeper understanding of Coffea karyotype evolution, including the key role of polyploidy in C. arabica origin. The 'Híbrido de Timor', a recent natural allotriploid, is also in the spotlight for its potential as a source of resistance genes and model for plant polyploidy research. Considering this, we also present some unprecedented results about the exciting evolutionary history of these polyploid Coffea.
Affiliation(s)
- Mariana Cansian Sattler
  - Laboratório de Citogenética e Citometria, Departamento de Biologia Geral, Universidade Federal de Viçosa, Viçosa, MG, ZIP 36.570-900, Brazil.
- Stéfanie Cristina de Oliveira
  - Laboratório de Citogenética e Cultura de Tecidos Vegetais, Campus de Alegre, Universidade Federal Do Espírito Santo, Alegre, ES, ZIP 29.500-000, Brazil
- Wellington Ronildo Clarindo
  - Laboratório de Citogenética e Citometria, Departamento de Biologia Geral, Universidade Federal de Viçosa, Viçosa, MG, ZIP 36.570-900, Brazil
8
Cintra LA, Souza TBD, Parteka LM, Barreto LM, Pereira LFP, Gaeta ML, Guyot R, Vanzela ALL. An 82 bp tandem repeat family typical of 3' non-coding end of Gypsy/TAT LTR retrotransposons is conserved in Coffea spp. pericentromeres. Genome 2021; 65:137-151. [PMID: 34727516 DOI: 10.1139/gen-2021-0045] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022]
Abstract
Coffea spp. chromosomes are very small and accumulate a variety of repetitive DNA families around the centromeres. However, the proximal regions of Coffea chromosomes remain poorly understood, especially regarding the nature and organisation of the sequences. Taking advantage of the genome sequences of C. arabica (2n = 44), C. canephora, and C. eugenioides (C. arabica progenitors with 2n = 22) and good coverage genome sequencing of dozens of other wild Coffea spp., repetitive DNA sequences were identified, and the genomes were compared to decipher particularities of pericentromeric structures. The searches revealed a short tandem repeat (82 bp length) typical of Gypsy/TAT LTR retrotransposons, named Coffea_sat11. This repeat organises clusters with fragments of other transposable elements, comprising regions of non-coding RNA production. Cytogenomic analyses showed that Coffea_sat11 extends from the pericentromeres towards the middle of the chromosomal arms. This arrangement was observed in the allotetraploid C. arabica chromosomes, as well as in its progenitors. This study improves our understanding of the role of the Gypsy/TAT LTR retrotransposon lineage in the organisation of Coffea pericentromeres, as well as the conservation of Coffea_sat11 within the genus. The relationships between fragments of other transposable elements and the functional aspects of these sequences on the pericentromere chromatin were also evaluated. Highlights: A scattered short tandem repeat, typical of Gypsy/TAT LTR retrotransposons, associated with several fragments of other transposable elements, accumulates in the pericentromeres of Coffea chromosomes. This arrangement is preserved in all clades of the genus and appears to have a strong regulatory role in the organisation of chromatin around centromeres.
Affiliation(s)
- Leonardo Adabo Cintra
  - Laboratório de Citogenética e Diversidade Vegetal, Departamento de Biologia Geral, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
  - Programa de Pós-graduação em Genética e Biologia Molecular, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
- Thaíssa Boldieri de Souza
  - Laboratório de Citogenética e Diversidade Vegetal, Departamento de Biologia Geral, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
  - Programa de Pós-graduação em Genética e Biologia Molecular, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
- Letícia Maria Parteka
  - Laboratório de Citogenética e Diversidade Vegetal, Departamento de Biologia Geral, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
  - Programa de Pós-graduação em Genética e Biologia Molecular, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
- Lucas Mesquita Barreto
  - Laboratório de Citogenética e Diversidade Vegetal, Departamento de Biologia Geral, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
  - Programa de Pós-graduação em Genética e Biologia Molecular, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
- Marcos Letaif Gaeta
  - Laboratório de Citogenética e Diversidade Vegetal, Departamento de Biologia Geral, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
- Romain Guyot
  - Institut de Recherche pour le Développement, CIRAD, Université Montpellier, 34394, Montpellier, France
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, 170002, Manizales, Caldas, Colombia
- André Luís Laforga Vanzela
  - Laboratório de Citogenética e Diversidade Vegetal, Departamento de Biologia Geral, Centro de Ciências Biológicas, Universidade Estadual de Londrina, Londrina, 86097-570, Paraná, Brazil
9
Orozco-Arias S, Jaimes PA, Candamil MS, Jiménez-Varón CF, Tabares-Soto R, Isaza G, Guyot R. InpactorDB: A Classified Lineage-Level Plant LTR Retrotransposon Reference Library for Free-Alignment Methods Based on Machine Learning. Genes (Basel) 2021; 12:190. [PMID: 33525408 PMCID: PMC7910972 DOI: 10.3390/genes12020190] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/30/2020] [Revised: 01/21/2021] [Accepted: 01/22/2021] [Indexed: 12/04/2022] Open
Abstract
Long terminal repeat (LTR) retrotransposons are mobile elements that constitute the major fraction of most plant genomes. The identification and annotation of these elements via bioinformatics approaches represent a major challenge in the era of massive plant genome sequencing. In addition to their involvement in genome size variation, LTR retrotransposons are also associated with the function and structure of different chromosomal regions and can alter the function of coding regions, among others. Several sequence databases of plant LTR retrotransposons are available for public access, such as PGSB and RepetDB, or restricted access such as Repbase. Although these databases are useful to identify LTR-RTs in new genomes by similarity, the elements of these databases are not fully classified to the lineage (also called family) level. Here, we present InpactorDB, a semi-curated dataset composed of 130,439 elements from 195 plant genomes (belonging to 108 plant species) classified to the lineage level. This dataset has been used to train two deep neural networks (i.e., one fully connected and one convolutional) for the rapid classification of these elements. In lineage-level classification approaches, we obtain up to 98% performance, indicated by the F1-score, precision and recall scores.
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, 170002 Manizales, Colombia
  - Correspondence: (S.O.-A.); (R.G.)
- Paula A. Jaimes
  - Department of Computer Science, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
- Mariana S. Candamil
  - Department of Computer Science, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
- Reinel Tabares-Soto
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
- Gustavo Isaza
  - Department of Systems and Informatics, Universidad de Caldas, 170002 Manizales, Colombia
- Romain Guyot
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, 170002 Manizales, Colombia
  - Institut de Recherche pour le Développement, CIRAD, University of Montpellier, 34394 Montpellier, France
  - Correspondence: (S.O.-A.); (R.G.)
10
|
Valencia JB, Mesa J, León JG, Madriñán S, Cortés AJ. Climate Vulnerability Assessment of the Espeletia Complex on Páramo Sky Islands in the Northern Andes. Front Ecol Evol 2020. [DOI: 10.3389/fevo.2020.565708] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
11
Measuring Performance Metrics of Machine Learning Algorithms for Detecting and Classifying Transposable Elements. Processes (Basel) 2020. [DOI: 10.3390/pr8060638] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/13/2022] Open
Abstract
Because of the promising results obtained by machine learning (ML) approaches in several fields, the use of ML to solve problems in bioinformatics is becoming increasingly common. In genomics, a current issue is the detection and classification of transposable elements (TEs), because of the tedious tasks that bioinformatics methods involve. ML was therefore recently evaluated on TE datasets, demonstrating better results than bioinformatics applications. A crucial step for ML approaches is the selection of metrics that measure the realistic performance of algorithms. Each metric has specific characteristics and measures properties that may be different from the predicted results. Although measures are most commonly compared through empirical analysis, a non-result-based methodology has been proposed, called measure invariance properties. These properties are calculated on the basis of whether a given measure changes its value under certain modifications in the confusion matrix, giving comparative parameters independent of the datasets. Measure invariance properties make metrics more or less informative, particularly on unbalanced, monomodal, or multimodal negative class datasets and for real or simulated datasets. Although several studies have applied ML to detect and classify TEs, no work has evaluated performance metrics in TE tasks. Here, we analyzed 26 different metrics utilized in binary, multiclass, and hierarchical classifications, through bibliographic sources, and their invariance properties. Then, we corroborated our findings utilizing freely available TE datasets and commonly used ML algorithms. Based on our analysis, the most suitable metrics for TE tasks must be stable, even using highly unbalanced datasets, multimodal negative class, and training datasets with errors or outliers. Based on these parameters, we conclude that the F1-score and the area under the precision-recall curve are the most informative metrics since they are calculated based on other metrics, providing insight into the development of an ML application.
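The invariance idea can be made concrete with a small sketch. Assuming a binary confusion matrix given as four counts (the counts below are invented for illustration, and the helper names are hypothetical), the code checks whether a metric changes when only the number of true negatives grows, which mimics a dataset becoming more unbalanced toward the negative class. This covers a single invariance property; it is not the full analysis described in the abstract.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn, tn=0):
    """F1 = 2*TP / (2*TP + FP + FN); tn is accepted but never used."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def is_invariant_to_tn(metric, tp, fp, fn, tn, extra_tn, tol=1e-12):
    """True if adding extra true negatives leaves the metric unchanged."""
    before = metric(tp, fp, fn, tn)
    after = metric(tp, fp, fn, tn + extra_tn)
    return abs(before - after) <= tol


# Invented counts: same positives, then 9900 additional true negatives.
counts = dict(tp=40, fp=10, fn=20, tn=30)

for name, metric in [("accuracy", accuracy), ("F1", f1_score)]:
    small = metric(**counts)
    large = metric(counts["tp"], counts["fp"], counts["fn"], counts["tn"] + 9900)
    invariant = is_invariant_to_tn(metric, extra_tn=9900, **counts)
    print(f"{name}: {small:.3f} -> {large:.3f} (invariant to TN: {invariant})")
```

Because F1 is built only from true positives, false positives and false negatives, inflating the negative class leaves it unchanged, whereas accuracy rises even though the classifier finds no more positives than before.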
12
de Assis R, Baba VY, Cintra LA, Gonçalves LSA, Rodrigues R, Vanzela ALL. Genome relationships and LTR-retrotransposon diversity in three cultivated Capsicum L. (Solanaceae) species. BMC Genomics 2020; 21:237. [PMID: 32183698 PMCID: PMC7076952 DOI: 10.1186/s12864-020-6618-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 12/06/2019] [Accepted: 02/24/2020] [Indexed: 01/08/2023] Open
Abstract
Background: Plant genomes are rich in repetitive sequences, and transposable elements (TEs) are the most accumulated of them. This mobile fraction can be distinguished as Class I (retrotransposons) and Class II (transposons). Retrotransposons that are transposed using an intermediate RNA and that accumulate in a “copy-and-paste” manner were screened in three genomes of peppers (Solanaceae). The present study aimed to understand the genome relationships among Capsicum annuum, C. chinense, and C. baccatum, based on a comparative analysis of the function, diversity and chromosome distribution of TE lineages in the Capsicum karyotypes. Due to the great commercial importance of pepper in natura, as a spice or as an ornamental plant, these genomes have been widely sequenced, and all of the assemblies are available in the SolGenomics group. These sequences were used to compare all repetitive fractions from a cytogenomic point of view. Results: The qualification and quantification of LTR-retrotransposon (LTR-RT) families were contrasted with molecular cytogenetic data, and the results showed a strong genome similarity between C. annuum and C. chinense as compared to C. baccatum. The Gypsy superfamily is more abundant than Copia, especially for Tekay/Del lineage members, including a high representation in C. annuum and C. chinense. On the other hand, C. baccatum accumulates more Athila/Tat sequences. The FISH results showed retrotransposons differentially scattered along chromosomes, except for CRM lineage sequences, which mainly have a proximal accumulation associated with heterochromatin bands. Conclusions: The results confirm a close genomic relationship between C. annuum and C. chinense in comparison to C. baccatum. Centromeric GC-rich bands may be associated with the accumulation regions of CRM elements, whereas terminal and subterminal AT- and GC-rich bands do not correspond to the accumulation of the retrotransposons in the three Capsicum species tested.
Affiliation(s)
- Rafael de Assis
  - Laboratório de Citogenética e Diversidade Vegetal, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
- Viviane Yumi Baba
  - Departamento de Agronomia, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
- Leonardo Adabo Cintra
  - Laboratório de Citogenética e Diversidade Vegetal, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
- Rosana Rodrigues
  - Laboratório de Melhoramento Genético Vegetal, Universidade Estadual do Norte Fluminense Darcy Ribeiro, Campos dos Goytacazes, Rio de Janeiro, 28013-602, Brazil
- André Luís Laforga Vanzela
  - Laboratório de Citogenética e Diversidade Vegetal, Universidade Estadual de Londrina, 86057-970, Londrina, Paraná, Brazil
13
Orozco-Arias S, Isaza G, Guyot R, Tabares-Soto R. A systematic review of the application of machine learning in the detection and classification of transposable elements. PeerJ 2019; 7:e8311. [PMID: 31976169 PMCID: PMC6967008 DOI: 10.7717/peerj.8311] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/05/2019] [Accepted: 11/28/2019] [Indexed: 12/16/2022] Open
Abstract
Background: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. Methodology: We followed the Systematic Literature Review (SLR) process, applying the six stages of its review protocol, and added a preliminary stage that aims to detect the need for a review. Search equations were then formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. Results: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet few algorithms and architectures in the literature focus specifically on TEs, even though TEs represent the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. The SLR showed that the use of ML for TE analyses (detection and classification) is an open problem, and that this new field of research is growing in interest.
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
- Gustavo Isaza
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
- Romain Guyot
  - Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Reinel Tabares-Soto
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
14
Van-Lume B, Mata-Sucre Y, Báez M, Ribeiro T, Huettel B, Gagnon E, Leitch IJ, Pedrosa-Harand A, Lewis GP, Souza G. Evolutionary convergence or homology? Comparative cytogenomics of Caesalpinia group species (Leguminosae) reveals diversification in the pericentromeric heterochromatic composition. Planta 2019; 250:2173-2186. [PMID: 31696317 DOI: 10.1007/s00425-019-03287-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/09/2019] [Accepted: 09/25/2019] [Indexed: 05/02/2023]
Abstract
We demonstrated by cytogenomic analysis that the proximal heterochromatin of the Northeast Brazilian species of the Caesalpinia Group is enriched with phylogenetically conserved Ty3/Gypsy-Tekay RT, but diverges in the presence of Ty3/Gypsy-Athila RT and satDNA. The Caesalpinia Group includes 225 species and 27 monophyletic genera, of which four occur in Northeastern Brazil: Erythrostemon (1 sp.), Cenostigma (7 spp.), Libidibia (1 sp.), and Paubrasilia (1 sp.). The last three genera are placed in different clades in the Caesalpinia Group phylogeny, and yet they are characterized by having a numerically stable karyotype 2n = 24 (16M + 8A) and GC-rich heterochromatic bands (chromomycin A3 positive/CMA+ bands) in the proximal chromosome regions. To characterize the composition of their heterochromatin and test for the homology of these chromosomal regions, genomic DNA was extracted from Cenostigma microphyllum, Libidibia ferrea, and Paubrasilia echinata, and sequenced at low coverage using the Illumina platform. The genomic repetitive fractions were characterized using a Galaxy/RepeatExplorer-Elixir platform. The most abundant elements of each genome were chromosomally located by fluorescent in situ hybridization (FISH) and compared to the CMA+ heterochromatin distribution. The repetitive fractions of the genomes of C. microphyllum, L. ferrea, and P. echinata were estimated to be 41.70%, 38.44%, and 72.51%, respectively. Ty3/Gypsy retrotransposons (RT), specifically the Tekay lineage, were the most abundant repeats in each of the three genomes. FISH mapping revealed species-specific patterns for the Tekay elements in the proximal regions of the chromosomes, co-localized with CMA+ bands. Other species-specific patterns were observed, e.g., for the Ty3/Gypsy RT Athila elements, which were found in all the proximal heterochromatin of L. ferrea or restricted to the acrocentric chromosomes of C. microphyllum. This Athila labeling co-localized with satellite DNAs (satDNAs). Although the Caesalpinia Group diverged around 55 Mya, our results suggest an ancestral colonization of Tekay RT in the proximal heterochromatin. Thus, the present-day composition of the pericentromeric heterochromatin in these Northeast Brazilian species is a combination of the maintenance of an ancestral Tekay distribution with a species-specific accumulation of other repeats.
Affiliation(s)
- Brena Van-Lume
  - Laboratory of Plant Cytogenetics and Evolution, Department of Botany, Federal University of Pernambuco, Rua Nelson Chaves S/N, Cidade Universitária, Recife, PE, 50670-420, Brazil
- Yennifer Mata-Sucre
  - Laboratory of Plant Cytogenetics and Evolution, Department of Botany, Federal University of Pernambuco, Rua Nelson Chaves S/N, Cidade Universitária, Recife, PE, 50670-420, Brazil
- Mariana Báez
  - Laboratory of Plant Cytogenetics and Evolution, Department of Botany, Federal University of Pernambuco, Rua Nelson Chaves S/N, Cidade Universitária, Recife, PE, 50670-420, Brazil
- Tiago Ribeiro
  - Laboratory of Plant Cytogenetics and Evolution, Department of Botany, Federal University of Pernambuco, Rua Nelson Chaves S/N, Cidade Universitária, Recife, PE, 50670-420, Brazil
  - Department of Botany and Ecology, Institute of Biosciences, Federal University of Mato Grosso, Av. Fernando Correa da Costa, 2.367, Boa Esperança, Cuiabá, MT, 78060-900, Brazil
- Edeline Gagnon
  - Royal Botanic Garden Edinburgh, 20A Inverleith Row, Edinburgh, EH3 5NZ, UK
- Ilia J Leitch
  - Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AB, UK
- Andrea Pedrosa-Harand
  - Laboratory of Plant Cytogenetics and Evolution, Department of Botany, Federal University of Pernambuco, Rua Nelson Chaves S/N, Cidade Universitária, Recife, PE, 50670-420, Brazil
- Gwilym P Lewis
  - Department of Comparative Plant and Fungal Biology, Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AB, UK
- Gustavo Souza
  - Laboratory of Plant Cytogenetics and Evolution, Department of Botany, Federal University of Pernambuco, Rua Nelson Chaves S/N, Cidade Universitária, Recife, PE, 50670-420, Brazil
15
Orozco-Arias S, Isaza G, Guyot R. Retrotransposons in Plant Genomes: Structure, Identification, and Classification through Bioinformatics and Machine Learning. Int J Mol Sci 2019; 20:E3837. [PMID: 31390781 PMCID: PMC6696364 DOI: 10.3390/ijms20153837] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 06/21/2019] [Revised: 07/31/2019] [Accepted: 08/02/2019] [Indexed: 01/26/2023] Open
Abstract
Transposable elements (TEs) are genomic units able to move within the genome of virtually all organisms. Because of their high copy numbers and their high structural diversity, the identification and classification of TEs remain a challenge in sequenced genomes. Although TEs were initially regarded as "junk DNA", it has been demonstrated that they play key roles in chromosome structures, gene expression, and regulation, as well as adaptation and evolution. A highly reliable annotation of these elements is, therefore, crucial to better understand genome functions and their evolution. To date, much bioinformatics software has been developed to address TE detection and classification processes, but many problematic aspects remain, such as the reliability, precision, and speed of the analyses. Machine learning and deep learning algorithms can make automatic predictions and decisions in a wide variety of scientific applications. They have been tested in bioinformatics and, more specifically, for TE classification, with encouraging results. In this review, we will discuss important aspects of TEs, such as their structure, importance in the evolution and architecture of the host, and their current classifications and nomenclatures. We will also address current methods and their limitations in identifying and classifying TEs.
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170001, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales 170001, Colombia
- Gustavo Isaza
  - Department of Systems and Informatics, Universidad de Caldas, Manizales 170001, Colombia
- Romain Guyot
  - Department of Electronics and Automatization, Universidad Autónoma de Manizales, Manizales 170001, Colombia
  - Institut de Recherche pour le Développement, CIRAD, University Montpellier, 34000 Montpellier, France
16
Suguiyama VF, Vasconcelos LAB, Rossi MM, Biondo C, de Setta N. The population genetic structure approach adds new insights into the evolution of plant LTR retrotransposon lineages. PLoS One 2019; 14:e0214542. [PMID: 31107873 PMCID: PMC6527191 DOI: 10.1371/journal.pone.0214542] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/06/2018] [Accepted: 03/14/2019] [Indexed: 12/30/2022] Open
Abstract
Long terminal repeat retrotransposons (LTR-RTs) in plant genomes differ in abundance, structure and genomic distribution, reflecting the large number of evolutionary lineages. Elements within lineages can be considered populations, in which each element is an individual in its genomic environment. It is therefore reasonable to apply microevolutionary analyses, such as those used to study the genetic structure of natural populations, to understand transposable element (TE) evolution. Here, we applied a Bayesian method to infer genetic structure of populations together with classical phylogenetic and dating tools to analyze LTR-RT evolution using the monocot Setaria italica as a model species. In contrast to a phylogeny, the Bayesian clustering method identifies populations by assigning individuals to one or more clusters according to the most probable scenario of admixture, based on genetic diversity patterns. In this work, each LTR-RT insertion was considered to be one individual and each LTR-RT lineage was considered to be a single species. Nine evolutionary lineages of LTR-RTs were identified in the S. italica genome that had different genetic structures with variable numbers of clusters and levels of admixture. Comprehensive analysis of the phylogenetic, clustering and time of insertion data allowed us to hypothesize that admixed elements represent sequences that harbor ancestral polymorphic sequence signatures. In conclusion, application of microevolutionary concepts in genome evolution studies is suitable as a complementary approach to phylogenetic analyses to address the evolutionary history and functional features of TEs.
Affiliation(s)
- Vanessa Fuentes Suguiyama
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, São Bernardo do Campo, SP, Brazil
- Maria Magdalena Rossi
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo, São Paulo, SP, Brazil
- Cibele Biondo
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, São Bernardo do Campo, SP, Brazil
- Nathalia de Setta
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, São Bernardo do Campo, SP, Brazil
17
Pamponét VCC, Souza MM, Silva GS, Micheli F, de Melo CAF, de Oliveira SG, Costa EA, Corrêa RX. Low coverage sequencing for repetitive DNA analysis in Passiflora edulis Sims: citogenomic characterization of transposable elements and satellite DNA. BMC Genomics 2019; 20:262. [PMID: 30940088 PMCID: PMC6444444 DOI: 10.1186/s12864-019-5576-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/13/2018] [Accepted: 02/28/2019] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND: The cytogenomic study of repetitive regions is fundamental for the understanding of morphofunctional mechanisms and genome evolution. Passiflora edulis is a species of relevant agronomic value. In this work, its genome was sequenced by next-generation sequencing and analyzed with the RepeatExplorer pipeline. The resulting clusters allowed the identification and characterization of repetitive elements (predominant contributors to most plant genomes). The aim of this study was to identify, characterize and map the repetitive DNA of P. edulis, providing important cytogenomic markers, especially sequences associated with the centromere. RESULTS: Three clusters of satellite DNAs (69, 118 and 207) and seven clusters of Long Terminal Repeat (LTR) retrotransposons of the superfamilies Ty1/Copia and Ty3/Gypsy and families Angela, Athila, Chromovirus and Maximus-Sire (6, 11, 36, 43, 86, 94 and 135) were characterized and analyzed. The chromosome mapping of satellite DNAs showed two hybridization sites co-located in the 5S rDNA region (PeSat_1), subterminal hybridizations (PeSat_3) and hybridization in four sites, co-located in the 45S rDNA region (PeSat_2). Most of the retroelement hybridizations showed signals scattered along the chromosomes, diverging in abundance, and only cluster 6 marked pericentromeric regions. No satellite DNA or retroelement associated with the centromere was observed. CONCLUSION: P. edulis has a highly repetitive genome, with a predominance of Ty3/Gypsy LTR retrotransposons. The satellite DNAs and LTR retrotransposons characterized are promising markers for investigating the evolutionary patterns and genetic distinction of species and hybrids of Passiflora.
MESH Headings
- Chromosome Mapping
- Chromosomes, Plant
- DNA, Plant/genetics
- DNA, Plant/metabolism
- DNA, Satellite/classification
- DNA, Satellite/genetics
- High-Throughput Nucleotide Sequencing
- In Situ Hybridization, Fluorescence
- Passiflora/genetics
- Phylogeny
- RNA, Ribosomal/genetics
- RNA, Ribosomal, 5S/genetics
- Retroelements/genetics
- Sequence Analysis, DNA
Affiliation(s)
- Vanessa Carvalho Cayres Pamponét
  - Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
- Margarete Magalhães Souza
  - Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
- Gonçalo Santos Silva
  - Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
- Fabienne Micheli
  - Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
  - CIRAD, UMR AGAP, F-34398 Montpellier, France
- Cláusio Antônio Ferreira de Melo
  - Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
- Sarah Gomes de Oliveira
  - Departamento de Botânica, Instituto de Biociências, Universidade de São Paulo (USP), Rua do Matão, 14 – Butantã, São Paulo, SP CEP 05508-090 Brazil
- Eduardo Almeida Costa
  - Núcleo de Biologia Computacional e Gestão de Informações Biotecnológicas (NBCGIB), Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
- Ronan Xavier Corrêa
  - Departamento de Ciências Biológicas, Universidade Estadual de Santa Cruz (UESC), km 16, Salobrinho, Ilhéus, Bahia CEP 45662-900 Brazil
18
de Souza TB, Chaluvadi SR, Johnen L, Marques A, González-Elizondo MS, Bennetzen JL, Vanzela ALL. Analysis of retrotransposon abundance, diversity and distribution in holocentric Eleocharis (Cyperaceae) genomes. Ann Bot 2018; 122:279-290. [PMID: 30084890 PMCID: PMC6070107 DOI: 10.1093/aob/mcy066] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 02/09/2018] [Accepted: 04/18/2018] [Indexed: 05/23/2023]
Abstract
BACKGROUND AND AIMS: Long terminal repeat-retrotransposons (LTR-RTs) comprise a large portion of plant genomes, with massive repeat blocks distributed across the chromosomes. Eleocharis species have holocentric chromosomes, and show a positive correlation between chromosome numbers and the amount of nuclear DNA. To evaluate the role of LTR-RTs in karyotype diversity in members of Eleocharis (subgenus Eleocharis), the occurrence and location of different members of the Copia and Gypsy superfamilies were compared, covering interspecific variations in ploidy levels (considering chromosome numbers), DNA C-values and chromosomal arrangements. METHODS: The DNA C-value was estimated by flow cytometry. Genomes of Eleocharis elegans and E. geniculata were partially sequenced using Illumina MiSeq assemblies, which were a source for searching for conserved proteins of LTR-RTs. POL domains were used for recognition, comparing families and for probe production, considering different families of Copia and Gypsy superfamilies. Probes were obtained by PCR and used in fluorescence in situ hybridization (FISH) against chromosomes of seven Eleocharis species. KEY RESULTS: A positive correlation between ploidy levels and the amount of nuclear DNA was observed, but with significant variations between samples with the same ploidy levels, associated with repetitive DNA fractions. LTR-RTs were abundant in E. elegans and E. geniculata genomes, with a predominance of Copia Sirevirus and Gypsy Athila/Tat clades. FISH using LTR-RT probes exhibited scattered and clustered signals, but with differences in the chromosomal locations of Copia and Gypsy. The diversity in LTR-RT locations suggests that there is no typical chromosomal distribution pattern for retrotransposons in holocentric chromosomes, except the CRM family with signals distributed along chromatids. CONCLUSIONS: These data indicate independent fates for each LTR-RT family, including accumulation between and within chromosomes and genomes. Differential activity and small changes in LTR-RTs suggest a secondary role in nuclear DNA variation, when compared with ploidy changes.
Affiliation(s)
- Thaíssa B de Souza
  - Laboratory of Cytogenetics and Plant Diversity, Department of General Biology, Center for Biological Sciences, State University of Londrina, Londrina, Paraná, Brazil
- Lucas Johnen
  - Laboratory of Cytogenetics and Plant Diversity, Department of General Biology, Center for Biological Sciences, State University of Londrina, Londrina, Paraná, Brazil
- André Marques
  - Laboratory of Genetic Resources, Campus Arapiraca, Federal University of Alagoas, Arapiraca, Brazil
- André L L Vanzela
  - Laboratory of Cytogenetics and Plant Diversity, Department of General Biology, Center for Biological Sciences, State University of Londrina, Londrina, Paraná, Brazil