Abstract
Recent advances in genomic sequencing have opened vast opportunities for biological exploration, helping to unravel evolution and improve our understanding of Earth's biodiversity. Plant species differ markedly in genome size, ploidy and heterozygosity, and transposable elements (TEs) are a common feature of many of these genomes. TEs are ubiquitous, dispersed repetitive DNA sequences that frequently affect the evolution and composition of the genome, mainly through their redundancy and rearrangements. In this study, we provide an atlas of TE data through an easy-to-use portal (APTE website). To our knowledge, this is the most extensive and standardized analysis of TEs in plant genomes. We evaluated 67 plant genomes assembled at chromosome scale and recovered 49,802,023 TE records, which together span 47,992,091,043 base pairs (bp), or approximately 47.62% of the total genomic space. We identified and annotated new types of TEs relative to other data repositories. This standardized catalog of TE annotations across 67 genomes supports new hypotheses and further exploration of TE data and of how TEs influence genomes, which may lead to a better understanding of their function and processes. All original code and an example of how we developed the TE annotation strategy are available on GitHub (Extended data).
Affiliation(s)
- Daniel Longhi Fernandes Pedro
  - Department of Computer Science; Bioinformatics and Pattern Recognition Group, Graduation Program in Bioinformatics, Federal University of Technology - Paraná (UTFPR), Cornélio Procópio, Paraná, 86300000, Brazil
- Tharcisio Soares Amorim
  - Department of Computer Science; Bioinformatics and Pattern Recognition Group, Graduation Program in Bioinformatics, Federal University of Technology - Paraná (UTFPR), Cornélio Procópio, Paraná, 86300000, Brazil
- Alessandro Varani
  - Department of Agricultural and Environmental Biotechnology, School of Agricultural and Veterinary Sciences, São Paulo State University (UNESP), Jaboticabal, São Paulo, 14884-900, Brazil
- Romain Guyot
  - Institut de Recherche pour le Développement, IRD, University of Montpellier, Montpellier, France
  - Department of Electronics and Automatization, Universidad Autónoma de Manizales, Manizales, Colombia
- Douglas Silva Domingues
  - Department of Computer Science; Bioinformatics and Pattern Recognition Group, Graduation Program in Bioinformatics, Federal University of Technology - Paraná (UTFPR), Cornélio Procópio, Paraná, 86300000, Brazil
  - Group of Genomics and Transcriptomes in Plants, Institute of Biosciences, São Paulo State University (UNESP), Rio Claro, São Paulo, 13506-900, Brazil
- Alexandre Rossi Paschoal
  - Department of Computer Science; Bioinformatics and Pattern Recognition Group, Graduation Program in Bioinformatics, Federal University of Technology - Paraná (UTFPR), Cornélio Procópio, Paraná, 86300000, Brazil