1
Kosushkin S, Korchagin V, Vergun A, Ryskov A. Interspecific Comparison of Orthologous Short Interspersed Elements Loci Using Whole-Genome Data. Genes (Basel) 2023; 14:2089. [PMID: 38003031 PMCID: PMC10670947 DOI: 10.3390/genes14112089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 11/08/2023] [Accepted: 11/15/2023] [Indexed: 11/26/2023] Open
Abstract
The polymorphism of SINE-containing loci reflects the evolutionary processes that occurred both during the period before the divergence of the taxa and after it. Orthologous loci containing SINE in two or more genomes indicate the relatedness of the taxa, while different copies may have a specific set of mutations and degree of difference. Polymorphic insertion can be interpreted with a high degree of confidence as a shared derived character in the phylogenetic reconstruction of the history of the taxon. The computational comparison of the entire set of SINE-containing loci between genomes is a challenging task, and we propose to consider it in detail using the genomes of representatives of squamate reptiles (lizards) as an example. Our approach allows us to extract copies of SINE from the genomes, find pairwise orthologous loci by using flanking genomic sequences, and analyze the resulting sets of loci for the presence or absence of SINE, the degree of similarity of the flanks, and the similarity of the SINE themselves. The workflow we propose allows us to efficiently extract and analyze orthologous SINE loci for the downstream analysis, as shown in our comparison of species- and genus-level taxa in lacertid lizards.
Affiliation(s)
- Sergei Kosushkin
- Laboratory of Genome Organization, Institute of Gene Biology of the Russian Academy of Sciences, Vavilova Str., 34/5, Moscow 119334, Russia
- Vitaly Korchagin
- Laboratory of Genome Organization, Institute of Gene Biology of the Russian Academy of Sciences, Vavilova Str., 34/5, Moscow 119334, Russia
- Andrey Vergun
- Laboratory of Genome Organization, Institute of Gene Biology of the Russian Academy of Sciences, Vavilova Str., 34/5, Moscow 119334, Russia
- Department of Biochemistry, Molecular Biology and Genetics, Moscow Pedagogical State University, 1/1 M. Pirogovskaya Str., Moscow 119991, Russia
- Alexey Ryskov
- Laboratory of Genome Organization, Institute of Gene Biology of the Russian Academy of Sciences, Vavilova Str., 34/5, Moscow 119334, Russia
2
Doronina L, Feigin CY, Schmitz J. Reunion of Australasian Possums by Shared SINE Insertions. Syst Biol 2022; 71:1045-1053. [PMID: 35289914 PMCID: PMC9366447 DOI: 10.1093/sysbio/syac025] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/22/2021] [Revised: 03/09/2022] [Accepted: 03/11/2022] [Indexed: 11/29/2022] Open
Abstract
Although first posited to be of a single origin, the two superfamilies of phalangeriform marsupial possums (Phalangeroidea: brushtail possums and cuscuses and Petauroidea: possums and gliders) have long been considered, based on multiple sequencing studies, to have evolved from two separate origins. However, previous data from these sequence analyses suggested a variety of conflicting trees. Therefore, we reinvestigated these relationships by screening ∼200,000 orthologous short interspersed element (SINE) loci across the newly available whole-genome sequences of phalangeriform species and their relatives. Compared to sequence data, SINE presence/absence patterns are evolutionarily almost neutral molecular markers of the phylogenetic history of species. Their random and highly complex genomic insertion ensures their virtually homoplasy-free nature and enables one to compare hundreds of shared unique orthologous events to determine the true species tree. Here, we identify 106 highly reliable phylogenetic SINE markers whose presence/absence patterns within multiple Australasian possum genomes unexpectedly provide the first significant evidence for the reunification of Australasian possums into one monophyletic group. Together, our findings indicate that nucleotide homoplasy and ancestral incomplete lineage sorting have most likely driven the conflicting signal distributions seen in previous sequence-based studies. [Ancestral incomplete lineage sorting; possum genomes; possum monophyly; retrophylogenomics; SINE presence/absence.]
Affiliation(s)
- Liliya Doronina
- Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
- Charles Y Feigin
- Department of Molecular Biology, Princeton University, 119 Lewis Thomas Laboratory, Washington Road, Princeton, NJ 08544-1014, USA
- School of BioSciences, The University of Melbourne, BioSciences 4, Royal Pde, Parkville, VIC 3010, Australia
- Jürgen Schmitz
- Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
3
Doronina L, Reising O, Clawson H, Churakov G, Schmitz J. Euarchontoglires Challenged by Incomplete Lineage Sorting. Genes (Basel) 2022; 13:774. [PMID: 35627160 PMCID: PMC9141288 DOI: 10.3390/genes13050774] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/16/2022] [Revised: 04/08/2022] [Accepted: 04/20/2022] [Indexed: 11/17/2022] Open
Abstract
Euarchontoglires, once described as Supraprimates, comprise primates, colugos, tree shrews, rodents, and lagomorphs in a clade that evolved about 90 million years ago (mya) from a shared ancestor with Laurasiatheria. The rapid speciation of groups within Euarchontoglires, and the subsequent inherent incomplete marker fixation in ancestral lineages, led to challenged attempts at phylogenetic reconstructions, particularly for the phylogenetic position of tree shrews. To resolve this conundrum, we sampled genome-wide presence/absence patterns of transposed elements (TEs) from all representatives of Euarchontoglires. This specific marker system has the advantage that phylogenetic diagnostic characters can be extracted in a nearly unbiased fashion genome-wide from reference genomes. Their insertions are virtually free of homoplasy. We simultaneously employed two computational tools, the genome presence/absence compiler (GPAC) and 2-n-way, to find a maximum of diagnostic insertions from more than 3 million TE positions. From 361 extracted diagnostic TEs, 132 provide significant support for the current resolution of Primatomorpha (Primates plus Dermoptera), 94 support the union of Euarchonta (Primates, Dermoptera, plus Scandentia), and 135 marker insertion patterns support a variety of alternative phylogenetic scenarios. Thus, whole genome-level analysis and a virtually homoplasy-free marker system offer an opportunity to finally resolve the notorious phylogenetic challenges that nature produces in rapidly diversifying groups.
Affiliation(s)
- Liliya Doronina
- Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
- Olga Reising
- Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
- Hiram Clawson
- Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA
- Gennady Churakov
- Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
- Jürgen Schmitz
- Institute of Experimental Pathology, ZMBE, University of Münster, 48149 Münster, Germany
- EvoPAD-RTG, University of Münster, 48149 Münster, Germany
4
Abstract
To effectively analyze the increasing amounts of available genomic data, improved comparative analytical tools that are accessible to and applicable by a broad scientific community are essential. We built the “2-n-way” software suite to provide a fundamental and innovative processing framework for revealing and comparing inserted elements among various genomes. The suite comprises two user-friendly web-based modules. The 2-way module generates pairwise whole-genome alignments of target and query species. The resulting genome coordinates of blocks (matching sequences) and gaps (missing sequences) from multiple 2-ways are then transferred to the n-way module and sorted into projects, in which user-defined coordinates from reference species are projected to the block/gap coordinates of orthologous loci in query species to provide comparative information about presence (blocks) or absence (gaps) patterns of targeted elements over many entire genomes and phylogroups. Thus, the 2-n-way software suite is ideal for performing multidirectional, non-ascertainment-biased screenings to extract all possible presence/absence data of user-relevant elements in orthologous sequences. To highlight its applicability and versatility, we used 2-n-way to expose approximately 100 lost introns in vertebrates, analyzed thousands of potential phylogenetically informative bat and whale retrotransposons, and novel human exons as well as thousands of human polymorphic retrotransposons.
5
Doronina L, Reising O, Clawson H, Ray DA, Schmitz J. True Homoplasy of Retrotransposon Insertions in Primates. Syst Biol 2019; 68:482-493. [PMID: 30445649 DOI: 10.1093/sysbio/syy076] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 08/17/2018] [Revised: 11/05/2018] [Accepted: 11/13/2018] [Indexed: 01/24/2023] Open
Abstract
How reliable are the presence/absence insertion patterns of the supposedly homoplasy-free retrotransposons, which were randomly inserted in the quasi infinite genomic space? To systematically examine this question in an up-to-date, multigenome comparison, we screened millions of primate transposed Alu SINE elements for incidences of homoplasious precise insertions and deletions. In genome-wide analyses, we identified and manually verified nine cases of precise parallel Alu insertions of apparently identical elements at orthologous positions in two ape lineages and twelve incidences of precise deletions of previously established SINEs. Correspondingly, eight precise parallel insertions and no exact deletions were detected in a comparison of lemuriform primate and human insertions spanning the range of primate diversity. With an overall frequency of homoplasious Alu insertions of only 0.01% (for human-chimpanzee-rhesus macaque) and 0.02-0.04% (for human-bushbaby-lemurs) and precise Alu deletions of 0.001-0.002% (for human-chimpanzee-rhesus macaque), real homoplasy is not considered to be a quantitatively relevant source of evolutionary noise. Thus, presence/absence patterns of Alu retrotransposons and, presumably, all LINE1-mobilized elements represent indeed the virtually homoplasy-free markers they are considered to be. Therefore, ancestral incomplete lineage sorting and hybridization remain the only serious sources of conflicting presence/absence patterns of retrotransposon insertions, and as such are detectable and quantifiable. [Homoplasy; precise deletions; precise parallel insertions; primates; retrotransposons.].
Affiliation(s)
- Liliya Doronina
- Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
- Olga Reising
- Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
- Hiram Clawson
- Department of Biomolecular Engineering, University of California, 1156 High Street, Santa Cruz, CA, USA
- David A Ray
- Department of Biological Sciences, Texas Tech University, 2901 Main Street, Lubbock, TX, USA
- Jürgen Schmitz
- Institute of Experimental Pathology (ZMBE), University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany
6
Transposable Elements: Classification, Identification, and Their Use As a Tool For Comparative Genomics. Methods Mol Biol 2019; 1910:177-207. [PMID: 31278665 DOI: 10.1007/978-1-4939-9074-0_6] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Indexed: 12/16/2022]
Abstract
Most genomes are populated by hundreds of thousands of sequences originated from mobile elements. On the one hand, these sequences present a real challenge in the process of genome analysis and annotation. On the other hand, they are very interesting biological subjects involved in many cellular processes. Here we present an overview of transposable elements biodiversity, and we discuss different approaches to transposable elements detection and analyses.
7
Sun C, Hu Z, Zheng T, Lu K, Zhao Y, Wang W, Shi J, Wang C, Lu J, Zhang D, Li Z, Wei C. RPAN: rice pan-genome browser for ∼3000 rice genomes. Nucleic Acids Res 2016; 45:597-605. [PMID: 27940610 PMCID: PMC5314802 DOI: 10.1093/nar/gkw958] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Received: 05/05/2016] [Revised: 10/02/2016] [Accepted: 10/24/2016] [Indexed: 11/14/2022] Open
Abstract
A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k.
Affiliation(s)
- Chen Sun
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Center for Bioinformation Technology, Shanghai 201203, China
- Zhiqiang Hu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Center for Bioinformation Technology, Shanghai 201203, China
- Tianqing Zheng
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Kuangchen Lu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Yue Zhao
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Wensheng Wang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Jianxin Shi
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Chunchao Wang
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Jinyuan Lu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Dabing Zhang
- Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China; School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA 5064, Australia
- Zhikang Li
- Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081, China
- Chaochun Wei
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Center for Bioinformation Technology, Shanghai 201203, China
8
Genome sequence of the basal haplorrhine primate Tarsius syrichta reveals unusual insertions. Nat Commun 2016; 7:12997. [PMID: 27708261 PMCID: PMC5059674 DOI: 10.1038/ncomms12997] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 10/29/2015] [Accepted: 08/17/2016] [Indexed: 12/28/2022] Open
Abstract
Tarsiers are phylogenetically located between the most basal strepsirrhines and the most derived anthropoid primates. While they share morphological features with both groups, they also possess uncommon primate characteristics, rendering their evolutionary history somewhat obscure. To investigate the molecular basis of such attributes, we present here a new genome assembly of the Philippine tarsier (Tarsius syrichta), and provide extended analyses of the genome and detailed history of transposable element insertion events. We describe the silencing of Alu monomers on the lineage leading to anthropoids, and recognize an unexpected abundance of long terminal repeat-derived and LINE1-mobilized transposed elements (Tarsius interspersed elements; TINEs). For the first time in mammals, we identify a complete mitochondrial genome insertion within the nuclear genome, then reveal tarsier-specific, positive gene selection and posit population size changes over time. The genomic resources and analyses presented here will aid efforts to more fully understand the ancient characteristics of primate genomes. Tarsiers occupy a key node between strepsirrhines and anthropoids in the primate phylogeny. Here, Warren and colleagues present the genome of Tarsius syrichta, including a survey of transposable elements, an unusual mitochondrial insertion, and evidence for positive gene selection.
9
Noll A, Raabe CA, Churakov G, Brosius J, Schmitz J. Ancient traces of tailless retropseudogenes in therian genomes. Genome Biol Evol 2015; 7:889-900. [PMID: 25724209 PMCID: PMC5322556 DOI: 10.1093/gbe/evv040] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/17/2022] Open
Abstract
Transposable elements, once described by Barbara McClintock as controlling genetic units, not only occupy the largest part of our genome but are also a prominent moving force of genomic plasticity and innovation. They usually replicate and reintegrate into genomes silently, sometimes causing malfunctions or misregulations, but occasionally millions of years later, a few may evolve into new functional units. Retrotransposons make their way into the genome following reverse transcription of RNA molecules and chromosomal insertion. In therian mammals, long interspersed elements 1 (LINE1s) self-propagate but also coretropose many RNAs, including mRNAs and small RNAs that usually exhibit an oligo(A) tail. The revitalization of specific LINE1 elements in the mammalian lineage about 150 Ma parallels the rise of many other nonautonomous mobilized genomic elements. We previously identified and described hundreds of tRNA-derived retropseudogenes missing characteristic oligo(A) tails consequently termed tailless retropseudogenes. Additional analyses now revealed hundreds of thousands of tailless retropseudogenes derived from nearly all types of RNAs. We extracted 2,402 perfect tailless sequences (with discernible flanking target site duplications) originating from tRNAs, spliceosomal RNAs, 5S rRNAs, 7SK RNAs, mRNAs, and others. Interestingly, all are truncated at one or more defined positions that coincide with internal single-stranded regions. 5S ribosomal and U2 spliceosomal RNAs were analyzed in the context of mammalian phylogeny to discern the origin of the therian LINE1 retropositional system that evolved in our 150-Myr-old ancestor.
Affiliation(s)
- Angela Noll
- Institute of Experimental Pathology, ZMBE, University of Münster, Germany
- Carsten A Raabe
- Institute of Experimental Pathology, ZMBE, University of Münster, Germany
- Gennady Churakov
- Institute of Experimental Pathology, ZMBE, University of Münster, Germany; Institute of Evolution and Biodiversity, University of Münster, Germany
- Jürgen Brosius
- Institute of Experimental Pathology, ZMBE, University of Münster, Germany; Institute of Evolutionary and Medical Genomics, Brandenburg Medical School, Neuruppin, Germany
- Jürgen Schmitz
- Institute of Experimental Pathology, ZMBE, University of Münster, Germany