1
Mikina W, Hałakuc P, Milanowski R. Transposon-derived introns as an element shaping the structure of eukaryotic genomes. Mob DNA 2024; 15:15. [PMID: 39068498 PMCID: PMC11282704 DOI: 10.1186/s13100-024-00325-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 07/23/2024] [Indexed: 07/30/2024] Open
Abstract
The widely accepted hypothesis postulates that the first spliceosomal introns originated from group II self-splicing introns. However, it is evident that not all spliceosomal introns in the nuclear genes of modern eukaryotes are inherited through vertical transfer of intronic sequences. Several phenomena contribute to the formation of new introns but their most common origin seems to be the insertion of transposable elements. Recent analyses have highlighted instances of mass gains of new introns from transposable elements. These events often coincide with an increase or change in the spliceosome's tolerance to splicing signals, including the acceptance of noncanonical borders. Widespread acquisitions of transposon-derived introns occur across diverse evolutionary lineages, indicating convergent processes. These events, though independent, likely require a similar set of conditions. These conditions include the presence of transposon elements with features enabling their removal at the RNA level as introns and/or the existence of a splicing mechanism capable of excising unusual sequences that would otherwise not be recognized as introns by standard splicing machinery. Herein we summarize those mechanisms across different eukaryotic lineages.
Affiliation(s)
- Weronika Mikina
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, Warsaw, 02‑089, Poland
- Paweł Hałakuc
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, Warsaw, 02‑089, Poland
- Rafał Milanowski
  - Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Żwirki i Wigury 101, Warsaw, 02‑089, Poland
2
Panaro MA, Calvello R, Miniero DV, Mitolo V, Cianciulli A. Imaging Intron Evolution. Methods Protoc 2022; 5:mps5040053. [PMID: 35893579 PMCID: PMC9326662 DOI: 10.3390/mps5040053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 06/13/2022] [Accepted: 06/21/2022] [Indexed: 11/16/2022] Open
Abstract
Intron evolution may be readily imaged through the combined use of the “dot plot” function of the NCBI BLAST, aligning two sequences at a time, and the Vertebrate “Multiz” alignment and conservation tool of the UCSC Genome Browser. With the NCBI BLAST, an ideal alignment of two highly conserved sequences generates a diagonal straight line in the plot from the lower left corner to the upper right corner. Gaps in this line correspond to non-conserved sections. In addition, the dot plot of the alignment of a sequence with the same sequence after the removal of the Transposable Elements (TEs) can be observed along the diagonal gaps that correspond to the sites of TE insertion. The UCSC Genome Browser can graph, along the entire sequence of a single gene, the level of overall conservation in vertebrates. This level can be compared with the conservation level of the gene in one or more selected vertebrate species. As an example, we show the graphic analysis of the intron conservation in two genes: the mitochondrial solute carrier 21 (SLC25A21) and the growth hormone receptor (GHR), whose coding sequences are conserved through vertebrates, while their introns show dramatic changes in nucleotide composition and even length. In the SLC25A21, a few short but significant nucleotide sequences are conserved in zebrafish, Xenopus and humans, and the rate of conservation steadily increases from chicken/human to mouse/human alignments. In the GHR, a less conserved gene, the earlier indication of intron conservation is a small signal in chicken/human alignment. The UCSC tool may simultaneously display the conservation level of a gene in different vertebrates, with reference to the level of overall conservation in Vertebrates. It is shown that, at least in SLC25A21, the sites of higher conservation are not always coincident in chicken and zebrafish nor are the sites of higher vertebrate conservation.
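The dot-plot idea described in this abstract can be sketched in a few lines. The example below is a toy illustration only: it uses exact word matching rather than the BLAST alignment the authors rely on, and the sequences, the word size, and the function names `kmer_hits` and `uncovered_runs` are all hypothetical. It compares a sequence with a copy of itself from which an invented "TE" has been removed, and reports the stretch of the original left uncovered by any hit, which corresponds to the gap along the diagonal.

```python
def kmer_hits(seq_a, seq_b, word=6):
    """Return (i, j) pairs where seq_a[i:i+word] == seq_b[j:j+word]."""
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)
    return [
        (i, j)
        for i in range(len(seq_a) - word + 1)
        for j in index.get(seq_a[i:i + word], [])
    ]


def uncovered_runs(seq_a, hits, word=6):
    """Return half-open (start, end) runs of seq_a not covered by any hit."""
    covered = [False] * len(seq_a)
    for i, _ in hits:
        for k in range(i, i + word):
            covered[k] = True
    runs, start = [], None
    for pos, flag in enumerate(covered):
        if not flag and start is None:
            start = pos
        elif flag and start is not None:
            runs.append((start, pos))
            start = None
    if start is not None:
        runs.append((start, len(seq_a)))
    return runs


# Invented toy sequences (assumption): two flanks around a made-up insert.
flank1 = "ACGTTGCAAGCT"
te = "GGGGCCCCGGGG"
flank2 = "TTAGACCATGAC"
original = flank1 + te + flank2
stripped = flank1 + flank2  # same sequence with the insert removed

hits = kmer_hits(original, stripped)
gaps = uncovered_runs(original, hits)
offsets = sorted({i - j for i, j in hits})
print(gaps)     # run of the original with no match in the stripped copy
print(offsets)  # diagonal offsets before and after the insert
```

Here the uncovered run coincides with the position of the removed insert, and the diagonal shifts by the insert length after it, which is the pattern the abstract describes for sites of TE insertion.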
3
The Identification and Characterization of Endopolygalacturonases in a South African Isolate of Phytophthora cinnamomi. Microorganisms 2022; 10:microorganisms10051061. [PMID: 35630501 PMCID: PMC9146145 DOI: 10.3390/microorganisms10051061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2022] [Revised: 05/16/2022] [Accepted: 05/17/2022] [Indexed: 02/01/2023] Open
Abstract
Phytophthora cinnamomi is an economically important plant pathogen that has caused devastating losses to the avocado industry worldwide. To facilitate penetration and successful colonization of the host plant, pathogens have been reported to secrete polygalacturonases (PGs). Although a large PG gene family has been reported in P. cinnamomi, in-depth bioinformatics analyses and characterization of these genes is still lacking. In this study we used bioinformatics tools and molecular biology techniques to identify and characterize endopolygalacturonases in the genome of a South African P. cinnamomi isolate, GKB4. We identified 37 PGs, with 19 characteristics of full-length PGs. Although eight PcPGs were induced in planta during infection, only three showed significant up- and down-regulation when compared with in vitro mycelial growth, suggesting their possible roles in infection. The phylogenetic analysis of PcPGs showed both gain and loss of introns in the evolution of PGs in P. cinnamomi. Furthermore, 17 PGs were related to characterized PGs from oomycete species, providing insight on possible function. This study provides new data on endoPGs in P. cinnamomi and the evolution of introns in PcPG genes. We also provide a baseline for future functional characterization of PGs suspected to contribute to P. cinnamomi pathogenicity/virulence in avocado.
4
Lim CS, Weinstein BN, Roy SW, Brown CM. Analysis of fungal genomes reveals commonalities of intron gain or loss and functions in intron-poor species. Mol Biol Evol 2021; 38:4166-4186. [PMID: 33772558 PMCID: PMC8476143 DOI: 10.1093/molbev/msab094] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022] Open
Abstract
Previous evolutionary reconstructions have concluded that early eukaryotic ancestors including both the last common ancestor of eukaryotes and of all fungi had intron-rich genomes. By contrast, some extant eukaryotes have few introns, underscoring the complex histories of intron–exon structures, and raising the question as to why these few introns are retained. Here, we have used recently available fungal genomes to address a variety of questions related to intron evolution. Evolutionary reconstruction of intron presence and absence using 263 diverse fungal species supports the idea that massive intron reduction through intron loss has occurred in multiple clades. The intron densities estimated in various fungal ancestors differ from zero to 7.6 introns per 1 kb of protein-coding sequence. Massive intron loss has occurred not only in microsporidian parasites and saccharomycetous yeasts, but also in diverse smuts and allies. To investigate the roles of the remaining introns in highly-reduced species, we have searched for their special characteristics in eight intron-poor fungi. Notably, the introns of ribosome-associated genes RPL7 and NOG2 have conserved positions; both intron-containing genes encoding snoRNAs. Furthermore, both the proteins and snoRNAs are involved in ribosome biogenesis, suggesting that the expression of the protein-coding genes and noncoding snoRNAs may be functionally coordinated. Indeed, these introns are also conserved in three-quarters of fungi species. Our study shows that fungal introns have a complex evolutionary history and underappreciated roles in gene expression.
Affiliation(s)
- Chun Shen Lim
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
- Brooke N Weinstein
  - Quantitative & Systems Biology, School of Natural Sciences, University of California-Merced, Merced, CA, USA
  - Department of Biology, San Francisco State University, San Francisco, CA, USA
- Scott W Roy
  - Quantitative & Systems Biology, School of Natural Sciences, University of California-Merced, Merced, CA, USA
  - Department of Biology, San Francisco State University, San Francisco, CA, USA
- Chris M Brown
  - Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
5
Wu B, Macielog AI, Hao W. Origin and Spread of Spliceosomal Introns: Insights from the Fungal Clade Zymoseptoria. Genome Biol Evol 2018; 9:2658-2667. [PMID: 29048531 PMCID: PMC5647799 DOI: 10.1093/gbe/evx211] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Accepted: 09/29/2017] [Indexed: 12/16/2022] Open
Abstract
Spliceosomal introns are a key feature of eukaryote genome architecture and have been proposed to originate from selfish group II introns from an endosymbiotic bacterium, that is, the ancestor of mitochondria. However, the mechanisms underlying the wide spread of spliceosomal introns across eukaryotic genomes have been obscure. In this study, we characterize the dynamic evolution of spliceosomal introns in the fungal genus Zymoseptoria at different evolutionary scales, that is, within a genome, among conspecific strains within species, and between different species. Within the genome, spliceosomal introns can proliferate in unrelated genes and intergenic regions. Among conspecific strains, spliceosomal introns undergo rapid turnover (gains and losses) and frequent sequence exchange between geographically distinct strains. Furthermore, spliceosomal introns could undergo introgression between distinct species, which can further promote intron invasion and proliferation. The dynamic invasion and proliferation processes of spliceosomal introns resemble the life cycles of mobile selfish (group I/II) introns, and these intron movements, at least in part, account for the dramatic processes of intron gain and intron loss during eukaryotic evolution.
Affiliation(s)
- Baojun Wu
  - Department of Biology, Clark University, Worcester, MA, USA
- Weilong Hao
  - Department of Biological Sciences, Wayne State University
6
Han J, Zhang L, Wang P, Yang G, Wang S, Li Y, Pan K. Heterogeneity of intron presence/absence in Olifantiella sp. (Bacillariophyta) contributes to the understanding of intron loss. Journal of Phycology 2018; 54:105-113. [PMID: 29120060 DOI: 10.1111/jpy.12605] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/25/2017] [Accepted: 11/01/2017] [Indexed: 06/07/2023]
Abstract
Although hypotheses have been proposed and developed to interpret the origins and functions of introns, substantial controversies remain about the mechanism of intron evolution. The availability of introns in the intermediate state is quite helpful for resolving this debate. In this study, a new strain of diatom (denominated as DB21-1) was isolated and identified as Olifantiella sp., which possesses multiple types of 18S rDNAs (obtained from genomic DNA; lengths ranged from 2,056 bp to 2,988 bp). Based on alignments between 18S rDNAs and 18S rRNA (obtained from cDNA; 1,783 bp), seven intron insertion sites (IISs) located in the 18S rDNA were identified, each of which displayed the polymorphism of intron presence/absence. Specific primers around each IIS were designed to amplify the introns and the results indicated that introns in the same IIS varied in lengths, while terminal sequences were conserved. Our study showed that the process of intron loss happens via a series of successive steps, and each step could derive corresponding introns under intermediate states. Moreover, these results indicate that the mechanism of genomic deletion that occurs at DNA level can also lead to exact intron loss.
Affiliation(s)
- Jichang Han
  - Laboratory of Applied Microalgae Biology, Ocean University of China, Qingdao, 266003, China
- Lin Zhang
  - College of Marine, Ningbo University, Ningbo, 315211, China
- Pu Wang
  - Department of Ecology, Evolution and Behavior, University of Minnesota, St. Paul, Minnesota, 55018, USA
- Guanpin Yang
  - College of Marine Life Sciences, Ocean University of China, Qingdao, 266003, China
- Song Wang
  - Laboratory of Applied Microalgae Biology, Ocean University of China, Qingdao, 266003, China
- Yuhang Li
  - Department of Marine Organism Taxonomy and Phylogeny, Institute of Oceanology, Chinese Academy of Sciences, Qingdao, 266071, China
- Kehou Pan
  - Laboratory of Applied Microalgae Biology, Ocean University of China, Qingdao, 266003, China
  - Function Laboratory for Marine Fisheries Science and Food Production Processes, Qingdao National Laboratory for Marine Science and Technology, Qingdao, 266003, China
7
Bhere KV, Haney RA, Ayoub NA, Garb JE. Gene structure, regulatory control, and evolution of black widow venom latrotoxins. FEBS Lett 2014; 588:3891-7. [PMID: 25217831 DOI: 10.1016/j.febslet.2014.08.034] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 07/22/2014] [Revised: 08/28/2014] [Accepted: 08/29/2014] [Indexed: 01/21/2023]
Abstract
Black widow venom contains α-latrotoxin, infamous for causing intense pain. Combining 33 kb of Latrodectus hesperus genomic DNA with RNA-Seq, we characterized the α-latrotoxin gene and discovered a paralog, 4.5 kb downstream. Both paralogs exhibit venom gland specific transcription, and may be regulated post-transcriptionally via musashi-like proteins. A 4 kb intron interrupts the α-latrotoxin coding sequence, while a 10 kb intron in the 3' UTR of the paralog may cause non-sense-mediated decay. Phylogenetic analysis confirms these divergent latrotoxins diversified through recent tandem gene duplications. Thus, latrotoxin genes have more complex structures, regulatory controls, and sequence diversity than previously proposed.
Affiliation(s)
- Kanaka Varun Bhere
  - Department of Biological Sciences, University of Massachusetts Lowell, MA, USA
- Robert A Haney
  - Department of Biological Sciences, University of Massachusetts Lowell, MA, USA
- Nadia A Ayoub
  - Department of Biology, Washington and Lee University, Lexington, VA, USA
- Jessica E Garb
  - Department of Biological Sciences, University of Massachusetts Lowell, MA, USA
8
Buchmann JP, Löytynoja A, Wicker T, Schulman AH. Analysis of CACTA transposases reveals intron loss as major factor influencing their exon/intron structure in monocotyledonous and eudicotyledonous hosts. Mob DNA 2014; 5:24. [PMID: 25206928 PMCID: PMC4158355 DOI: 10.1186/1759-8753-5-24] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/19/2014] [Accepted: 08/18/2014] [Indexed: 01/20/2023] Open
Abstract
Background CACTA elements are DNA transposons and are found in numerous organisms. Despite their low activity, several thousand copies can be identified in many genomes. CACTA elements transpose using a ‘cut-and-paste’ mechanism, which is facilitated by a DDE transposase. DDE transposases from CACTA elements contain, despite their conserved function, different exon numbers among various CACTA families. While earlier studies analyzed the ancestral history of the DDE transposases, no studies have examined exon loss and gain with a view of mechanisms that could drive the changes. Results We analyzed 64 transposases from different CACTA families among monocotyledonous and eudicotyledonous host species. The annotation of the exon/intron boundaries showed a range from one to six exons. A robust multiple sequence alignment of the 64 transposases based on their protein sequences was created and used for phylogenetic analysis, which revealed eight different clades. We observed that the exon numbers in CACTA transposases are not specific for a host genome. We found that ancient CACTA lineages diverged before the divergence of monocotyledons and eudicotyledons. Most exon/intron boundaries were found in three distinct regions among all the transposases, grouping 63 conserved intron/exon boundaries. Conclusions We propose a model for the ancestral CACTA transposase gene, which consists of four exons, that predates the divergence of the monocotyledons and eudicotyledons. Based on this model, we propose pathways of intron loss or gain to explain the observed variation in exon numbers. While intron loss appears to have prevailed, a putative case of intron gain was nevertheless observed.
Affiliation(s)
- Jan P Buchmann
  - Institute of Biotechnology, Viikki Biocenter, University of Helsinki, PO Box 65, FIN-00014 Helsinki, Finland
  - Present address: Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Center, University of Sydney, Sydney NSW 2006, Australia
- Ari Löytynoja
  - Institute of Biotechnology, Viikki Biocenter, University of Helsinki, PO Box 65, FIN-00014 Helsinki, Finland
- Thomas Wicker
  - Institute of Plant Biology, University of Zurich, Zollikerstrasse 107, Zurich, Switzerland
- Alan H Schulman
  - Institute of Biotechnology, Viikki Biocenter, University of Helsinki, PO Box 65, FIN-00014 Helsinki, Finland
  - Biotechnology and Food Research, MTT Agrifood Research Finland, Myllytie 1, FIN-31600 Jokioinen, Finland
9
Behura SK, Severson DW. Association of microsatellite pairs with segmental duplications in insect genomes. BMC Genomics 2013; 14:907. [PMID: 24359442 PMCID: PMC3878106 DOI: 10.1186/1471-2164-14-907] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/17/2013] [Accepted: 12/16/2013] [Indexed: 11/30/2022] Open
Abstract
Background Segmental duplications (SDs), also known as low-copy repeats, are DNA sequences of length greater than 1 kb which are duplicated with a high degree of sequence identity (greater than 90%) causing instability in genomes. SDs are generally found in the genome as mosaic forms of duplicated sequences which are generated by a two-step process: first, multiple duplicated sequences are aggregated at specific genomic regions, and then, these primary duplications undergo multiple secondary duplications. However, the mechanism of how duplicated sequences are aggregated in the first place is not well understood. Results By analyzing the distribution of microsatellite sequences among twenty insect species in a genome-wide manner it was found that pairs of microsatellites along with the intervening sequences were duplicated multiple times in each genome. They were found as low copy repeats or segmental duplications when the duplicated loci were greater than 1 kb in length and had greater than 90% sequence similarity. By performing a sliding-window genomic analysis for number of paired microsatellites and number of segmental duplications, it was observed that regions rich in repetitive paired microsatellites tend to get richer in segmental duplication suggesting a “rich-gets-richer” mode of aggregation of the duplicated loci in specific regions of the genome. Results further show that the relationship between number of paired microsatellites and segmental duplications among the species is independent of the known phylogeny suggesting that association of microsatellites with segmental duplications may be a species-specific evolutionary process. It was also observed that the repetitive microsatellite pairs are associated with gene duplications but those sequences are rarely retained in the orthologous genes between species. 
Although some of the duplicated sequences with microsatellites as termini were found within transposable elements (TEs) of Drosophila, most of the duplications are found in the TE-free and gene-free regions of the genome. Conclusion The study clearly suggests that microsatellites are instrumental in extensive sequence duplications that may contribute to species-specific evolution of genome plasticity in insects.
Affiliation(s)
- Susanta K Behura
  - Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
10
Milanowski R, Karnkowska A, Ishikawa T, Zakryś B. Distribution of conventional and nonconventional introns in tubulin (α and β) genes of euglenids. Mol Biol Evol 2013; 31:584-93. [PMID: 24296662 PMCID: PMC3935182 DOI: 10.1093/molbev/mst227] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 01/07/2023] Open
Abstract
The nuclear genomes of euglenids contain three types of introns: conventional spliceosomal introns, nonconventional introns for which a splicing mechanism is unknown (variable noncanonical borders, RNA secondary structure bringing together intron ends), and so-called intermediate introns, which combine features of conventional and nonconventional introns. Analysis of two genes, tubA and tubB, from 20 species of euglenids reveals contrasting distribution patterns of conventional and nonconventional introns--positions of conventional introns are conserved, whereas those of the nonconventional ones are unique to individual species or small groups of closely related taxa. Moreover, in the group of phototrophic euglenids, 11 events of conventional intron loss versus 15 events of nonconventional intron gain were identified. A comparison of all nonconventional intron sequences highlighted the most conserved elements in their sequence and secondary structure. Our results led us to put forward two hypotheses. 1) The first one posits that mutational changes in intron sequence could lead to a change in their excision mechanism--intermediate introns would then be a transitional form between the conventional and nonconventional introns. 2) The second hypothesis concerns the origin of nonconventional introns--because of the presence of inverted repeats near their ends, insertion of MITE-like transposon elements is proposed as a possible source of new introns.
Affiliation(s)
- Rafał Milanowski
  - Department of Plant Systematics and Geography, Institute of Botany, Faculty of Biology, University of Warsaw, Warsaw, Poland
11
Hong JS, Ryu KH, Kwon SJ, Kim JW, Kim KS, Park KC. Phylogenetics and Gene Structure Dynamics of Polygalacturonase Genes in Aspergillus and Neurospora crassa. The Plant Pathology Journal 2013; 29:234-241. [PMID: 25288950 PMCID: PMC4174808 DOI: 10.5423/ppj.oa.10.2012.0157] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/28/2012] [Revised: 02/22/2013] [Accepted: 03/20/2013] [Indexed: 06/03/2023]
Abstract
Polygalacturonase (PG) gene is a typical gene family present in eukaryotes. Forty-nine PGs were mined from the genomes of Neurospora crassa and five Aspergillus species. The PGs were classified into 3 clades such as clade 1 for rhamno-PGs, clade 2 for exo-PGs and clade 3 for exo- and endo-PGs, which were further grouped into 13 sub-clades based on the polypeptide sequence similarity. In gene structure analysis, a total of 124 introns were present in 44 genes and five genes lacked introns to give an average of 2.5 introns per gene. Intron phase distribution was 64.5% for phase 0, 21.8% for phase 1, and 13.7% for phase 2, respectively. The introns varied in their sequences and their lengths ranged from 20 bp to 424 bp with an average of 65.9 bp, which is approximately half the size of introns in other fungal genes. There were 29 homologous intron blocks and 26 of those were sub-clade specific. Intron losses were counted in 18 introns in which no obvious phase preference for intron loss was observed. Eighteen introns were placed at novel positions, which is considerably higher than those of plant PGs. In an evolutionary sense both intron loss and gain must have taken place for shaping the current PGs in these fungi. Together with the small intron size, low conservation of homologous intron blocks and higher number of novel introns, PGs of fungal species seem to have recently undergone highly dynamic evolution.
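The summary figures in this abstract can be checked with a short worked example. The sketch below assumes the standard definition of intron phase (the number of coding nucleotides preceding the intron, modulo 3). The per-phase counts of 80, 27 and 17 are not stated in the abstract; they are back-calculated here from the reported percentages and the total of 124 introns, and the function names are hypothetical.

```python
from collections import Counter


def intron_phase(coding_bases_before):
    """Phase of an intron: coding nucleotides upstream of it, modulo 3."""
    return coding_bases_before % 3


def phase_percentages(phases):
    """Percentage of introns in each phase, rounded to one decimal place."""
    counts = Counter(phases)
    total = len(phases)
    return {p: round(100 * counts[p] / total, 1) for p in (0, 1, 2)}


# Assumed counts, inferred from the reported 64.5% / 21.8% / 13.7% of 124 introns.
phases = [0] * 80 + [1] * 27 + [2] * 17

# 44 intron-containing genes plus 5 intronless genes, as reported.
total_genes = 44 + 5
mean_introns_per_gene = round(len(phases) / total_genes, 1)

print(phase_percentages(phases))
print(mean_introns_per_gene)
```

Under these assumed counts the percentages and the average of about 2.5 introns per gene agree with the values given in the abstract, which indicates the average is taken over all 49 genes rather than only the 44 that contain introns.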
Affiliation(s)
- Jin-Sung Hong
  - Department of Horticultural, Biotechnology and Landscape Architecture, Seoul Women’s University, Seoul 139-774, Korea
  - Department of Applied Biology, College of Agriculture and Life sciences, Kangwon National University, Chunchon 200-701, Korea
- Ki-Hyun Ryu
  - Department of Horticultural, Biotechnology and Landscape Architecture, Seoul Women’s University, Seoul 139-774, Korea
- Soon-Jae Kwon
  - US Department of Agriculture-Agricultural Research Service, Western Regional Plant Introduction Station, 59 Johnson Hall, Washington State University, Pullman WA 99164, USA
- Jin-Won Kim
  - Department of Environment Horticulture, University of Seoul, Seoul 130-743, Korea
- Kwang-Soo Kim
  - Bioenergy Crop Research Center, National Institute of Crop Science, Rural Development Administration, Muan 534-833, Korea
- Kyong-Cheul Park
  - Institute of Biosciences and Biotechnology, Kangwon National University, Chunchon 200-701, Korea
12
Behura SK, Severson DW. Overlapping genes of Aedes aegypti: evolutionary implications from comparison with orthologs of Anopheles gambiae and other insects. BMC Evol Biol 2013; 13:124. [PMID: 23777277 PMCID: PMC3689595 DOI: 10.1186/1471-2148-13-124] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/18/2012] [Accepted: 06/12/2013] [Indexed: 11/11/2022] Open
Abstract
Background Although gene overlapping is a common feature of prokaryote and mitochondria genomes, such genes have also been identified in many eukaryotes. The overlapping genes in eukaryotes are extensively rearranged even between closely related species. In this study, we investigated retention and rearrangement of positionally overlapping genes between the mosquitoes Aedes aegypti (dengue virus vector) and Anopheles gambiae (malaria vector). The overlapping gene pairs of A. aegypti were further compared with orthologs of other selected insects to conduct several hypothesis driven investigations relating to the evolution and rearrangement of overlapping genes. Results The results show that as much as ~10% of the predicted genes of A. aegypti and A. gambiae are localized in positional overlapping manner. Furthermore, the study shows that differential abundance of introns and simple sequence repeats have significant association with positional rearrangement of overlapping genes between the two species. Gene expression analysis further suggests that antisense transcripts generated from the oppositely oriented overlapping genes are differentially regulated and may have important regulatory functions in these mosquitoes. Our data further shows that synonymous and non-synonymous mutations have differential but non-significant effect on overlapping localization of orthologous genes in other insect genomes. Conclusion Gene overlapping in insects may be a species-specific evolutionary process as evident from non-dependency of gene overlapping with species phylogeny. Based on the results, our study suggests that overlapping genes may have played an important role in genome evolution of insects.
Affiliation(s)
- Susanta K Behura
  - Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
13
Torriani SFF, Stukenbrock EH, Brunner PC, McDonald BA, Croll D. Evidence for extensive recent intron transposition in closely related fungi. Curr Biol 2011; 21:2017-22. [PMID: 22100062 DOI: 10.1016/j.cub.2011.10.041] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Received: 09/26/2011] [Revised: 10/26/2011] [Accepted: 10/26/2011] [Indexed: 11/30/2022]
Abstract
Though spliceosomal introns are a major structural component of most eukaryotic genes and intron density varies by more than three orders of magnitude among eukaryotes [1-3], the origins of introns are poorly understood, and only a few cases of unambiguous intron gain are known [4-8]. We utilized population genomic comparisons of three closely related fungi to identify crucial transitory phases of intron gain and loss. We found 74 intron positions showing intraspecific presence-absence polymorphisms (PAPs) for the entire intron. Population genetic analyses identified intron PAPs at different stages of fixation and showed that intron gain or loss was very recent. We found direct support for extensive intron transposition among unrelated genes. A substantial proportion of highly similar introns in the genome either were recently gained or showed a transient phase of intron PAP. We also identified an intron transfer among paralogous genes that created a new intron. Intron loss was due mainly to homologous recombination involving reverse-transcribed mRNA. The large number of intron positions in transient phases of either intron gain or loss shows that intron evolution is much faster than previously thought and provides an excellent model to study molecular mechanisms of intron gain.
Affiliation(s)
- Stefano F F Torriani
  - Institute of Integrative Biology, Swiss Federal Institute of Technology (ETH Zurich), 8092 Zurich, Switzerland
14
Fawcett JA, Rouzé P, Van de Peer Y. Higher intron loss rate in Arabidopsis thaliana than A. lyrata is consistent with stronger selection for a smaller genome. Mol Biol Evol 2011; 29:849-59. [PMID: 21998273 DOI: 10.1093/molbev/msr254] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023] Open
Abstract
The number of introns varies considerably among different organisms. This can be explained by the differences in the rates of intron gain and loss. Two factors that are likely to influence these rates are selection for or against introns and the mutation rate that generates the novel intron or the intronless copy. Although it has been speculated that stronger selection for a compact genome might result in a higher rate of intron loss and a lower rate of intron gain, clear evidence is lacking, and the role of selection in determining these rates has not been established. Here, we studied the gain and loss of introns in the two closely related species Arabidopsis thaliana and A. lyrata as it was recently shown that A. thaliana has been undergoing a faster genome reduction driven by selection. We found that A. thaliana has lost six times more introns than A. lyrata since the divergence of the two species but gained very few introns. We suggest that stronger selection for genome reduction probably resulted in the much higher intron loss rate in A. thaliana, although further analysis is required as we could not find evidence that the loss rate increased in A. thaliana as opposed to having decreased in A. lyrata compared with the rate in the common ancestor. We also examined the pattern of the intron gains and losses to better understand the mechanisms by which they occur. Microsimilarity was detected between the splice sites of several gained and lost introns, suggesting that nonhomologous end joining repair of double-strand breaks might be a common pathway not only for intron gain but also for intron loss.
|
15
|
Kumar A, Bhandari A, Sinha R, Goyal P, Grapputo A. Spliceosomal intron insertions in genome compacted ray-finned fishes as evident from phylogeny of MC receptors, also supported by a few other GPCRs. PLoS One 2011; 6:e22046. [PMID: 21850219 PMCID: PMC3151243 DOI: 10.1371/journal.pone.0022046] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 12/14/2010] [Accepted: 06/16/2011] [Indexed: 11/18/2022]
Abstract
BACKGROUND Insertions of spliceosomal introns are very rare events during evolution of vertebrates and the mechanisms governing creation of novel intron(s) remain obscure. Largely, gene structures of melanocortin (MC) receptors are characterized by intron-less architecture. However, recently a few exceptions have been reported in some fishes. This warrants a systematic survey of MC receptors for understanding intron insertion events during vertebrate evolution. METHODOLOGY/PRINCIPAL FINDINGS We have compiled an extended list of MC receptors from different vertebrate genomes with variations in fishes. Notably, the closely linked MC2Rs and MC5Rs from a group of ray-finned fishes have three and one intron insertion(s), respectively, with conserved positions and intron phase. In both genes, one novel insertion was in the highly conserved DRY motif at the end of helix TM3. Further, the proto-splice site MAG↑R is maintained at intron insertion sites in these two genes. However, the orthologs of these receptors from zebrafish and tetrapods are intron-less, suggesting these introns are simultaneously created in selected fishes. Surprisingly, these novel introns are traceable only in four fish genomes. We found that these fish genomes are severely compacted after the separation from zebrafish. Furthermore, we also report novel intron insertions in P2Y receptors and in CHRM3. Finally, we report ultrasmall introns in MC2R genes from selected fishes. CONCLUSIONS/SIGNIFICANCE The current repository of MC receptors illustrates that fishes have no MC3R ortholog. MC2R, MC5R, P2Y receptors and CHRM3 have novel intron insertions only in ray-finned fishes that underwent genome compaction. These receptors share one intron at an identical position suggestive of being inserted contemporaneously. 
In addition to repetitive elements, genome compaction is now believed to be a new hallmark that promotes intron insertions, as it requires rapid DNA breakage and subsequent repair processes to gain back normal functionality.
Affiliation(s)
- Abhishek Kumar
- Department of Biology, University of Padua, Padova, Italy.
|
16
|
Cohen NE, Shen R, Carmel L. The role of reverse transcriptase in intron gain and loss mechanisms. Mol Biol Evol 2011; 29:179-86. [PMID: 21804076 DOI: 10.1093/molbev/msr192] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 01/02/2023]
Abstract
Intron density is highly variable across eukaryotic species. It seems that different lineages have experienced considerably different levels of intron gain and loss events, but the reasons for this are not well known. A large number of mechanisms for intron loss and gain have been suggested, and most of them have at least some level of indirect support. We therefore figured out that the variability in intron density can be a reflection of the fact that different mechanisms are active in different lineages. Quite a number of these putative mechanisms, both for intron loss and for intron gain, postulate that the enzyme reverse transcriptase (RT) has a key role in the process. In this paper, we lay out three predictions whose approval or falsification gives indication for the involvement of RT in intron gain and loss processes. Testing these predictions requires data on the intron gain and loss rates of individual genes along different branches of the eukaryotic phylogenetic tree. So far, such rates could not be computed, and hence, these predictions could not be rigorously evaluated. Here, we use a maximum likelihood algorithm that we have devised in the past, Evolutionary Reconstruction by Expectation Maximization, which allows the estimation of such rates. Using this algorithm, we computed the intron loss and gain rates of more than 300 genes in each branch of the phylogenetic tree of 19 eukaryotic species. Based on that we found only little support for RT activity in intron gain. In contrast, we suggest that RT-mediated intron loss is a mechanism that is very efficient in removing introns, and thus, its levels of activity may be a major determinant of intron number. Moreover, we found that intron gain and loss rates are negatively correlated in intron-poor species but are positively correlated for intron-rich species. 
One explanation to this is that intron gain and loss mechanisms in intron-rich species (like metazoans) share a common mechanistic component, albeit not a RT.
Affiliation(s)
- Noa E Cohen
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Jerusalem, Israel
|
17
|
Meng Q, Chen K, Ma L, Hu S, Yu J. A systematic identification of Kolobok superfamily transposons in Trichomonas vaginalis and sequence analysis on related transposases. J Genet Genomics 2011; 38:63-70. [PMID: 21356525 DOI: 10.1016/j.jcg.2011.01.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/17/2010] [Revised: 12/02/2010] [Accepted: 12/03/2010] [Indexed: 02/03/2023]
Abstract
Transposons are sequence elements widely distributed among genomes of all three kingdoms of life, providing genomic changes and playing significant roles in genome evolution. Trichomonas vaginalis is an excellent model system for transposon study since its genome (~160 Mb) has been sequenced and is composed of ~65% transposons and other repetitive elements. In this study, we primarily report the identification of Kolobok-type transposons (termed tvBac) in T. vaginalis and the results of transposase sequence analysis. We categorized 24 novel subfamilies of the Kolobok element, including one autonomous subfamily and 23 non-autonomous subfamilies. We also identified a novel H2CH motif in tvBac transposases based on multiple sequence alignment. In addition, we supposed that tvBac and Mutator transposons may have evolved independently from a common ancestor according to our phylogenetic analysis. Our results provide basic information for the understanding of the function and evolution of tvBac transposons in particular and other related transposon families in general.
Affiliation(s)
- Qingshu Meng
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, China
|
18
|
Yang Z, Huang J. De novo origin of new genes with introns in Plasmodium vivax. FEBS Lett 2011; 585:641-4. [PMID: 21241695 DOI: 10.1016/j.febslet.2011.01.017] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 10/19/2010] [Revised: 01/08/2011] [Accepted: 01/11/2011] [Indexed: 11/26/2022]
Abstract
The origin of new genes is critical for organisms adapting to new niches. Here, we present evidence for a recent de novo origin of at least 13 protein-coding genes in the genome of Plasmodium vivax. Although recently de novo originated genes have often been suggested to be initially intronless, five of the genes identified in our analysis contain introns in their coding regions. Further investigations revealed that these introns likely evolved from previously intergenic regions together with the coding sequences. We discuss the potential mechanisms for intron formation in these genes and propose that intronization be considered in the formation of de novo originated genes.
Affiliation(s)
- Zefeng Yang
- Department of Biology, East Carolina University, Greenville, NC 27858, USA
|
19
|
Intron loss mediated structural dynamics and functional differentiation of the polygalacturonase gene family in land plants. Genes Genomics 2010. [DOI: 10.1007/s13258-010-0076-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 10/18/2022]
|
20
|
Zhang LY, Yang YF, Niu DK. Evaluation of models of the mechanisms underlying intron loss and gain in Aspergillus fungi. J Mol Evol 2010; 71:364-73. [PMID: 20862581 DOI: 10.1007/s00239-010-9391-6] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 07/15/2010] [Accepted: 09/08/2010] [Indexed: 11/26/2022]
Abstract
Although intron loss and gain have been widely observed, their mechanisms are still to be determined. In four Aspergillus genomes, we found 204 cases of intron loss and 84 cases of intron gain. Using this data, we tested common hypotheses of intron loss or gain. Statistical analysis showed that adjacent introns tend to be lost simultaneously and small introns were preferentially lost, supporting the model of mRNA-mediated intron loss. The lost introns reside in internal regions of genes, which is inconsistent with the traditional version of the model (partial length cDNAs are reverse transcribed from 3' ends of mRNAs), but consistent with an alternate version (partial length cDNAs are produced by self-primed reverse transcription). The latter version was not supported by examination of the abundance of T-rich segments in mRNAs. Preferential loss of internal introns might be explained by highly efficient recombination at internal regions of genes. Among the 84 cases of intron gain, we found a significantly higher frequency of short direct repeats near exon-intron boundary than in conserved introns, supporting the double-strand break repair model. We also found possible source sequences for two cases of intron gain, one by gene conversion and one by insertion of a mitochondrial sequence during double-strand break repair. Source sequences for most gained introns could not be identified and the possible reasons were discussed. In the four Aspergillus genomes studied, we did not find evidence of frequent parallel intron gains.
Affiliation(s)
- Lei-Ying Zhang
- MOE Key Laboratory for Biodiversity Sciences and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
|
21
|
Nielsen MG, Gadagkar SR, Gutzwiller L. Tubulin evolution in insects: gene duplication and subfunctionalization provide specialized isoforms in a functionally constrained gene family. BMC Evol Biol 2010; 10:113. [PMID: 20423510 PMCID: PMC2880298 DOI: 10.1186/1471-2148-10-113] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 08/11/2009] [Accepted: 04/27/2010] [Indexed: 11/26/2022]
Abstract
Background The completion of 19 insect genome sequencing projects spanning six insect orders provides the opportunity to investigate the evolution of important gene families, here tubulins. Tubulins are a family of eukaryotic structural genes that form microtubules, fundamental components of the cytoskeleton that mediate cell division, shape, motility, and intracellular trafficking. Previous in vivo studies in Drosophila find a stringent relationship between tubulin structure and function; small, biochemically similar changes in the major alpha 1 or testis-specific beta 2 tubulin protein render each unable to generate a motile spermtail axoneme. This has evolutionary implications, not a single non-synonymous substitution is found in beta 2 among 17 species of Drosophila and Hirtodrosophila flies spanning 60 Myr of evolution. This raises an important question, How do tubulins evolve while maintaining their function? To answer, we use molecular evolutionary analyses to characterize the evolution of insect tubulins. Results Sixty-six alpha tubulins and eighty-six beta tubulin gene copies were retrieved and subjected to molecular evolutionary analyses. Four ancient clades of alpha and beta tubulins are found in insects, a major isoform clade (alpha 1, beta 1) and three minor, tissue-specific clades (alpha 2-4, beta 2-4). Based on a Homarus americanus (lobster) outgroup, these were generated through gene duplication events on major beta and alpha tubulin ancestors, followed by subfunctionalization in expression domain. Strong purifying selection acts on all tubulins, yet maximum pairwise amino acid distances between tubulin paralogs are large (0.464 substitutions/site beta tubulins, 0.707 alpha tubulins). 
Conversely, orthologs, with the exception of reproductive tissue isoforms, show little sequence variation except in the last 15 carboxy terminus tail (CTT) residues, which serve as sites for post-translational modifications (PTMs) and interactions with microtubule-associated proteins. CTT residues overwhelmingly comprise the co-evolving residues between Drosophila alpha 2 and beta 3 tubulin proteins, indicating CTT specializations can be mediated at the level of the tubulin dimer. Gene duplications post-dating separation of the insect orders are unevenly distributed, most often appearing in major alpha 1 and minor beta 2 clades. More than 40 introns are found in tubulins. Their distribution among tubulins reveals that insertion and deletion events are common, surprising given their potential for disrupting tubulin coding sequence. Compensatory evolution is found in Drosophila beta 2 tubulin cis-regulation, and reveals selective pressures acting to maintain testis expression without the use of previously identified testis cis-regulatory elements. Conclusion Tubulins have stringent structure/function relationships, indicated by strong purifying selection, the loss of many gene duplication products, alpha-beta co-evolution in the tubulin dimer, and compensatory evolution in beta 2 tubulin cis-regulation. They evolve through gene duplication, subfunctionalization in expression domain and divergence of duplication products, largely in CTT residues that mediate interactions with other proteins. This has resulted in the tissue-specific minor insect isoforms, and in particular the highly diverse α3, α4, and β2 reproductive tissue-specific tubulin isoforms, illustrating that even a highly conserved protein family can participate in the adaptive process and respond to sexual selection.
Affiliation(s)
- Mark G Nielsen
- Department of Biology, University of Dayton, OH 45467, USA.
|
22
|
Roy SW. Intronization, de-intronization and intron sliding are rare in Cryptococcus. BMC Evol Biol 2009; 9:192. [PMID: 19664208 PMCID: PMC2740785 DOI: 10.1186/1471-2148-9-192] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 11/08/2008] [Accepted: 08/07/2009] [Indexed: 11/11/2022]
Abstract
Background Eukaryotic pre-mRNA gene transcripts are processed by the spliceosome to remove portions of the transcript, called spliceosomal introns. The spliceosome recognizes intron boundaries by the presence of sequence signals (motifs) contained in the actual transcript, thus sequence changes in the genome that affect existing splicing signals or create new signals may lead to changes in transcript splicing patterns. Such changes may lead to previously excluded (intronic) transcript regions being included (exonic) or vice versa. Such changes can affect the encoded protein sequence and/or post-transcriptional regulation, and are thus a potentially important source of genomic and phenotypic novelty. Two recent papers suggest that such changes may be a major force in remodeling of eukaryotic gene structures; however, the rate of occurrence of such changes has not been assessed at the genomic level. Results I studied four closely related species of Cryptococcus fungi. Among 28,256 studied introns, canonical GT/C...AG boundaries are nearly universally conserved across all four species. Among only 40 observed cases of cDNA-confirmed non-conserved intron boundaries, most are likely to involve alternative splicing. I find only five cases of "intronization," intron creation from an internal exonic region by de novo emergence of new splicing boundaries, and no cases of the reverse process, "de-intronization." I find no more than ten clear cases of true movement of an intron boundary of a possibly constitutively spliced intron, and no clear cases of true "intron sliding," in which changes in the positions of both intron boundaries could lead to a movement of the intron position along the coding sequence. Conclusion These results suggest that intronization, de-intronization, and intron boundary movement are rare events in evolution.
Affiliation(s)
- Scott W Roy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
23
|
Panaro MA, Cianciulli A, Calvello R, Saccia M, Sisto M, Acquafredda A, Mitolo V. An analysis of the human chemokine CXC receptor 4 gene. Immunopharmacol Immunotoxicol 2009; 31:88-93. [PMID: 18798091 DOI: 10.1080/08923970802372863] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
Abstract
In this article we analyze some of the structural characteristics of the coding section and the intron of the human chemokine CXC receptor 4 (a 7-transmembrane receptor) pre-mRNA. In the coding sequence the frequencies of the individual nucleotides do not depart significantly from 0.25, while in the intron the frequencies of the As and Gs are significantly lower and higher, respectively, than expected from a random distribution. Analysis of the pattern of association of nucleotides into triplets or couples shows that some triplets or couples occur with frequencies significantly higher or lower than expected when assuming a random association of nucleotides. In particular, in the intron, combinations of the same nucleotide are over-represented. Repeats of 7 or more nucleotides occur in both the coding section and the intron with frequencies which exceed the confidence limits for a random distribution. For the coding sequence this is possibly explained by the alternation of relatively similar hydrophobic-coding sections and relatively similar intervening intracellular and extracellular hydrophilic-coding sections. Repeats of 7 or more nucleotides in reverse order and in reverse/complemented order occur in the intron, but not in the coding section, with frequencies which significantly exceed a random distribution. The numerous intronic repeats in reverse/complemented order may be of relevance for the secondary structure of the intron and might be one important element of the integrated splicing code.
Affiliation(s)
- Maria A Panaro
- Department of Human Anatomy and Histology, University of Bari, Italy.
|
24
|
Roy SW, Irimia M. Mystery of intron gain: new data and new models. Trends Genet 2008; 25:67-73. [PMID: 19070397 DOI: 10.1016/j.tig.2008.11.004] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.9] [Received: 09/27/2008] [Revised: 11/18/2008] [Accepted: 11/18/2008] [Indexed: 11/19/2022]
Abstract
Despite their ubiquity, the mechanisms and evolutionary forces responsible for the origins of spliceosomal introns remain mysterious. Recent molecular evidence supports the idea that intronic RNAs can reverse splice into RNA transcripts, a crucial step for an influential model of intron gain. However, a paradox attends this model because the rate of intron gain is expected to be orders of magnitude lower than the rate of intron loss in general, in contrast to findings from several lineages. We suggest two possible resolutions to this paradox, based on steric considerations and on the possibility of co-option by specific introns of retroelement transposition pathways, respectively. In addition, we introduce two potential mechanisms for intron creation, based on hybrid RNA-DNA reverse splicing and on template switching errors by reverse transcriptase.
Affiliation(s)
- Scott William Roy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA.
|
25
|
Irimia M, Rukov JL, Penny D, Vinther J, Garcia-Fernandez J, Roy SW. Origin of introns by 'intronization' of exonic sequences. Trends Genet 2008; 24:378-81. [PMID: 18597887 DOI: 10.1016/j.tig.2008.05.007] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.1] [Received: 10/18/2007] [Revised: 05/19/2008] [Accepted: 05/20/2008] [Indexed: 11/24/2022]
Abstract
The mechanisms of spliceosomal intron creation have proved elusive. Here we describe a new mechanism: the recruitment of internal exonic sequences ('intronization') in Caenorhabditis species. The numbers of intronization events and introns gained by other mechanisms are similar, suggesting that intronization significantly contributes to recent intron creation in nematodes. Intronization is more common than the reverse process, loss of splicing of retained introns. Finally, these findings link alternative splicing with modern intron creation.
Affiliation(s)
- Manuel Irimia
- Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Barcelona 08028, Spain
|
26
|
Abstract
The quest for evolutionary mechanisms providing separation between the coding (exons) and noncoding (introns) parts of genomic DNA remains an important focus of genetics. This work combines an analysis of the most recent achievements of genomics and fundamental concepts of random processes to provide a novel point of view on genome evolution. Exon sizes in sequenced genomes show a lognormal distribution typical of a random Kolmogoroff fractioning process. This implies that the process of intron incretion may be independent of exon size, and therefore could be dependent on intron–exon boundaries. All genomes examined have two distinctive classes of exons, each with different evolutionary histories. In the framework proposed in this article, these two classes of exons can be derived from a hypothetical ancestral genome by (spontaneous) symmetry breaking. We note that one of these exon classes comprises mostly alternatively spliced exons.
Affiliation(s)
- Yaroslav Ryabov
- Department of Chemistry, Purdue University, 560 Oval drive, Box 202, West Lafayette, IN, 47907, USA.
|
27
|
Sela N, Mersch B, Gal-Mark N, Lev-Maor G, Hotz-Wagenblatt A, Ast G. Comparative analysis of transposed element insertion within human and mouse genomes reveals Alu's unique role in shaping the human transcriptome. Genome Biol 2008; 8:R127. [PMID: 17594509 PMCID: PMC2394776 DOI: 10.1186/gb-2007-8-6-r127] [Citation(s) in RCA: 181] [Impact Index Per Article: 11.3] [Received: 01/17/2007] [Revised: 06/07/2007] [Accepted: 06/27/2007] [Indexed: 01/31/2023]
Abstract
Analysis of transposed elements in the human and mouse genomes reveals many effects on the transcriptomes, including a higher level of exonization of Alu elements than other elements. Background Transposed elements (TEs) have a substantial impact on mammalian evolution and are involved in numerous genetic diseases. We compared the impact of TEs on the human transcriptome and the mouse transcriptome. Results We compiled a dataset of all TEs in the human and mouse genomes, identifying 3,932,058 and 3,122,416 TEs, respectively. We then extracted TEs located within human and mouse genes and, surprisingly, we found that 60% of TEs in both human and mouse are located in intronic sequences, even though introns comprise only 24% of the human genome. All TE families in both human and mouse can exonize. TE families that are shared between human and mouse exhibit the same percentage of TE exonization in the two species, but the exonization level of Alu, a primate-specific retroelement, is significantly greater than that of other TEs within the human genome, leading to a higher level of TE exonization in human than in mouse (1,824 exons compared with 506 exons, respectively). We detected a primate-specific mechanism for intron gain, in which Alu insertion into an exon creates a new intron located in the 3' untranslated region (termed 'intronization'). Finally, the insertion of TEs into the first and last exons of a gene is more frequent in human than in mouse, leading to longer exons in human. Conclusion Our findings reveal many effects of TEs on these two transcriptomes. These effects are substantially greater in human than in mouse, which is due to the presence of Alu elements in human.
Affiliation(s)
- Noa Sela
- Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Britta Mersch
- HUSAR Bioinformatics Lab, Department of Molecular Biophysics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld, D-69120 Heidelberg, Germany
- Nurit Gal-Mark
- Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Galit Lev-Maor
- Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
- Agnes Hotz-Wagenblatt
- HUSAR Bioinformatics Lab, Department of Molecular Biophysics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld, D-69120 Heidelberg, Germany
- Gil Ast
- Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
|
28
|
Zhuo D, Madden R, Elela SA, Chabot B. Modern origin of numerous alternatively spliced human introns from tandem arrays. Proc Natl Acad Sci U S A 2007; 104:882-6. [PMID: 17210920 PMCID: PMC1783408 DOI: 10.1073/pnas.0604777104] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 12/22/2022]
Abstract
Despite the widespread occurrence of spliceosomal introns in the genomes of higher eukaryotes, their origin remains controversial. One model proposes that the duplication of small genomic portions could have provided the boundaries for new introns. If this mechanism has occurred recently, the 5' and 3' boundaries of each resulting intron should display distinctive sequence similarity. Here, we report that the human genome contains an excess of introns with perfect matching sequences at boundaries. One-third of these introns interrupt the protein-coding sequences of known genes. Introns with the best-matching boundaries are invariably found in tandem arrays of direct repeats. Sequence analysis of the arrays indicates that many intron-breeding repeats have disseminated in several genes at different times during human evolution. A comparison with orthologous regions in mouse and chimpanzee suggests a young age for the human introns with the most-similar boundaries. Finally, we show that these human introns are alternatively spliced with exceptionally high frequency. Our study indicates that genomic duplication has been an important mode of intron gain in mammals. The alternative splicing of transcripts containing these intron-breeding repeats may provide the plasticity required for the rapid evolution of new human proteins.
Affiliation(s)
- Degen Zhuo
- Laboratoire de Génomique Fonctionnelle de Sherbrooke
- Sherif Abou Elela
- Laboratoire de Génomique Fonctionnelle de Sherbrooke
- Département de Microbiologie et d'Infectiologie, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, PQ, Canada J1H 5N4
- Benoit Chabot
- Laboratoire de Génomique Fonctionnelle de Sherbrooke
- Département de Microbiologie et d'Infectiologie, Faculté de Médecine et des Sciences de la Santé, Université de Sherbrooke, Sherbrooke, PQ, Canada J1H 5N4
|
29
|
Abstract
Research into the origins of introns is at a critical juncture in the resolution of theories on the evolution of early life (which came first, RNA or DNA?), the identity of LUCA (the last universal common ancestor, was it prokaryotic- or eukaryotic-like?), and the significance of noncoding nucleotide variation. One early notion was that introns would have evolved as a component of an efficient mechanism for the origin of genes. But alternative theories emerged as well. From the debate between the "introns-early" and "introns-late" theories came the proposal that introns arose before the origin of genetically encoded proteins and DNA, and the more recent "introns-first" theory, which postulates the presence of introns at that early evolutionary stage from a reconstruction of the "RNA world." Here we review seminal and recent ideas about intron origins. Recent discoveries about the patterns and causes of intron evolution make this one of the most hotly debated and exciting topics in molecular evolutionary biology today.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
|
30
|
Fridmanis D, Fredriksson R, Kapa I, Schiöth HB, Klovins J. Formation of new genes explains lower intron density in mammalian Rhodopsin G protein-coupled receptors. Mol Phylogenet Evol 2006; 43:864-80. [PMID: 17188520 DOI: 10.1016/j.ympev.2006.11.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 05/19/2006] [Revised: 10/06/2006] [Accepted: 11/02/2006] [Indexed: 10/23/2022]
Abstract
Mammalian G protein-coupled receptor (GPCR) genes are characterised by a large proportion of intronless genes or a lower density of introns when compared with GPCRs of invertebrates. It is unclear which mechanisms have influenced intron density in this protein family, which is one of the largest in the mammalian genomes. We used a combination of Hidden Markov Models (HMM) and BLAST searches to establish the comprehensive repertoire of Rhodopsin GPCRs from seven species and performed overall alignments and phylogenetic analysis using the maximum parsimony method for over 1400 receptors in 12 subgroups. We identified 14 different Ancestral Receptor Groups (ARGs) that have members in both vertebrate and invertebrate species. We found that there exists a remarkable difference in the intron density among ancestral and new Rhodopsin GPCRs. The intron density among ARGs members was more than 3.5-fold higher than that within non-ARG members and more than 2-fold higher when considering only the 7TM region. This suggests that the new GPCR genes have been predominantly formed intronless while the ancestral receptors likely accumulated introns during their evolution. Many of the intron positions found in mammalian ARG receptor sequences were found to be present in orthologue invertebrate receptors suggesting that these intron positions are ancient. This analysis also revealed that one intron position is much more frequent than any other position and it is common for a number of phylogenetically different Rhodopsin GPCR groups. This intron position lies within a functionally important, conserved, DRY motif which may form a proto-splice site that could contribute to positional intron insertion. Moreover, we have found that other receptor motifs, similar to DRY, also contain introns between the second and third nucleotide of the arginine codon which also forms a proto-splice site. Our analysis presents compelling evidence that there was not a major loss of introns in mammalian GPCRs and formation of new GPCRs among mammals explains why these have fewer introns compared to invertebrate GPCRs. We also discuss and speculate about the possible role of different RNA- and DNA-based mechanisms of intron insertion and loss.
Affiliation(s)
- Davids Fridmanis
- Biomedical Research and Study Centre, University of Latvia, Ratsupites 1, Riga, Latvia
|
31
|
Roy SW, Penny D. Large-scale intron conservation and order-of-magnitude variation in intron loss/gain rates in apicomplexan evolution. Genome Res 2006; 16:1270-5. [PMID: 16963708 PMCID: PMC1581436 DOI: 10.1101/gr.5410606] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022]
Abstract
The age of modern introns and the evolutionary forces controlling intron loss and gain remain matters of much debate. In the case of the apicomplexan malaria parasite Plasmodium falciparum, previous studies have shown that while the positions of two thirds of P. falciparum introns are not shared with surveyed non-apicomplexans (leaving open the possibility that they were relatively recently gained), 99.1% are shared with Plasmodium yoelii, which diverged from P. falciparum at least 100 Mya. We show here that 60.6% of P. falciparum intron positions in conserved regions are shared with the distantly related apicomplexan Theileria parva, whereas only 18.2% of introns in the more intron-rich T. parva are shared with P. falciparum. Comparison of 3305 pairs of orthologous genes between T. parva and Theileria annulata showed that 7089/7111 (99.7%) introns in conserved regions are shared between species. These levels of conservation imply significant differences in rates of intron loss and gain through apicomplexan history. Because transposable elements (TEs) and/or (often TE-encoded) reverse transcriptase are implicated in models of intron loss and gain, the observed low rates of intron loss and gain in recent Plasmodium and Theileria evolution are consistent with the lack of known TE in those groups. We suggest that intron loss/gain in some eukaryotic lineages may be concentrated in relatively short episodes coincident with occasional TE invasions.
Affiliation(s)
- Scott William Roy
- Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand.
|
32
|
Okamura K, Feuk L, Marquès-Bonet T, Navarro A, Scherer SW. Frequent appearance of novel protein-coding sequences by frameshift translation. Genomics 2006; 88:690-697. [PMID: 16890400 DOI: 10.1016/j.ygeno.2006.06.009] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Received: 04/19/2006] [Revised: 06/14/2006] [Accepted: 06/19/2006] [Indexed: 11/23/2022]
Abstract
Genomic duplication, followed by divergence, contributes to organismal evolution. Several mechanisms, such as exon shuffling and alternative splicing, are responsible for novel gene functions, but they generate homologous domains and do not usually lead to drastic innovation. Major novelties can potentially be introduced by frameshift mutations and this idea can explain the creation of novel proteins. Here, we employ a strategy using simulated protein sequences and identify 470 human and 108 mouse frameshift events that originate new gene segments. No obvious interspecies overlap was observed, suggesting high rates of acquisition of evolutionary events. This inference is supported by a deficiency of TpA dinucleotides in the protein-coding sequences, which decreases the occurrence of translational termination, even on the complementary strand. Increased usage of the TGA codon as the termination signal in newer genes also supports our inference. This suggests that tolerated frameshift changes are a prevalent mechanism for the rapid emergence of new genes and that protein-coding sequences can be derived from existing or ancestral exons rather than from events that result in noncoding sequences becoming exons.
Affiliation(s)
- Kohji Okamura
- The Centre for Applied Genomics, Program in Genetics and Genomic Biology, The Hospital for Sick Children, Toronto, Canada ON M5G 1L7; Department of Molecular and Medical Genetics, University of Toronto, Toronto, Canada ON M5S 1A8
- Lars Feuk
- The Centre for Applied Genomics, Program in Genetics and Genomic Biology, The Hospital for Sick Children, Toronto, Canada ON M5G 1L7; Department of Molecular and Medical Genetics, University of Toronto, Toronto, Canada ON M5S 1A8
- Tomàs Marquès-Bonet
- Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Arcadi Navarro
- Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Stephen W Scherer
- The Centre for Applied Genomics, Program in Genetics and Genomic Biology, The Hospital for Sick Children, Toronto, Canada ON M5G 1L7; Department of Molecular and Medical Genetics, University of Toronto, Toronto, Canada ON M5S 1A8.
|
33
|
Roy SW, Hartl DL. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res 2006; 16:750-6. [PMID: 16702411 PMCID: PMC1473185 DOI: 10.1101/gr.4845406] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
We compared intron positions in conserved regions of 3479 orthologous gene pairs from Plasmodium falciparum and Plasmodium yoelii, which likely diverged ≥100 million years ago (Mya). Only 27 out of 2212 positions were specific to one of the two species. Intron presence in related species shows that at least 19 and possibly 26 of the changes are due to intron loss, depending on phylogeny. The implied intron loss and gain rates are much lower than previously estimated for nematodes, arthropods, fungi, and plants, and are comparable only with the rates in vertebrates. That all observed changes were exact, occurring without loss or gain of flanking coding sequence, suggests intron loss via an mRNA intermediate, as does a nonsignificant trend toward loss of introns at adjacent positions. Many of the intron changes occurred in genes encoding proteins involved in nucleic acid-related processes, as previously found for intron gains in nematodes. Two changes occurred in the chloroquine resistance transporter, suggesting a role for positive selection in intron loss in Plasmodium. The dearth of intron loss and gain could be explained by the lack of known transposable elements in Plasmodium, since transposable elements and/or reverse transcriptase are thought to be necessary for both processes. The observed pattern suggests that the availability of stochastic intron loss and gain mutations can be a major determinant of changes in intron number.
Affiliation(s)
- Scott William Roy
- Department of Organismic and Evolutionary Biology, Harvard, Cambridge, Massachusetts 02138, USA.
|
34
|
Abstract
The origins and importance of spliceosomal introns comprise one of the longest-abiding mysteries of molecular evolution. Considerable debate remains over several aspects of the evolution of spliceosomal introns, including the timing of intron origin and proliferation, the mechanisms by which introns are lost and gained, and the forces that have shaped intron evolution. Recent important progress has been made in each of these areas. Patterns of intron-position correspondence between widely diverged eukaryotic species have provided insights into the origins of the vast differences in intron number between eukaryotic species, and studies of specific cases of intron loss and gain have led to progress in understanding the underlying molecular mechanisms and the forces that control intron evolution.
Affiliation(s)
- Scott William Roy
- Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand.
|
35
|
Rodríguez-Trelles F, Tarrío R, Ayala FJ. Models of spliceosomal intron proliferation in the face of widespread ectopic expression. Gene 2006; 366:201-8. [PMID: 16288838 DOI: 10.1016/j.gene.2005.09.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 06/24/2005] [Revised: 08/04/2005] [Accepted: 09/02/2005] [Indexed: 11/27/2022]
Abstract
It is now certain that today living organisms can acquire new spliceosomal introns in their genes. The proposed sources of spliceosomal introns are exons, transposons, and other introns, including spliceosomal and group II self-splicing introns. Spliceosomal introns are thought to be the most likely source, because the inserted sequence would immediately be endowed with the essential set of intron recognition sequences, thereby preventing the deleterious effects associated with incorrect splicing. The most obvious spliceosomal intron duplication pathways involve an RNA transcript intermediate step. Therefore, for a spliceosomal intron to be originated by duplication, either the source gene from which the novel intron is derived, or that gene and the recipient gene, which contains the novel intron, would need to be expressed in the germ line. Intron proliferation surveys indicate that putative intron duplicate-containing genes do not always match detectable expression in the germ line, which casts doubt on the generality of the duplication model. However, judging mechanisms of intron gain (or loss) from present-day gene expression profiles could be erroneous, if expression patterns were different at the time the introns arose. In fact, this may likely be so in most cases. Ectopic expression, i.e., the expression of genes at times and locations where the target gene is not known to have a function, is a much more common phenomenon than previously realized. We conclude with a speculation on a possible interplay between spliceosomal introns and ectopic expression at the origin of multicellularity.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
|