1
Larue GE, Roy SW. Where the minor things are: a pan-eukaryotic survey suggests neutral processes may explain much of minor intron evolution. Nucleic Acids Res 2023; 51:10884-10908. [PMID: 37819006 PMCID: PMC10639083 DOI: 10.1093/nar/gkad797] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/24/2023] [Revised: 09/12/2023] [Accepted: 09/19/2023] [Indexed: 10/13/2023] Open
Abstract
Spliceosomal introns are gene segments removed from RNA transcripts by ribonucleoprotein machineries called spliceosomes. In some eukaryotes a second 'minor' spliceosome is responsible for processing a tiny minority of introns. Despite its seemingly modest role, minor splicing has persisted for roughly 1.5 billion years of eukaryotic evolution. Identifying minor introns in over 3000 eukaryotic genomes, we report diverse evolutionary histories including surprisingly high numbers in some fungi and green algae, repeated loss, as well as general biases in their positional and genic distributions. We estimate that ancestral minor intron densities were comparable to those of vertebrates, suggesting a trend of long-term stasis. Finally, three findings suggest a major role for neutral processes in minor intron evolution. First, highly similar patterns of minor and major intron evolution contrast with both functionalist and deleterious model predictions. Second, observed functional biases among minor intron-containing genes are largely explained by these genes' greater ages. Third, no association of intron splicing with cell proliferation in a minor intron-rich fungus suggests that regulatory roles are lineage-specific and thus cannot offer a general explanation for minor splicing's persistence. These data constitute the most comprehensive view of minor introns and their evolutionary history to date, and provide a foundation for future studies of these remarkable genetic elements.
Affiliation(s)
- Graham E Larue
  - Quantitative and Systems Biology Graduate Program, University of California Merced, Merced, CA 95343, USA
- Scott W Roy
  - Department of Molecular and Cell Biology, University of California Merced, Merced, CA 95343, USA
  - Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
2
Larue GE, Eliáš M, Roy SW. Expansion and transformation of the minor spliceosomal system in the slime mold Physarum polycephalum. Curr Biol 2021; 31:3125-3131.e4. [PMID: 34015249 DOI: 10.1016/j.cub.2021.04.050] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/30/2020] [Revised: 03/14/2021] [Accepted: 04/20/2021] [Indexed: 12/25/2022]
Abstract
Spliceosomal introns interrupt nuclear genes and are removed from RNA transcripts ("spliced") by machinery called spliceosomes. Although the vast majority of spliceosomal introns are removed by the so-called major (or "U2") spliceosome, diverse eukaryotes also contain a rare second form, the minor ("U12") spliceosome, and associated ("U12-type") introns [1-3]. In all characterized species, U12-type introns are distinguished by several features, including being rare in the genome (∼0.5% of all introns) [4-6], containing extended evolutionarily conserved splicing motifs [4,5,7,8], being generally ancient [9,10], and being inefficiently spliced [11-13]. Here, we report a remarkable exception in the slime mold Physarum polycephalum. The P. polycephalum genome contains >20,000 U12-type introns (25 times more than any other species), enriched in a diversity of non-canonical splice boundaries as well as transformed splicing signals that appear to have co-evolved with the spliceosome due to massive gain of efficiently spliced U12-type introns. These results reveal an unappreciated dynamism of minor spliceosomal introns and spliceosomal introns in general.
Affiliation(s)
- Graham E Larue
  - Department of Molecular and Cell Biology, University of California, Merced, Merced, CA 95343, USA
- Marek Eliáš
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Scott W Roy
  - Department of Molecular and Cell Biology, University of California, Merced, Merced, CA 95343, USA
  - Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
3
Rathore OS, Silva RD, Ascensão-Ferreira M, Matos R, Carvalho C, Marques B, Tiago MN, Prudêncio P, Andrade RP, Roignant JY, Barbosa-Morais NL, Martinho RG. NineTeen Complex-subunit Salsa is required for efficient splicing of a subset of introns and dorsal-ventral patterning. RNA (NEW YORK, N.Y.) 2020; 26:1935-1956. [PMID: 32963109 PMCID: PMC7668242 DOI: 10.1261/rna.077446.120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/12/2020] [Accepted: 09/07/2020] [Indexed: 06/11/2023]
Abstract
The NineTeen Complex (NTC), also known as pre-mRNA-processing factor 19 (Prp19) complex, regulates distinct spliceosome conformational changes necessary for splicing. During Drosophila midblastula transition, splicing is particularly sensitive to mutations in NTC-subunit Fandango, which suggests differential requirements of NTC during development. We show that NTC-subunit Salsa, the Drosophila ortholog of human RNA helicase Aquarius, is rate-limiting for splicing of a subset of small first introns during oogenesis, including the first intron of gurken. Germline depletion of Salsa and splice site mutations within the gurken first intron impair both adult female fertility and oocyte dorsal-ventral patterning, due to an abnormal expression of Gurken. Supporting causality, the fertility and dorsal-ventral patterning defects observed after Salsa depletion could be suppressed by the expression of a gurken construct without its first intron. Altogether, our results suggest that one of the key rate-limiting functions of Salsa during oogenesis is to ensure the correct expression and efficient splicing of the first intron of gurken mRNA. Retention of the gurken first intron compromises the function of this gene, most likely because it undermines the correct structure and function of the transcript 5'UTR.
Affiliation(s)
- Om Singh Rathore
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Rui D Silva
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Mariana Ascensão-Ferreira
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Ricardo Matos
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Célia Carvalho
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Bruno Marques
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Margarida N Tiago
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Pedro Prudêncio
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Raquel P Andrade
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
  - Department of Medicine and Biomedical Sciences and Algarve Biomedical Center, Universidade do Algarve, 8005-139 Faro, Portugal
- Jean-Yves Roignant
  - Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, 55128 Mainz, Germany
- Nuno L Barbosa-Morais
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Rui Gonçalo Martinho
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
  - Department of Medical Sciences and Institute for Biomedicine (iBiMED), Universidade de Aveiro, 3810-193 Aveiro, Portugal
4
Krzyzanowski PM, Sircoulomb F, Yousif F, Normand J, La Rose J, E Francis K, Suarez F, Beck T, McPherson JD, Stein LD, Rottapel RK. Regional perturbation of gene transcription is associated with intrachromosomal rearrangements and gene fusion transcripts in high grade ovarian cancer. Sci Rep 2019; 9:3590. [PMID: 30837567 PMCID: PMC6401071 DOI: 10.1038/s41598-019-39878-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/16/2018] [Accepted: 01/30/2019] [Indexed: 01/10/2023] Open
Abstract
Genomic rearrangements are a hallmark of cancer biology and progression, allowing cells to rapidly transform through alterations in regulatory structures, changes in expression patterns, reprogramming of signaling pathways, and creation of novel transcripts via gene fusion events. Though functional gene fusions encoding oncogenic proteins are the most dramatic outcomes of genomic rearrangements, we investigated the relationship between rearrangements evidenced by fusion transcripts and local expression changes in cancer using transcriptome data alone. 9,953 gene fusion predictions from 418 primary serous ovarian cancer tumors were analyzed, identifying depletions of gene fusion breakpoints within coding regions of fused genes as well as an N-terminal enrichment of breakpoints within fused genes. We identified 48 genes with significant fusion-associated upregulation and furthermore demonstrate that significant regional overexpression of intact genes in patient transcriptomes occurs within 1 megabase of 78 novel gene fusions that function as central markers of these regions. We reveal that cancer transcriptomes select for gene fusions that preserve protein and protein domain coding potential. The association of gene fusion transcripts with neighboring gene overexpression supports rearrangements as a mechanism through which cancer cells remodel their transcriptomes and identifies a new way to utilize gene fusions as indicators of regional expression changes in diseased cells with only transcriptomic data.
Affiliation(s)
- Paul M Krzyzanowski
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
- Fabrice Sircoulomb
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Fouad Yousif
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
- Josee Normand
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Dalhousie University, Halifax, Nova Scotia, Canada
- Jose La Rose
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Kyle E Francis
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Fernando Suarez
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Tim Beck
  - Human Longevity Inc., San Diego, California, USA
- John D McPherson
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
  - University of California, Davis Medical Center, Sacramento, California, USA
- Lincoln D Stein
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Robert K Rottapel
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
5
Architecture and Distribution of Introns in Core Genes of Four Fusarium Species. G3-GENES GENOMES GENETICS 2017; 7:3809-3820. [PMID: 28993438 PMCID: PMC5677156 DOI: 10.1534/g3.117.300344] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022]
Abstract
Removal of introns from transcribed RNA represents a crucial step during the production of mRNA in eukaryotes. Available whole-genome sequences and expressed sequence tags (ESTs) have increased our knowledge of this process and revealed various commonalities among eukaryotes. However, certain aspects of intron structure and diversity are taxon-specific, which can complicate the accuracy of in silico gene prediction methods. Using core genes, we evaluated the distribution and architecture of Fusarium circinatum spliceosomal introns, and linked these characteristics to the accuracy of the predicted gene models of the genome of this fungus. We also evaluated intron distribution and architecture in F. verticillioides, F. oxysporum, and F. graminearum, and made comparisons with F. circinatum. Results indicated that F. circinatum and the three other Fusarium species have canonical 5′ and 3′ splice sites, but with subtle differences that are apparently not shared with those of other fungal genera. The polypyrimidine tract of Fusarium introns was also found to be highly divergent among species and genes. Furthermore, the conserved adenosine nucleoside required during the first step of splicing is contained within unique branch site motifs in certain Fusarium introns. Data generated here show that introns of F. circinatum, as well as F. verticillioides, F. oxysporum, and F. graminearum, are characterized by a number of unique features such as the CTHAH and ACCAT motifs of the branch site. Incorporation of such information into genome annotation software will undoubtedly improve the accuracy of gene prediction methods used for Fusarium species and related fungi.
6
Bonnet A, Grosso AR, Elkaoutari A, Coleno E, Presle A, Sridhara SC, Janbon G, Géli V, de Almeida SF, Palancade B. Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability. Mol Cell 2017; 67:608-621.e6. [PMID: 28757210 DOI: 10.1016/j.molcel.2017.07.002] [Citation(s) in RCA: 81] [Impact Index Per Article: 11.6] [Received: 09/22/2016] [Revised: 05/19/2017] [Accepted: 06/30/2017] [Indexed: 12/31/2022]
Abstract
Transcription is a source of genetic instability that can notably result from the formation of genotoxic DNA:RNA hybrids, or R-loops, between the nascent mRNA and its template. Here we report an unexpected function for introns in counteracting R-loop accumulation in eukaryotic genomes. Deletion of endogenous introns increases R-loop formation, while insertion of an intron into an intronless gene suppresses R-loop accumulation and its deleterious impact on transcription and recombination in yeast. Recruitment of the spliceosome onto the mRNA, but not splicing per se, is shown to be critical to attenuate R-loop formation and transcription-associated genetic instability. Genome-wide analyses in a number of distant species differing in their intron content, including human, further revealed that intron-containing genes and the intron-richest genomes are best protected against R-loop accumulation and subsequent genetic instability. Our results thereby provide a possible rationale for the conservation of introns throughout the eukaryotic lineage.
Affiliation(s)
- Amandine Bonnet
  - Institut Jacques Monod, CNRS, UMR 7592, Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France
- Ana R Grosso
  - Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, 1600-276 Lisboa, Portugal
- Abdessamad Elkaoutari
  - Cancer Research Center of Marseille (CRCM), Equipe Labellisée Ligue, U1068 INSERM, UMR7258 CNRS, Institut Paoli-Calmettes, Aix Marseille University, 13284 Marseille, France
- Emeline Coleno
  - Institut Jacques Monod, CNRS, UMR 7592, Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France
- Adrien Presle
  - Institut Jacques Monod, CNRS, UMR 7592, Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France
- Sreerama C Sridhara
  - Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, 1600-276 Lisboa, Portugal
- Guilhem Janbon
  - Institut Pasteur, Unité Biologie des ARN des Pathogènes Fongiques, Département de Mycologie, 75015 Paris, France
- Vincent Géli
  - Cancer Research Center of Marseille (CRCM), Equipe Labellisée Ligue, U1068 INSERM, UMR7258 CNRS, Institut Paoli-Calmettes, Aix Marseille University, 13284 Marseille, France
- Sérgio F de Almeida
  - Instituto de Medicina Molecular, Faculdade de Medicina da Universidade de Lisboa, 1600-276 Lisboa, Portugal
- Benoit Palancade
  - Institut Jacques Monod, CNRS, UMR 7592, Université Paris Diderot, Sorbonne Paris Cité, 75013 Paris, France
7
Catania F. From intronization to intron loss: How the interplay between mRNA-associated processes can shape the architecture and the expression of eukaryotic genes. Int J Biochem Cell Biol 2017; 91:136-144. [PMID: 28673893 DOI: 10.1016/j.biocel.2017.06.017] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/14/2017] [Revised: 06/25/2017] [Accepted: 06/30/2017] [Indexed: 12/29/2022]
Abstract
Transcription-coupled processes such as capping, splicing, and cleavage/polyadenylation participate in the journey from genes to proteins. Although they are traditionally thought to serve only as steps in the generation of mature mRNAs, a synthesis of available data indicates that these processes could also act as a driving force for the evolution of eukaryotic genes. A theoretical framework for how mRNA-associated processes may shape gene structure and expression has recently been proposed. Factors that promote splicing and cleavage/polyadenylation in this framework compete for access to overlapping or neighboring signals throughout the transcription cycle. These antagonistic interactions allow mechanisms for intron gain and splice site recognition as well as common trends in eukaryotic gene structure and expression to be coherently integrated. Here, I extend this framework further. Observations that largely (but not exclusively) revolve around the formation of DNA-RNA hybrid structures, called R loops, and promoter directionality are integrated. Additionally, the interplay between splicing factors and cleavage/polyadenylation factors is theorized to also affect the formation of intragenic DNA double-stranded breaks thereby contributing to intron loss. The most notable prediction in this proposition is that RNA molecules can mediate intron loss by serving as a template to repair DNA double-stranded breaks. The framework presented here leverages a vast body of empirical observations, logically extending previous suggestions, and generating verifiable predictions to further substantiate the view that the intracellular environment plays an active role in shaping the structure and the expression of eukaryotic genes.
Affiliation(s)
- Francesco Catania
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstraße 1, 48149 Münster, Germany
8
Bondarenko VS, Gelfand MS. Evolution of the Exon-Intron Structure in Ciliate Genomes. PLoS One 2016; 11:e0161476. [PMID: 27603699 PMCID: PMC5014332 DOI: 10.1371/journal.pone.0161476] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 04/28/2016] [Accepted: 08/06/2016] [Indexed: 12/27/2022] Open
Abstract
A typical eukaryotic gene comprises alternating stretches of two kinds of regions, exons and introns, which are retained in and spliced out of the mature mRNA, respectively. Although the length of introns may vary substantially among organisms, a large fraction of genes contains short introns in many species. Notably, some Ciliates (Paramecium and Nyctotherus) possess only ultra-short introns, around 25 bp long. In Paramecium, ultra-short introns with length divisible by three (3n) are under strong evolutionary pressure and have a high frequency of in-frame stop codons, which, in the case of intron retention, cause premature termination of mRNA translation and consequent degradation of the mis-spliced mRNA by the nonsense-mediated decay mechanism. Here, we analyzed introns in five genera of Ciliates, Paramecium, Tetrahymena, Ichthyophthirius, Oxytricha, and Stylonychia. Introns can be classified into two length classes in Tetrahymena and Ichthyophthirius (with means 48 bp, 69 bp, and 55 bp, 64 bp, respectively), but, surprisingly, comprise three distinct length classes in Oxytricha and Stylonychia (with means 33–35 bp, 47–51 bp, and 78–80 bp). In most ranges of the intron lengths, 3n introns are underrepresented and have a high frequency of in-frame stop codons in all studied species. Introns of Paramecium, Tetrahymena, and Ichthyophthirius are preferentially located at the 5' and 3' ends of genes, whereas introns of Oxytricha and Stylonychia are strongly skewed towards the 5' end. Analysis of evolutionary conservation shows that, in each studied genome, a significant fraction of intron positions is conserved between the orthologs, but intron lengths are not correlated between the species.
In summary, our study provides a detailed characterization of introns in several genera of Ciliates and highlights some of their distinctive properties, which, together, indicate that splicing spellchecking is a universal and evolutionarily conserved process in the biogenesis of short introns in various representatives of Ciliates.
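The retention logic described in this abstract (3n introns, in-frame stop codons, premature termination) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `phase` and `upstream_tail` parameters, and the toy sequences are hypothetical. The default stop set assumes the ciliate nuclear genetic code (NCBI translation table 6), in which TGA is the only stop codon; pass a different set for other codes.

```python
# Illustrative sketch only; names and toy sequences are hypothetical.
# Assumption: ciliate nuclear code (NCBI table 6), where TGA is the only stop.
CILIATE_STOPS = frozenset({"TGA"})


def is_3n(intron: str) -> bool:
    """True if the intron length is divisible by three."""
    return len(intron) % 3 == 0


def has_in_frame_stop(intron: str, upstream_tail: str = "",
                      stops: frozenset = CILIATE_STOPS) -> bool:
    """Scan a retained intron for a stop codon in the upstream reading frame.

    upstream_tail holds the 0-2 exon bases of the codon interrupted by the
    intron (i.e. the intron phase is len(upstream_tail)).
    """
    if len(upstream_tail) > 2:
        raise ValueError("upstream_tail must contain at most 2 bases")
    seq = (upstream_tail + intron).upper()
    # Step through complete codons only; a trailing partial codon is ignored.
    return any(seq[i:i + 3] in stops for i in range(0, len(seq) - 2, 3))


def retention_consequence(intron: str, upstream_tail: str = "",
                          stops: frozenset = CILIATE_STOPS) -> str:
    """Classify what retaining the intron would do to the reading frame."""
    if has_in_frame_stop(intron, upstream_tail, stops):
        return "premature stop"      # expected to be a target of NMD
    if is_3n(intron):
        return "in-frame insertion"  # frame preserved, no stop inside the intron
    return "frameshift"              # frame shifts into the downstream exon


if __name__ == "__main__":
    print(retention_consequence("GTATGAAAATAG"))       # phase 0, 12 nt
    print(retention_consequence("GTATGAAAATAG", "A"))  # phase 1, 12 nt
    print(retention_consequence("GTAAGAAAATA"))        # phase 0, 11 nt
```

In this framing, only the "in-frame insertion" class would escape detection on retention, which is the class the abstract reports as underrepresented.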
Affiliation(s)
- Vladyslav S. Bondarenko
  - Institute of Molecular Biology and Genetics, NASU, Zabolotnogo Str. 150, Kyiv, 03680, Ukraine
- Mikhail S. Gelfand
  - A.A. Kharkevich Institute for Information Transmission Problems, RAS, Bolshoy Karetny per. 19, Moscow, 127994, Russia
  - Skolkovo Institute of Science and Technology, Moscow, 143026, Russia
  - Department of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Vorobievy Gory 1–73, Moscow GSP-1, 119234, Russia
9
mRNA-Associated Processes and Their Influence on Exon-Intron Structure in Drosophila melanogaster. G3-GENES GENOMES GENETICS 2016; 6:1617-26. [PMID: 27172210 PMCID: PMC4889658 DOI: 10.1534/g3.116.029231] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Abstract
mRNA-associated processes and gene structure in eukaryotes are typically treated as separate research subjects. Here, we bridge this separation and leverage the extensive multidisciplinary work on Drosophila melanogaster to examine the roles that capping, splicing, cleavage/polyadenylation, and telescripting (i.e., the protection of nascent transcripts from premature cleavage/polyadenylation by the splicing factor U1) might play in shaping exon-intron architecture in protein-coding genes. Our findings suggest that the distance between subsequent internal 5′ splice sites (5′ss) in Drosophila genes is constrained such that telescripting effects are maximized, in theory, and thus nascent transcripts are less vulnerable to premature termination. Exceptionally weak 5′ss and constraints on intron-exon size at the gene 5′ end also indicate that capping might enhance the recruitment of U1 and, in turn, promote telescripting at this location. Finally, a positive correlation between last exon length and last 5′ss strength suggests that optimal donor splice sites in the proximity of the pre-mRNA tail may inhibit the processing of downstream polyadenylation signals more than weak donor splice sites do. These findings corroborate and build upon previous experimental and computational studies on Drosophila genes. They support the possibility, hitherto scantly explored, that mRNA-associated processes impose significant constraints on the evolution of eukaryotic gene structure.
10
Ferro D, Lepennetier G, Catania F. Cis-acting signals modulate the efficiency of programmed DNA elimination in Paramecium tetraurelia. Nucleic Acids Res 2015; 43:8157-68. [PMID: 26304543 PMCID: PMC4787833 DOI: 10.1093/nar/gkv843] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 05/06/2015] [Accepted: 08/01/2015] [Indexed: 12/12/2022] Open
Abstract
In Paramecium, the regeneration of a functional somatic genome at each sexual event relies on the elimination of thousands of germline DNA sequences, known as Internal Eliminated Sequences (IESs), from the zygotic nuclear DNA. Here, we provide evidence that IESs’ length and sub-terminal bases jointly modulate IES excision by affecting DNA conformation in P. tetraurelia. Our study reveals an excess of complementary base pairing between IESs’ sub-terminal and contiguous sites, suggesting that IESs may form DNA loops prior to cleavage. The degree of complementary base pairing between IESs’ sub-terminal sites (termed Cin-score) is positively associated with IES length and is shaped by natural selection. Moreover, it escalates abruptly when IES length exceeds 45 nucleotides (nt), indicating that only sufficiently large IESs may form loops. Finally, we find that IESs smaller than 46 nt are favored targets of the cellular surveillance systems, presumably because of their relatively inefficient excision. Our findings extend the repertoire of cis-acting determinants for IES recognition/excision and provide unprecedented insights into the distinct selective pressures that operate on IESs and somatic DNA regions. This information potentially moves current models of IES evolution and of mechanisms of IES recognition/excision forward.
Affiliation(s)
- Diana Ferro
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, 48149 Münster, Germany
- Gildas Lepennetier
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, 48149 Münster, Germany
- Francesco Catania
  - Institute for Evolution and Biodiversity, University of Münster, Hüfferstrasse 1, 48149 Münster, Germany
11
Catania F, Schmitz J. On the path to genetic novelties: insights from programmed DNA elimination and RNA splicing. WILEY INTERDISCIPLINARY REVIEWS-RNA 2015; 6:547-61. [PMID: 26140477 DOI: 10.1002/wrna.1293] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 01/22/2015] [Revised: 04/29/2015] [Accepted: 06/06/2015] [Indexed: 12/17/2022]
Abstract
Understanding how genetic novelties arise is a central goal of evolutionary biology. To this end, programmed DNA elimination and RNA splicing deserve special consideration. While programmed DNA elimination reshapes genomes by eliminating chromatin during organismal development, RNA splicing rearranges genetic messages by removing intronic regions during transcription. Small RNAs help to mediate this class of sequence reorganization, which is not error-free. It is this imperfection that makes programmed DNA elimination and RNA splicing excellent candidates for generating evolutionary novelties. Leveraging a number of these two processes' mechanistic and evolutionary properties, which have been uncovered over the past years, we present recently proposed models and empirical evidence for how splicing can shape the structure of protein-coding genes in eukaryotes. We also chronicle a number of intriguing similarities between the processes of programmed DNA elimination and RNA splicing, and highlight the role that the variation in the population-genetic environment may play in shaping their target sequences.
Affiliation(s)
- Francesco Catania
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Jürgen Schmitz
  - Institute of Experimental Pathology (ZMBE), University of Münster, Münster, Germany
12
Zhou K, Kuo A, Grigoriev IV. Reverse transcriptase and intron number evolution. Stem Cell Investig 2014; 1:17. [PMID: 27358863 DOI: 10.3978/j.issn.2306-9759.2014.08.01] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/15/2014] [Accepted: 08/04/2014] [Indexed: 11/14/2022]
Abstract
BACKGROUND: Introns are universal in eukaryotic genomes and play important roles in transcriptional regulation, mRNA export to the cytoplasm, nonsense-mediated decay as both a regulatory and a splicing quality control mechanism, R-loop avoidance, alternative splicing, chromatin structure, and evolution by exon-shuffling. METHODS: Sixteen complete fungal genomes were used, 13 of which were sequenced and annotated by JGI. Ustilago maydis, Cryptococcus neoformans, and Coprinus cinereus (also named Coprinopsis cinerea) were from the Broad Institute. Gene models from JGI-annotated genomes were taken from the GeneCatalog track that contained the best representative gene models. Varying fractions of the GeneCatalog were manually curated by external users. For clarity, we used the JGI unique database identifier. RESULTS: The last common ancestor of eukaryotes (LECA) has an estimated 6.4 coding exons per gene (EPG) and evolved into the diverse eukaryotic life forms, which is recapitulated by the development of a stem cell. We found a parallel between the simulated reverse transcriptase (RT)-mediated intron loss and the comparative analysis of 16 fungal genomes that spanned a wide range of intron density. Although footprints of RT (RTF) were dynamic, relative intron location (RIL) to the 5'-end of mRNA faithfully traced RT-mediated intron loss and revealed 7.7 EPG for LECA. The mode of exon length distribution was conserved in simulated intron loss, which was exemplified by the shared mode of 75 nt between fungal and Chlamydomonas genomes. The dominant ancient exon length was corroborated by the average exon length of the most intron-rich genes in fungal genomes and consistent with ancient protein modules being ~25 aa. Combined with the conservation of a protein length of 400 aa, the earliest ancestor of eukaryotes could have had 16 EPG. During earlier evolution, Ascomycota's ancestor had significantly more 3'-biased RT-mediated intron loss that was followed by dramatic RTF loss.
There was a downward trend in EPG from more conserved to less conserved genes. Moreover, species-specific genes have higher exon-densities, shorter exons, and longer introns when compared to genes conserved at the phylum level. However, intron length in species-specific genes became shorter than that of genes conserved in all species after genomes experienced drastic intron loss. The estimated EPG from the most frequent exon length is more than double that from the RIL method. CONCLUSIONS: This implies significant intron loss during the very early period of eukaryotic evolution. De novo gene-birth contributes to shorter exons, longer introns, and higher exon-density in species-specific genes relative to conserved genes.
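The exons-per-gene estimate quoted in this abstract follows from simple arithmetic, sketched below in Python. The variable names are hypothetical; the numbers (75 nt modal exon length, 400 aa protein length, 7.7 EPG from the RIL method) are the ones stated in the abstract, and the calculation assumes, as the abstract implies, that the whole coding sequence is tiled by exons of modal length.

```python
# Worked arithmetic for the EPG estimate; variable names are illustrative.
MODAL_EXON_NT = 75       # shared modal exon length reported in the abstract
PROTEIN_LENGTH_AA = 400  # conserved protein length reported in the abstract
RIL_EPG = 7.7            # LECA estimate from the relative-intron-location method

aa_per_exon = MODAL_EXON_NT // 3                   # 3 nt per codon -> 25 aa
epg_from_mode = PROTEIN_LENGTH_AA / aa_per_exon    # 400 / 25 = 16 exons per gene
ratio_to_ril = epg_from_mode / RIL_EPG             # about 2.08

print(f"{aa_per_exon} aa per exon, {epg_from_mode:.0f} EPG, "
      f"{ratio_to_ril:.2f}x the RIL estimate")
```

The ratio of roughly 2.08 is consistent with the abstract's statement that the exon-length estimate is more than double the RIL estimate.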
Affiliation(s)
- Kemin Zhou
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Alan Kuo
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
- Igor V Grigoriev
- 1 Computational Genomics, Bristol-Myers Squibb, 311 Pennington Rocky Hill Road, Pennington, NJ 08534, USA; 2 US Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598, USA
|
13
|
Frequency of intron loss correlates with processed pseudogene abundance: a novel strategy to test the reverse transcriptase model of intron loss. BMC Biol 2013; 11:23. [PMID: 23497167 PMCID: PMC3652778 DOI: 10.1186/1741-7007-11-23] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 02/21/2013] [Accepted: 03/05/2013] [Indexed: 11/23/2022] Open
Abstract
Background Although intron loss in evolution has been described, the mechanism involved is still unclear. Three models have been proposed, the reverse transcriptase (RT) model, genomic deletion model and double-strand-break repair model. The RT model, also termed mRNA-mediated intron loss, suggests that cDNA molecules reverse transcribed from spliced mRNA recombine with genomic DNA causing intron loss. Many studies have attempted to test this model based on its predictions, such as simultaneous loss of adjacent introns, 3'-side bias of intron loss, and germline expression of intron-lost genes. Evidence either supporting or opposing the model has been reported. The mechanism of intron loss proposed in the RT model shares the process of reverse transcription with the formation of processed pseudogenes. If the RT model is correct, genes that have produced more processed pseudogenes are more likely to undergo intron loss. Results In the present study, we observed that the frequency of intron loss is correlated with processed pseudogene abundance by analyzing a new dataset of intron loss obtained in mice and rats. Furthermore, we found that mRNA molecules of intron-lost genes are mostly translated on free cytoplasmic ribosomes, a feature shared by mRNA molecules of the parental genes of processed pseudogenes and long interspersed elements. This feature is likely convenient for intron-lost gene mRNA molecules to be reverse transcribed. Analyses of adjacent intron loss, 3'-side bias of intron loss, and germline expression of intron-lost genes also support the RT model. Conclusions Compared with previous evidence, the correlation between the abundance of processed pseudogenes and intron loss frequency more directly supports the RT model of intron loss. Exploring such a correlation is a new strategy to test the RT model in organisms with abundant processed pseudogenes.
|
14
|
Cohen NE, Shen R, Carmel L. The role of reverse transcriptase in intron gain and loss mechanisms. Mol Biol Evol 2011; 29:179-86. [PMID: 21804076 DOI: 10.1093/molbev/msr192] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 01/02/2023] Open
Abstract
Intron density is highly variable across eukaryotic species. It seems that different lineages have experienced considerably different levels of intron gain and loss events, but the reasons for this are not well known. A large number of mechanisms for intron loss and gain have been suggested, and most of them have at least some level of indirect support. We therefore reasoned that the variability in intron density may reflect the fact that different mechanisms are active in different lineages. Quite a number of these putative mechanisms, both for intron loss and for intron gain, postulate that the enzyme reverse transcriptase (RT) has a key role in the process. In this paper, we lay out three predictions whose confirmation or falsification indicates whether RT is involved in intron gain and loss processes. Testing these predictions requires data on the intron gain and loss rates of individual genes along different branches of the eukaryotic phylogenetic tree. So far, such rates could not be computed, and hence, these predictions could not be rigorously evaluated. Here, we use a maximum likelihood algorithm that we have devised in the past, Evolutionary Reconstruction by Expectation Maximization, which allows the estimation of such rates. Using this algorithm, we computed the intron loss and gain rates of more than 300 genes in each branch of the phylogenetic tree of 19 eukaryotic species. Based on that, we found only little support for RT activity in intron gain. In contrast, we suggest that RT-mediated intron loss is a mechanism that is very efficient in removing introns, and thus, its levels of activity may be a major determinant of intron number. Moreover, we found that intron gain and loss rates are negatively correlated in intron-poor species but are positively correlated in intron-rich species. One explanation for this is that intron gain and loss mechanisms in intron-rich species (like metazoans) share a common mechanistic component, albeit not RT.
Affiliation(s)
- Noa E Cohen
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Jerusalem, Israel
|
15
|
Chavali S, Morais DADL, Gough J, Babu MM. Evolution of eukaryotic genome architecture: Insights from the study of a rapidly evolving metazoan, Oikopleura dioica. Bioessays 2011; 33:592-601. [DOI: 10.1002/bies.201100034] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
|
16
|
DNA double-strand break repair and the evolution of intron density. Trends Genet 2010; 27:1-6. [PMID: 21106271 PMCID: PMC3020277 DOI: 10.1016/j.tig.2010.10.004] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 08/12/2010] [Revised: 10/18/2010] [Accepted: 10/18/2010] [Indexed: 01/23/2023]
Abstract
The density of introns is both an important feature of genome architecture and a highly variable trait across eukaryotes. This heterogeneity has posed an evolutionary puzzle for the last 30 years. Recent evidence is consistent with novel introns being the outcome of the error-prone repair of DNA double-stranded breaks (DSBs) via non-homologous end joining (NHEJ). Here we suggest that deletion of pre-existing introns could occur via the same pathway. We propose a novel framework in which species-specific differences in the activity of NHEJ and homologous recombination (HR) during the repair of DSBs underlie changes in intron density.
|
17
|
Gabriško M, Janeček Š. Characterization of Maltase Clusters in the Genus Drosophila. J Mol Evol 2010; 72:104-18. [DOI: 10.1007/s00239-010-9406-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 06/16/2010] [Accepted: 10/27/2010] [Indexed: 11/28/2022]
|
18
|
Mekouar M, Blanc-Lenfle I, Ozanne C, Da Silva C, Cruaud C, Wincker P, Gaillardin C, Neuvéglise C. Detection and analysis of alternative splicing in Yarrowia lipolytica reveal structural constraints facilitating nonsense-mediated decay of intron-retaining transcripts. Genome Biol 2010; 11:R65. [PMID: 20573210 PMCID: PMC2911113 DOI: 10.1186/gb-2010-11-6-r65] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 03/11/2010] [Revised: 06/15/2010] [Accepted: 06/23/2010] [Indexed: 11/10/2022] Open
Abstract
Background Hemiascomycetous yeasts have intron-poor genomes with very few cases of alternative splicing. Most of the reported examples result from intron retention in Saccharomyces cerevisiae and some have been shown to be functionally significant. Here we used transcriptome-wide approaches to evaluate the mechanisms underlying the generation of alternative transcripts in Yarrowia lipolytica, a yeast highly divergent from S. cerevisiae. Results Experimental investigation of Y. lipolytica gene models identified several cases of alternative splicing, mostly generated by intron retention, principally affecting the first intron of the gene. The retention of introns almost invariably creates a premature termination codon, as a direct consequence of the structure of intron boundaries. An analysis of Y. lipolytica introns revealed that introns of multiples of three nucleotides in length, particularly those without stop codons, were underrepresented. In other organisms, premature termination codon-containing transcripts are targeted for degradation by the nonsense-mediated mRNA decay (NMD) machinery. In Y. lipolytica, homologs of S. cerevisiae UPF1 and UPF2 genes were identified, but not UPF3. The inactivation of Y. lipolytica UPF1 and UPF2 resulted in the accumulation of unspliced transcripts of a test set of genes. Conclusions Y. lipolytica is the hemiascomycete with the most intron-rich genome sequenced to date, and it has several unusual genes with large introns or alternative transcription start sites, or introns in the 5' UTR. Our results suggest Y. lipolytica intron structure is subject to significant constraints, leading to the under-representation of stop-free introns. Consequently, intron-containing transcripts are degraded by a functional NMD pathway.
Affiliation(s)
- Meryem Mekouar
- INRA UMR1319 Micalis - AgroParisTech, Biologie intégrative du métabolisme lipidique microbien, Bât, CBAI, 78850 Thiverval-Grignon, France
|
19
|
Haenni S, Sharpe HE, Gravato Nobre M, Zechner K, Browne C, Hodgkin J, Furger A. Regulation of transcription termination in the nematode Caenorhabditis elegans. Nucleic Acids Res 2009; 37:6723-36. [PMID: 19740764 PMCID: PMC2777434 DOI: 10.1093/nar/gkp744] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
The current predicted mechanisms that describe RNA polymerase II (pol II) transcription termination downstream of protein expressing genes fail to adequately explain how premature termination is prevented in eukaryotes that possess operon-like structures. Here we address this issue by analysing transcription termination at the end of single protein expressing genes and genes located within operons in the nematode Caenorhabditis elegans. By using a combination of RT-PCR and ChIP analysis we found that pol II generally transcribes up to 1 kb past the poly(A) sites into the 3' flanking regions of the nematode genes before it terminates. We also show that pol II does not terminate after transcription of internal poly(A) sites in operons. We provide experimental evidence that five randomly chosen C. elegans operons are transcribed as polycistronic pre-mRNAs. Furthermore, we show that cis-splicing of the first intron located in downstream positioned genes in these polycistronic pre-mRNAs is critical for their expression and may play a role in preventing premature pol II transcription termination.
Affiliation(s)
- Simon Haenni
- Genetics Unit, Department of Biochemistry, University of Oxford, Oxford OX1 3QU, UK
|
20
|
Niu DK. Exon definition as a potential negative force against intron losses in evolution. Biol Direct 2008; 3:46. [PMID: 19014515 PMCID: PMC2614967 DOI: 10.1186/1745-6150-3-46] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 11/10/2008] [Accepted: 11/13/2008] [Indexed: 12/03/2022] Open
Abstract
Background Previous studies have indicated that the wide variation in intron density (the number of introns per gene) among different eukaryotes largely reflects varying degrees of intron loss during evolution. The most popular model, which suggests that organisms lose introns through a mechanism in which reverse-transcribed cDNA recombines with the genomic DNA, concerns only one mutational force. Hypothesis Using exons as the units of splicing-site recognition, exon definition constrains the length of exons. An intron-loss event results in fusion of flanking exons and thus a larger exon. The large size of the newborn exon may cause splicing errors, i.e., exon skipping, if the splicing of pre-mRNAs is initiated by exon definition. By contrast, if the splicing of pre-mRNAs is initiated by intron definition, intron loss does not matter. Exon definition may thus be a selective force against intron loss. An organism with a high frequency of exon definition is expected to experience a low rate of intron loss throughout evolution and have a high density of spliceosomal introns. Conclusion The majority of spliceosomal introns in vertebrates may be maintained during evolution not because of potential functions, but because of their splicing mechanism (i.e., exon definition). Further research is required to determine whether exon definition is a negative force in maintaining the high intron density of vertebrates. Reviewers This article was reviewed by Dr. Scott W. Roy (nominated by Dr. John Logsdon), Dr. Eugene V. Koonin, and Dr. Igor B. Rogozin (nominated by Dr. Mikhail Gelfand). For the full reviews, please go to the Reviewers' comments section.
Affiliation(s)
- Deng-Ke Niu
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, PR China.
|
21
|
Bradnam KR, Korf I. Longer first introns are a general property of eukaryotic gene structure. PLoS One 2008; 3:e3093. [PMID: 18769727 PMCID: PMC2518113 DOI: 10.1371/journal.pone.0003093] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.9] [Received: 06/12/2008] [Accepted: 08/11/2008] [Indexed: 11/19/2022] Open
Abstract
While many properties of eukaryotic gene structure are well characterized, differences in the form and function of introns that occur at different positions within a transcript are less well understood. In particular, the dynamics of intron length variation with respect to intron position has received relatively little attention. This study analyzes all available data on intron lengths in GenBank and finds a significant trend of increased length in first introns throughout a wide range of species. This trend was found to be even stronger when using high-confidence gene annotation data for three model organisms (Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster) which show that the first intron in the 5' UTR is, on average, significantly longer than all downstream introns within a gene. A partial explanation for increased first intron length in A. thaliana is suggested by the increased frequency of certain motifs that are present in first introns. The phenomenon of longer first introns can potentially be used to improve gene prediction software and also to detect errors in existing gene annotations.
Affiliation(s)
- Keith R Bradnam
- Genome Center, University of California Davis, Davis, California, USA.
|
22
|
O'Toole N, Hattori M, Andres C, Iida K, Lurin C, Schmitz-Linneweber C, Sugita M, Small I. On the expansion of the pentatricopeptide repeat gene family in plants. Mol Biol Evol 2008; 25:1120-8. [PMID: 18343892 DOI: 10.1093/molbev/msn057] [Citation(s) in RCA: 285] [Impact Index Per Article: 17.8] [Indexed: 11/14/2022] Open
Abstract
Pentatricopeptide repeat (PPR) proteins form a huge family in plants (450 members in Arabidopsis and 477 in rice) defined by tandem repetitions of characteristic sequence motifs. Some of these proteins have been shown to play a role in posttranscriptional processes within organelles, and they are thought to be sequence-specific RNA-binding proteins. The origins of this family are obscure as they are lacking from almost all prokaryotes, and the spectacular expansion of the family in land plants is equally enigmatic. In this study, we investigate the growth of the family in plants by undertaking a genome-wide identification and comparison of the PPR genes of 3 organisms: the flowering plants Arabidopsis thaliana and Oryza sativa and the moss Physcomitrella patens. A large majority of the PPR genes in each of the flowering plants are intronless. In contrast, most of the 103 PPR genes in Physcomitrella are intron rich. A phylogenetic comparison of the PPR genes in all 3 species shows similarities between the intron-rich PPR genes in Physcomitrella and the few intron-rich PPR genes in higher plants. Intron-poor PPR genes in all 3 species also display a bias toward a position of their introns at their 5' ends. These results provide compelling evidence that one or more waves of retrotransposition were responsible for the expansion of the PPR gene family in flowering plants. The differing numbers of PPR proteins are highly correlated with differences in organellar RNA editing between the 3 species.
Affiliation(s)
- Nicholas O'Toole
- Centre for Computational Systems Biology, University of Western Australia, Perth, Australia
|
23
|
Irimia M, Roy SW. Spliceosomal introns as tools for genomic and evolutionary analysis. Nucleic Acids Res 2008; 36:1703-12. [PMID: 18263615 PMCID: PMC2275149 DOI: 10.1093/nar/gkn012] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Indexed: 11/14/2022] Open
Abstract
Over the past 5 years, the availability of dozens of whole genomic sequences from a wide variety of eukaryotic lineages has revealed a very large amount of information about the dynamics of intron loss and gain through eukaryotic history, as well as the evolution of intron sequences. Implicit in these advances is a great deal of information about the structure and evolution of surrounding sequences. Here, we review the wealth of ways in which structures of spliceosomal introns as well as their conservation and change through evolution may be harnessed for evolutionary and genomic analysis. First, we discuss uses of intron length distributions and positions in sequence assembly and annotation, and for improving alignment of homologous regions. Second, we review uses of introns in evolutionary studies, including the utility of introns as indicators of rates of sequence evolution, for inferences about molecular evolution, as signatures of orthology and paralogy, and for estimating rates of nucleotide substitution. We conclude with a discussion of phylogenetic methods utilizing intron sequences and positions.
Affiliation(s)
- Manuel Irimia
- Departament de Genètica, Universitat de Barcelona, Barcelona, Spain
|
24
|
Zhou H, Lin K. Excess of microRNAs in large and very 5' biased introns. Biochem Biophys Res Commun 2008; 368:709-15. [PMID: 18249189 DOI: 10.1016/j.bbrc.2008.01.117] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 01/13/2008] [Accepted: 01/27/2008] [Indexed: 11/29/2022]
Abstract
Many microRNAs (miRNAs) and small nucleolar RNAs (snoRNAs) are located within the introns of genes in eukaryotes. Contrary to intronic snoRNAs, intronic miRNAs are processed from unspliced intronic regions before the catalysis of splicing in vertebrates. By analyzing the distribution patterns of the length and position of the introns hosting these two groups of small RNA genes, we observed that both human and mouse intronic miRNAs tended to be present in large introns, and miRNA host introns have a more 5'-biased position distribution compared with all other introns in the two genomes. These observations indicate that the negative selection of functional constraints might affect the intron size in both genomes. Interestingly, the very 5'-biased positions of miRNA host introns may be necessary for the transcription and regulation of intronic miRNAs to utilize the regulatory signals within the 5'-UTRs of their host genes.
Affiliation(s)
- Hongjun Zhou
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and College of Life Sciences, Beijing Normal University, No. 19, Xinjiekouwai Street, Beijing 100875, China
|
25
|
Abstract
Research into the origins of introns is at a critical juncture in the resolution of theories on the evolution of early life (which came first, RNA or DNA?), the identity of LUCA (the last universal common ancestor, was it prokaryotic- or eukaryotic-like?), and the significance of noncoding nucleotide variation. One early notion was that introns would have evolved as a component of an efficient mechanism for the origin of genes. But alternative theories emerged as well. From the debate between the "introns-early" and "introns-late" theories came the proposal that introns arose before the origin of genetically encoded proteins and DNA, and the more recent "introns-first" theory, which postulates the presence of introns at that early evolutionary stage from a reconstruction of the "RNA world." Here we review seminal and recent ideas about intron origins. Recent discoveries about the patterns and causes of intron evolution make this one of the most hotly debated and exciting topics in molecular evolutionary biology today.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
|
26
|
Nielsen H, Wernersson R. An overabundance of phase 0 introns immediately after the start codon in eukaryotic genes. BMC Genomics 2006; 7:256. [PMID: 17034638 PMCID: PMC1626468 DOI: 10.1186/1471-2164-7-256] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 06/07/2006] [Accepted: 10/11/2006] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND A knowledge of the positions of introns in eukaryotic genes is important for understanding the evolution of introns. Despite this, there has been relatively little focus on the distribution of intron positions in genes. RESULTS In proteins with signal peptides, there is an overabundance of phase 1 introns around the region of the signal peptide cleavage site. This has been described before. But in proteins without signal peptides, a novel phenomenon is observed: There is a sharp peak of phase 0 intron positions immediately following the start codon, i.e. between codons 1 and 2. This effect is seen in a wide range of eukaryotes: Vertebrates, arthropods, fungi, and flowering plants. Proteins carrying this start codon intron are found to comprise a special class of relatively short, lysine-rich and conserved proteins with an overrepresentation of ribosomal proteins. In addition, there is a peak of phase 0 introns at position 5 in Drosophila genes with signal peptides, predominantly representing cuticle proteins. CONCLUSION There is an overabundance of phase 0 introns immediately after the start codon in eukaryotic genes, which has been described before only for human ribosomal proteins. We give a detailed description of these start codon introns and the proteins that contain them.
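Intron phase, as used in this abstract, can be illustrated with a minimal sketch. It relies on the standard definition (phase is the number of coding nucleotides preceding the intron, modulo 3); the function names and the 1-based counting convention are illustrative assumptions rather than anything specified by the authors.

```python
def intron_phase(cds_nt_before_intron: int) -> int:
    """Return the intron phase: 0 between codons, 1 or 2 inside a codon."""
    return cds_nt_before_intron % 3


def is_start_codon_intron(cds_nt_before_intron: int) -> bool:
    """True if the intron sits directly after the start codon (codons 1 and 2)."""
    return cds_nt_before_intron == 3


# An intron after the three nucleotides of the start codon is phase 0.
print(intron_phase(3), is_start_codon_intron(3))  # 0 True
# An intron one nucleotide into the second codon is phase 1.
print(intron_phase(4), is_start_codon_intron(4))  # 1 False
```

The "start codon intron" described in the abstract corresponds to the first case: a phase 0 intron with exactly three coding nucleotides upstream of it.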
Affiliation(s)
- Henrik Nielsen
- Center for Biological Sequence Analysis, Technical University of Denmark, Building 208, 2800 Lyngby, Denmark
- Rasmus Wernersson
- Center for Biological Sequence Analysis, Technical University of Denmark, Building 208, 2800 Lyngby, Denmark
|
27
|
Knowles DG, McLysaght A. High Rate of Recent Intron Gain and Loss in Simultaneously Duplicated Arabidopsis Genes. Mol Biol Evol 2006; 23:1548-57. [PMID: 16720694 DOI: 10.1093/molbev/msl017] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.7] [Indexed: 11/13/2022] Open
Abstract
We examined the gene structure of a set of 2563 Arabidopsis thaliana paralogous pairs that were duplicated simultaneously 20-60 MYA by tetraploidy. Out of a total of 23,164 introns in these genes, we found that 10,004 pairs have been conserved and 578 introns have been inserted or deleted in the time since the duplication event. This intron insertion/deletion rate of 2.7 × 10⁻³ to 9.1 × 10⁻⁴ per site per million years is high in comparison to previous studies. At least 56 introns were gained and 39 lost based on parsimony analysis of the phylogenetic distribution of these introns. We found weak evidence that genes undergoing intron gain and loss are biased with respect to gene ontology terms. Gene pairs that experienced at least 2 intron insertions or deletions show evidence of enrichment for membrane location and transport and transporter activity function. We do not find any relationship of intron flux to expression level or G + C content of the gene. Detection of a bias in the location of intron gains and losses within a gene depends on the method of measurement: an intragene method indicates that events (specifically intron losses) are biased toward the 3' end of the gene. Despite the relatively recent acquisition of these introns, we found only one case where we could identify the mechanism of intron origin: the TOUCH3 gene has experienced 2 tandem, partial, internal gene duplications that duplicated a preexisting intron and also created a novel, alternatively spliced intron that makes use of a duplicated pair of cryptic splice sites.
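The rate range quoted in this abstract can be reproduced with a short sketch. The abstract does not state the formula; the one below (events divided by the total number of intron sites, conserved plus changed, divided by the time since duplication) is an assumption that happens to match the reported figures, and the function name is hypothetical.

```python
def indel_rate_per_site_per_my(events: int, conserved_sites: int, age_my: float) -> float:
    """Intron insertion/deletion events per intron site per million years.

    Assumes the number of sites is conserved intron pairs plus changed introns.
    """
    total_sites = conserved_sites + events
    return events / total_sites / age_my


# Counts from the abstract: 578 insertions/deletions, 10,004 conserved pairs,
# duplication dated to 20-60 million years ago.
rate_if_recent = indel_rate_per_site_per_my(578, 10_004, 20)
rate_if_old = indel_rate_per_site_per_my(578, 10_004, 60)

print(f"{rate_if_recent:.1e}")  # 2.7e-03
print(f"{rate_if_old:.1e}")     # 9.1e-04
```

The younger duplication date gives the upper bound of the range and the older date the lower bound, since the same number of events is spread over a shorter or longer time.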
Affiliation(s)
- David G Knowles
- Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
|
28
|
Roy SW, Hartl DL. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res 2006; 16:750-6. [PMID: 16702411 PMCID: PMC1473185 DOI: 10.1101/gr.4845406] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
We compared intron positions in conserved regions of 3479 orthologous gene pairs from Plasmodium falciparum and Plasmodium yoelii, which likely diverged ≥100 million years ago (Mya). Only 27 out of 2212 positions were specific to one of the two species. Intron presence in related species shows that at least 19 and possibly 26 of the changes are due to intron loss, depending on phylogeny. The implied intron loss and gain rates are much lower than previously estimated for nematodes, arthropods, fungi, and plants, and are comparable only with the rates in vertebrates. That all observed changes were exact, occurring without loss or gain of flanking coding sequence, suggests intron loss via an mRNA intermediate, as does a nonsignificant trend toward loss of introns at adjacent positions. Many of the intron changes occurred in genes encoding proteins involved in nucleic acid-related processes, as previously found for intron gains in nematodes. Two changes occurred in the chloroquine resistance transporter, suggesting a role for positive selection in intron loss in Plasmodium. The dearth of intron loss and gain could be explained by the lack of known transposable elements in Plasmodium, since transposable elements and/or reverse transcriptase are thought to be necessary for both processes. The observed pattern suggests that the availability of stochastic intron loss and gain mutations can be a major determinant of changes in intron number.
Affiliation(s)
- Scott William Roy
- Department of Organismic and Evolutionary Biology, Harvard, Cambridge, Massachusetts 02138, USA.
|