1
Larue GE, Roy SW. Where the minor things are: a pan-eukaryotic survey suggests neutral processes may explain much of minor intron evolution. Nucleic Acids Res 2023; 51:10884-10908. [PMID: 37819006 PMCID: PMC10639083 DOI: 10.1093/nar/gkad797] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/24/2023] [Revised: 09/12/2023] [Accepted: 09/19/2023] [Indexed: 10/13/2023] Open
Abstract
Spliceosomal introns are gene segments removed from RNA transcripts by ribonucleoprotein machineries called spliceosomes. In some eukaryotes a second 'minor' spliceosome is responsible for processing a tiny minority of introns. Despite its seemingly modest role, minor splicing has persisted for roughly 1.5 billion years of eukaryotic evolution. Identifying minor introns in over 3000 eukaryotic genomes, we report diverse evolutionary histories including surprisingly high numbers in some fungi and green algae, repeated loss, as well as general biases in their positional and genic distributions. We estimate that ancestral minor intron densities were comparable to those of vertebrates, suggesting a trend of long-term stasis. Finally, three findings suggest a major role for neutral processes in minor intron evolution. First, highly similar patterns of minor and major intron evolution contrast with both functionalist and deleterious model predictions. Second, observed functional biases among minor intron-containing genes are largely explained by these genes' greater ages. Third, no association of intron splicing with cell proliferation in a minor intron-rich fungus suggests that regulatory roles are lineage-specific and thus cannot offer a general explanation for minor splicing's persistence. These data constitute the most comprehensive view of minor introns and their evolutionary history to date, and provide a foundation for future studies of these remarkable genetic elements.
Affiliation(s)
- Graham E Larue
  - Quantitative and Systems Biology Graduate Program, University of California Merced, Merced, CA 95343, USA
- Scott W Roy
  - Department of Molecular and Cell Biology, University of California Merced, Merced, CA 95343, USA
  - Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
2
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.
Affiliation(s)
- Roderic Guigó
  - Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia
  - Universitat Pompeu Fabra (UPF), Barcelona, Catalonia
3
Nuadthaisong J, Phetruen T, Techawisutthinan C, Chanarat S. Insights into the Mechanism of Pre-mRNA Splicing of Tiny Introns from the Genome of a Giant Ciliate Stentor coeruleus. Int J Mol Sci 2022; 23:ijms231810973. [PMID: 36142882 PMCID: PMC9505925 DOI: 10.3390/ijms231810973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 09/10/2022] [Accepted: 09/14/2022] [Indexed: 12/03/2022] Open
Abstract
Stentor coeruleus is a ciliate known for its regenerative ability. Recent genome sequencing reveals that its spliceosomal introns are exceptionally small. We wondered whether the multimegadalton spliceosome has any unique characteristics for removal of the tiny introns. First, we analyzed intron features and identified spliceosomal RNA/protein components. We found that all snRNAs are present, whereas many proteins are conserved but slightly reduced in size. Some regulators, such as Serine/Arginine-rich proteins, are noticeably undetected. Interestingly, while most parts of spliceosomal proteins, including Prp8's positively charged catalytic cavity, are conserved, regions of branching factors projecting to the active site are not. We conjecture that steric-clash avoidance between spliceosomal proteins and a sharply looped lariat might occur, and splicing regulation may differ from other species.
4
Poverennaya IV, Roytberg MA. Spliceosomal Introns: Features, Functions, and Evolution. BIOCHEMISTRY (MOSCOW) 2021; 85:725-734. [PMID: 33040717 DOI: 10.1134/s0006297920070019] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 11/23/2022]
Abstract
Spliceosomal introns, which have been found in most eukaryotic genes, are non-coding sequences excised from pre-mRNAs by a special complex called the spliceosome during mRNA splicing. Introns occur in both protein- and RNA-coding genes and can be found in coding and untranslated gene regions. Because intron sequences vary greatly due to a high rate of polymorphism, intron functions were for a long time associated only with alternative splicing, and intron evolution was viewed not as the evolution of an individual genomic element but within the framework of the evolution of the gene's exon-intron structure. Here, we review the theories of intron origin; evolutionary events in the exon-intron structure, such as intron gain, loss, and sliding; intron functions known to date; and mechanisms by which changes in intron features (length and phase) can affect the regulation of gene-mediated processes.
Affiliation(s)
- I V Poverennaya
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, 119991, Moscow, Russia
  - Institute of Mathematical Problems in Biology, Keldysh Branch of Institute of Applied Mathematics, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia
- M A Roytberg
  - Institute of Mathematical Problems in Biology, Keldysh Branch of Institute of Applied Mathematics, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141701, Russia
  - Higher School of Economics, Moscow, 101000, Russia
5
Rathore OS, Silva RD, Ascensão-Ferreira M, Matos R, Carvalho C, Marques B, Tiago MN, Prudêncio P, Andrade RP, Roignant JY, Barbosa-Morais NL, Martinho RG. NineTeen Complex-subunit Salsa is required for efficient splicing of a subset of introns and dorsal-ventral patterning. RNA (NEW YORK, N.Y.) 2020; 26:1935-1956. [PMID: 32963109 PMCID: PMC7668242 DOI: 10.1261/rna.077446.120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/12/2020] [Accepted: 09/07/2020] [Indexed: 06/11/2023]
Abstract
The NineTeen Complex (NTC), also known as pre-mRNA-processing factor 19 (Prp19) complex, regulates distinct spliceosome conformational changes necessary for splicing. During Drosophila midblastula transition, splicing is particularly sensitive to mutations in NTC-subunit Fandango, which suggests differential requirements of NTC during development. We show that NTC-subunit Salsa, the Drosophila ortholog of human RNA helicase Aquarius, is rate-limiting for splicing of a subset of small first introns during oogenesis, including the first intron of gurken. Germline depletion of Salsa and splice site mutations within gurken first intron impair both adult female fertility and oocyte dorsal-ventral patterning, due to an abnormal expression of Gurken. Supporting causality, the fertility and dorsal-ventral patterning defects observed after Salsa depletion could be suppressed by the expression of a gurken construct without its first intron. Altogether, our results suggest that one of the key rate-limiting functions of Salsa during oogenesis is to ensure the correct expression and efficient splicing of the first intron of gurken mRNA. Retention of gurken first intron compromises the function of this gene most likely because it undermines the correct structure and function of the transcript 5'UTR.
Affiliation(s)
- Om Singh Rathore
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Rui D Silva
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Mariana Ascensão-Ferreira
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Ricardo Matos
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Célia Carvalho
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Bruno Marques
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Margarida N Tiago
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
- Pedro Prudêncio
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Raquel P Andrade
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
  - Department of Medicine and Biomedical Sciences and Algarve Biomedical Center, Universidade do Algarve, 8005-139 Faro, Portugal
- Jean-Yves Roignant
  - Center for Integrative Genomics, Faculty of Biology and Medicine, University of Lausanne, CH-1015 Lausanne, Switzerland
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, 55128 Mainz, Germany
- Nuno L Barbosa-Morais
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
- Rui Gonçalo Martinho
  - Center for Biomedical Research (CBMR), Universidade do Algarve, Faro, 8005-139 Portugal
  - Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028 Lisboa, Portugal
  - Department of Medical Sciences and Institute for Biomedicine (iBiMED), Universidade de Aveiro, 3810-193 Aveiro, Portugal
6
Parenteau J, Abou Elela S. Introns: Good Day Junk Is Bad Day Treasure. Trends Genet 2019; 35:923-934. [PMID: 31668856 DOI: 10.1016/j.tig.2019.09.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/16/2019] [Revised: 08/28/2019] [Accepted: 09/19/2019] [Indexed: 02/01/2023]
Abstract
Introns are ubiquitous in eukaryotic transcripts. They are often viewed as junk RNA but the huge energetic burden of transcribing, removing, and degrading them suggests a significant evolutionary advantage. Ostensibly, an intron functions within the host pre-mRNA to regulate its splicing, transport, and degradation. However, recent studies have revealed an entirely new class of trans-acting functions where the presence of intronic RNA in the cell impacts the expression of other genes in trans. Here, we review possible new mechanisms of intron functions, with a focus on the role of yeast introns in regulating the cell growth response to starvation.
Affiliation(s)
- Julie Parenteau
  - Département de microbiologie et d'infectiologie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
- Sherif Abou Elela
  - Département de microbiologie et d'infectiologie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Sherbrooke, QC J1E 4K8, Canada
7
Krzyzanowski PM, Sircoulomb F, Yousif F, Normand J, La Rose J, E Francis K, Suarez F, Beck T, McPherson JD, Stein LD, Rottapel RK. Regional perturbation of gene transcription is associated with intrachromosomal rearrangements and gene fusion transcripts in high grade ovarian cancer. Sci Rep 2019; 9:3590. [PMID: 30837567 PMCID: PMC6401071 DOI: 10.1038/s41598-019-39878-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/16/2018] [Accepted: 01/30/2019] [Indexed: 01/10/2023] Open
Abstract
Genomic rearrangements are a hallmark of cancer biology and progression, allowing cells to rapidly transform through alterations in regulatory structures, changes in expression patterns, reprogramming of signaling pathways, and creation of novel transcripts via gene fusion events. Though functional gene fusions encoding oncogenic proteins are the most dramatic outcomes of genomic rearrangements, we investigated the relationship between rearrangements evidenced by fusion transcripts and local expression changes in cancer using transcriptome data alone. 9,953 gene fusion predictions from 418 primary serous ovarian cancer tumors were analyzed, identifying depletions of gene fusion breakpoints within coding regions of fused genes as well as an N-terminal enrichment of breakpoints within fused genes. We identified 48 genes with significant fusion-associated upregulation and furthermore demonstrate that significant regional overexpression of intact genes in patient transcriptomes occurs within 1 megabase of 78 novel gene fusions that function as central markers of these regions. We reveal that cancer transcriptomes select for gene fusions that preserve protein and protein domain coding potential. The association of gene fusion transcripts with neighboring gene overexpression supports rearrangements as a mechanism through which cancer cells remodel their transcriptomes and identifies a new way to utilize gene fusions as indicators of regional expression changes in diseased cells with only transcriptomic data.
Affiliation(s)
- Paul M Krzyzanowski
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
- Fabrice Sircoulomb
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Fouad Yousif
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
- Josee Normand
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Dalhousie University, Halifax, Nova Scotia, Canada
- Jose La Rose
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Kyle E Francis
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Fernando Suarez
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
- Tim Beck
  - Human Longevity Inc., San Diego, California, USA
- John D McPherson
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
  - University of California, Davis Medical Center, Sacramento, California, USA
- Lincoln D Stein
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Robert K Rottapel
  - Department of Medicine, University of Toronto, Ontario Institute for Cancer Research, MaRS Centre, Toronto, Ontario, Canada
  - Department of Immunology, University of Toronto, Princess Margaret Cancer Center, MaRS Centre, Toronto, Ontario, Canada
8
Rabokon A, Demkovych A, Sozinov A, Kozub N, Sozinov I, Pirko Y, Blume Y. Intron length polymorphism of β-tubulin genes of Aegilops biuncialis Vis. Cell Biol Int 2017; 43:1031-1039. [PMID: 29024189 DOI: 10.1002/cbin.10886] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/09/2017] [Accepted: 10/08/2017] [Indexed: 12/13/2022]
Abstract
Intron-specific DNA polymorphism is present among plant β-tubulin gene family members and is considered to be one of the molecular markers based on the difference of tubulin introns length assayed both separately (TBP: 1st intron) or in combination (h-TBP: 1st and 2nd introns). These two approaches are possibly useful for wheat breeding programs, since TBP and h-TBP help to differentiate between the accessions of Aegilops biuncialis Vis., a wild relative of wheat. PCR-derived polymorphic fragments were resolved by PAGE electrophoresis. The length of amplicons varied significantly (395-3900 bp for TBP and 466-3440 bp for h-TBP), while the numbers of polymorphic bands were 21 for TBP and 23 for h-TBP, respectively. PIC mean value was circa 0.3. Dendrograms constructed on the basis of the Nei and Li coefficient with the high bootstrap support reveal a similar order of hierarchy for the samples analyzed using both methods. Thus, both techniques uncover DNA polymorphism level sufficiently high to distinguish different accessions of Ae. biuncialis Vis.
Affiliation(s)
- Anastasiia Rabokon
  - Institute of Food Biotechnology and Genomics, Osipovskogo St., 2a, Kyiv-123, 04123, Ukraine
- Andrii Demkovych
  - Institute of Food Biotechnology and Genomics, Osipovskogo St., 2a, Kyiv-123, 04123, Ukraine
- Alexei Sozinov
  - Institute of Food Biotechnology and Genomics, Osipovskogo St., 2a, Kyiv-123, 04123, Ukraine
- Natalia Kozub
  - Institute of Food Biotechnology and Genomics, Osipovskogo St., 2a, Kyiv-123, 04123, Ukraine
  - Institute of Plant Protection, Vasylkivska St., 33, Kyiv-022, 03022, Ukraine
- Igor Sozinov
  - Institute of Plant Protection, Vasylkivska St., 33, Kyiv-022, 03022, Ukraine
- Yaroslav Pirko
  - Institute of Food Biotechnology and Genomics, Osipovskogo St., 2a, Kyiv-123, 04123, Ukraine
- Yaroslav Blume
  - Institute of Food Biotechnology and Genomics, Osipovskogo St., 2a, Kyiv-123, 04123, Ukraine
9
mRNA-Associated Processes and Their Influence on Exon-Intron Structure in Drosophila melanogaster. G3-GENES GENOMES GENETICS 2016; 6:1617-26. [PMID: 27172210 PMCID: PMC4889658 DOI: 10.1534/g3.116.029231] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Abstract
mRNA-associated processes and gene structure in eukaryotes are typically treated as separate research subjects. Here, we bridge this separation and leverage the extensive multidisciplinary work on Drosophila melanogaster to examine the roles that capping, splicing, cleavage/polyadenylation, and telescripting (i.e., the protection of nascent transcripts from premature cleavage/polyadenylation by the splicing factor U1) might play in shaping exon-intron architecture in protein-coding genes. Our findings suggest that the distance between subsequent internal 5′ splice sites (5′ss) in Drosophila genes is constrained such that telescripting effects are maximized, in theory, and thus nascent transcripts are less vulnerable to premature termination. Exceptionally weak 5′ss and constraints on intron-exon size at the gene 5′ end also indicate that capping might enhance the recruitment of U1 and, in turn, promote telescripting at this location. Finally, a positive correlation between last exon length and last 5′ss strength suggests that optimal donor splice sites in the proximity of the pre-mRNA tail may inhibit the processing of downstream polyadenylation signals more than weak donor splice sites do. These findings corroborate and build upon previous experimental and computational studies on Drosophila genes. They support the possibility, hitherto scantly explored, that mRNA-associated processes impose significant constraints on the evolution of eukaryotic gene structure.
10
Abstract
The presence of intervening sequences, termed introns, is a defining characteristic of eukaryotic nuclear genomes. Once transcribed into pre-mRNA, these introns must be removed within the spliceosome before export of the processed mRNA to the cytoplasm, where it is translated into protein. Although intron loss has been demonstrated experimentally, several mysteries remain regarding the origin and propagation of introns. Indeed, documented evidence of gain of an intron has only been suggested by phylogenetic analyses. We report the use of a strategy that detects selected intron gain and loss events. We have experimentally verified, to our knowledge, the first demonstrations of intron transposition in any organism. From our screen, we detected two separate intron gain events characterized by the perfect transposition of a reporter intron into the yeast genes RPL8B and ADH2, respectively. We show that the newly acquired introns are able to be removed from their respective pre-mRNAs by the spliceosome. Additionally, the novel allele, RPL8Bint, is functional when overexpressed within the genome in a strain lacking the Rpl8 paralogue RPL8A, demonstrating that the gene targeted for intronogenesis is functional.
11
Verhelst B, Van de Peer Y, Rouzé P. The complex intron landscape and massive intron invasion in a picoeukaryote provides insights into intron evolution. Genome Biol Evol 2014; 5:2393-401. [PMID: 24273312 PMCID: PMC3879977 DOI: 10.1093/gbe/evt189] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 01/01/2023] Open
Abstract
Genes in pieces and spliceosomal introns are a landmark of eukaryotes, with intron invasion usually assumed to have happened early on in evolution. Here, we analyze the intron landscape of Micromonas, a unicellular green alga in the Mamiellophyceae lineage, demonstrating the coexistence of several classes of introns and the occurrence of recent massive intron invasion. This study focuses on two strains, CCMP1545 and RCC299, and their related individuals from ocean samplings, showing that they not only harbor different classes of introns depending on their location in the genome, as for other Mamiellophyceae, but also uniquely carry several classes of repeat introns. These introns, dubbed introner elements (IEs), are found at novel positions in genes and have conserved sequences, contrary to canonical introns. This IE invasion has a huge impact on the genome, doubling the number of introns in the CCMP1545 strain. We hypothesize that each IE class originated from a single ancestral IE that has been colonizing the genome after strain divergence by inserting copies of itself into genes by intron transposition, likely involving reverse splicing. Along with similar cases recently observed in other organisms, our observations in Micromonas strains shed a new light on the evolution of introns, suggesting that intron gain is more widespread than previously thought.
Affiliation(s)
- Bram Verhelst
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium
12
Park SG, Hannenhalli S, Choi SS. Conservation in first introns is positively associated with the number of exons within genes and the presence of regulatory epigenetic signals. BMC Genomics 2014; 15:526. [PMID: 24964727 PMCID: PMC4085337 DOI: 10.1186/1471-2164-15-526] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 01/07/2014] [Accepted: 06/18/2014] [Indexed: 01/04/2023] Open
Abstract
Background: Genomes of higher eukaryotes have surprisingly long first introns and in some cases, the first introns have been shown to have higher conservation relative to other introns. However, the functional relevance of conserved regions in the first introns is poorly understood. Leveraging the recent ENCODE data, here we assess potential regulatory roles of conserved regions in the first intron of human genes.
Results: We first show that relative to other downstream introns, the first introns are enriched for blocks of highly conserved sequences. We also found that the first introns are enriched for several chromatin marks indicative of active regulatory regions and this enrichment of regulatory marks is correlated with enrichment of conserved blocks in the first intron; the enrichments of conservation and regulatory marks in first intron are not entirely explained by a general, albeit variable, bias for certain marks toward the 5’ end of introns. Interestingly, conservation as well as proportions of active regulatory chromatin marks in the first intron of a gene correlates positively with the numbers of exons in the gene but the correlation is significantly weakened in second introns and negligible beyond the second intron. The first intron conservation is also positively correlated with the gene’s expression level in several human tissues. Finally, a gene-wise analysis shows significant enrichments of active chromatin marks in conserved regions of first introns, relative to the conserved regions in other introns of the same gene.
Conclusions: Taken together, our analyses strongly suggest that first introns are enriched for active transcriptional regulatory signals under purifying selection.
Affiliation(s)
- Sridhar Hannenhalli
  - Department of Cell Biology and Molecular Genetics, Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, MD 20742, USA
13
Frequency of intron loss correlates with processed pseudogene abundance: a novel strategy to test the reverse transcriptase model of intron loss. BMC Biol 2013; 11:23. [PMID: 23497167 PMCID: PMC3652778 DOI: 10.1186/1741-7007-11-23] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 02/21/2013] [Accepted: 03/05/2013] [Indexed: 11/23/2022] Open
Abstract
Background: Although intron loss in evolution has been described, the mechanism involved is still unclear. Three models have been proposed, the reverse transcriptase (RT) model, genomic deletion model and double-strand-break repair model. The RT model, also termed mRNA-mediated intron loss, suggests that cDNA molecules reverse transcribed from spliced mRNA recombine with genomic DNA causing intron loss. Many studies have attempted to test this model based on its predictions, such as simultaneous loss of adjacent introns, 3'-side bias of intron loss, and germline expression of intron-lost genes. Evidence either supporting or opposing the model has been reported. The mechanism of intron loss proposed in the RT model shares the process of reverse transcription with the formation of processed pseudogenes. If the RT model is correct, genes that have produced more processed pseudogenes are more likely to undergo intron loss.
Results: In the present study, we observed that the frequency of intron loss is correlated with processed pseudogene abundance by analyzing a new dataset of intron loss obtained in mice and rats. Furthermore, we found that mRNA molecules of intron-lost genes are mostly translated on free cytoplasmic ribosomes, a feature shared by mRNA molecules of the parental genes of processed pseudogenes and long interspersed elements. This feature is likely convenient for intron-lost gene mRNA molecules to be reverse transcribed. Analyses of adjacent intron loss, 3'-side bias of intron loss, and germline expression of intron-lost genes also support the RT model.
Conclusions: Compared with previous evidence, the correlation between the abundance of processed pseudogenes and intron loss frequency more directly supports the RT model of intron loss. Exploring such a correlation is a new strategy to test the RT model in organisms with abundant processed pseudogenes.
14
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 217] [Impact Index Per Article: 18.1] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022] Open
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence; instead, the exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor in evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg, 38A, Bethesda, MD 20894, USA
|
15
|
Evolutionary Genomics of Colias Phosphoglucose Isomerase (PGI) Introns. J Mol Evol 2012; 74:96-111. [DOI: 10.1007/s00239-012-9492-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/24/2011] [Accepted: 02/15/2012] [Indexed: 10/28/2022]
|
16
|
Parenteau J, Durand M, Morin G, Gagnon J, Lucier JF, Wellinger RJ, Chabot B, Elela SA. Introns within ribosomal protein genes regulate the production and function of yeast ribosomes. Cell 2011; 147:320-31. [PMID: 22000012 DOI: 10.1016/j.cell.2011.08.044] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.2] [Received: 05/09/2011] [Revised: 07/05/2011] [Accepted: 08/22/2011] [Indexed: 12/13/2022]
Abstract
In budding yeast, the most abundantly spliced pre-mRNAs encode ribosomal proteins (RPs). To investigate the contribution of splicing to ribosome production and function, we systematically eliminated introns from all RP genes to evaluate their impact on RNA expression, pre-rRNA processing, cell growth, and response to stress. The majority of introns were required for optimal cell fitness or growth under stress. Most introns are found in duplicated RP genes, and surprisingly, in the majority of cases, deleting the intron from one gene copy affected the expression of the other in a nonreciprocal manner. Consistently, 70% of all duplicated genes were asymmetrically expressed, and both introns and gene deletions displayed copy-specific phenotypic effects. Together, our results indicate that splicing in yeast RP genes mediates intergene regulation and implicate the expression ratio of duplicated RP genes in modulating ribosome function.
Affiliation(s)
- Julie Parenteau
- Laboratoire de génomique fonctionnelle de l'Université de Sherbrooke, Département de microbiologie et d'infectiologie, Faculté de médecine et des sciences de la santé, Université de Sherbrooke, Québec, Canada
|
17
|
Cohen NE, Shen R, Carmel L. The role of reverse transcriptase in intron gain and loss mechanisms. Mol Biol Evol 2011; 29:179-86. [PMID: 21804076 DOI: 10.1093/molbev/msr192] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 01/02/2023] Open
Abstract
Intron density is highly variable across eukaryotic species. It seems that different lineages have experienced considerably different levels of intron gain and loss events, but the reasons for this are not well known. A large number of mechanisms for intron loss and gain have been suggested, and most of them have at least some level of indirect support. We therefore figured out that the variability in intron density can be a reflection of the fact that different mechanisms are active in different lineages. Quite a number of these putative mechanisms, both for intron loss and for intron gain, postulate that the enzyme reverse transcriptase (RT) has a key role in the process. In this paper, we lay out three predictions whose approval or falsification gives indication for the involvement of RT in intron gain and loss processes. Testing these predictions requires data on the intron gain and loss rates of individual genes along different branches of the eukaryotic phylogenetic tree. So far, such rates could not be computed, and hence, these predictions could not be rigorously evaluated. Here, we use a maximum likelihood algorithm that we have devised in the past, Evolutionary Reconstruction by Expectation Maximization, which allows the estimation of such rates. Using this algorithm, we computed the intron loss and gain rates of more than 300 genes in each branch of the phylogenetic tree of 19 eukaryotic species. Based on that we found only little support for RT activity in intron gain. In contrast, we suggest that RT-mediated intron loss is a mechanism that is very efficient in removing introns, and thus, its levels of activity may be a major determinant of intron number. Moreover, we found that intron gain and loss rates are negatively correlated in intron-poor species but are positively correlated for intron-rich species. 
One explanation for this is that intron gain and loss mechanisms in intron-rich species (like metazoans) share a common mechanistic component, albeit not an RT.
Affiliation(s)
- Noa E Cohen
- Department of Genetics, The Alexander Silberman Institute of Life Sciences, Faculty of Science, The Hebrew University of Jerusalem, Jerusalem, Israel
|
18
|
Da Lage JL, Maczkowiak F, Cariou ML. Phylogenetic distribution of intron positions in alpha-amylase genes of bilateria suggests numerous gains and losses. PLoS One 2011; 6:e19673. [PMID: 21611157 PMCID: PMC3096672 DOI: 10.1371/journal.pone.0019673] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 12/24/2010] [Accepted: 04/03/2011] [Indexed: 11/19/2022] Open
Abstract
Most eukaryotes have at least some genes interrupted by introns. While it is well accepted that introns were already present at moderate density in the last eukaryote common ancestor, the conspicuous diversity of intron density among genomes suggests a complex evolutionary history, with marked differences between phyla. The question of the rates of intron gains and loss in the course of evolution and factors influencing them remains controversial. We have investigated a single gene family, alpha-amylase, in 55 species covering a variety of animal phyla. Comparison of intron positions across phyla suggests a complex history, with a likely ancestral intronless gene undergoing frequent intron loss and gain, leading to extant intron/exon structures that are highly variable, even among species from the same phylum. Because introns are known to play no regulatory role in this gene and there is no alternative splicing, the structural differences may be interpreted more easily: intron positions, sizes, losses or gains may be more likely related to factors linked to splicing mechanisms and requirements, and to recognition of introns and exons, or to more extrinsic factors, such as life cycle and population size. We have shown that intron losses outnumbered gains in recent periods, but that "resets" of intron positions occurred at the origin of several phyla, including vertebrates. Rates of gain and loss appear to be positively correlated. No phase preference was found. We also found evidence for parallel gains and for intron sliding. Presence of introns at given positions was correlated to a strong protosplice consensus sequence AG/G, which was much weaker in the absence of intron. In contrast, recent intron insertions were not associated with a specific sequence. In animal Amy genes, population size and generation time seem to have played only minor roles in shaping gene structures.
Affiliation(s)
- Jean-Luc Da Lage
- Laboratoire Evolution, génomes et spéciation, UPR 9034 CNRS, Gif sur Yvette, France.
|
19
|
Chalamcharla VR, Curcio MJ, Belfort M. Nuclear expression of a group II intron is consistent with spliceosomal intron ancestry. Genes Dev 2010; 24:827-36. [PMID: 20351053 DOI: 10.1101/gad.1905010] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Indexed: 11/24/2022]
Abstract
Group II introns are self-splicing RNAs found in eubacteria, archaea, and eukaryotic organelles. They are mechanistically similar to the metazoan nuclear spliceosomal introns; therefore, group II introns have been invoked as the progenitors of the eukaryotic pre-mRNA introns. However, the ability of group II introns to function outside of the bacteria-derived organelles is debatable, since they are not found in the nuclear genomes of eukaryotes. Here, we show that the Lactococcus lactis Ll.LtrB group II intron splices accurately and efficiently from different pre-mRNAs in a eukaryote, Saccharomyces cerevisiae. However, a pre-mRNA harboring a group II intron is spliced predominantly in the cytoplasm and is subject to nonsense-mediated mRNA decay (NMD), and the mature mRNA from which the group II intron is spliced is poorly translated. In contrast, a pre-mRNA bearing the Tetrahymena group I intron or the yeast spliceosomal ACT1 intron at the same location is not subject to NMD, and the mature mRNA is translated efficiently. Thus, a group II intron can splice from a nuclear transcript, but RNA instability and translation defects would have favored intron loss or evolution into protein-dependent spliceosomal introns, consistent with the bacterial group II intron ancestry hypothesis.
|
20
|
van Diepeningen AD, Goedbloed DJ, Slakhorst SM, Koopmanschap AB, Maas MFPM, Hoekstra RF, Debets AJM. Mitochondrial recombination increases with age in Podospora anserina. Mech Ageing Dev 2010; 131:315-22. [PMID: 20226205 DOI: 10.1016/j.mad.2010.03.001] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 11/19/2009] [Revised: 03/02/2010] [Accepted: 03/03/2010] [Indexed: 12/15/2022]
Abstract
With uniparental inheritance of mitochondria, there seems little reason for homologous recombination in mitochondria, but the machinery for mitochondrial recombination is quite well-conserved in many eukaryote species. In fungi and yeasts heteroplasmons may be formed when strains fuse and transfer of organelles takes place, making it possible to study mitochondrial recombination when introduced mitochondria contain different markers. A survey of wild-type isolates from a local population of the filamentous fungus Podospora anserina for the presence of seven optional mitochondrial introns indicated that mitochondrial recombination does take place in nature. Moreover the recombination frequency appeared to be correlated with age: the more rapidly ageing fraction of the population had a significantly lower linkage disequilibrium indicating more recombination. Direct confrontation experiments with heterokaryon incompatible strains with different mitochondrial markers at different (relative) age confirmed that mitochondrial recombination increases with age. We propose that with increasing mitochondrial damage over time, mitochondrial recombination - even within a homoplasmic population of mitochondria - is a mechanism that may restore mitochondrial function.
Affiliation(s)
- Anne D van Diepeningen
- Laboratory of Genetics, Plant Sciences, Wageningen University, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
|
21
|
Bradnam KR, Korf I. Longer first introns are a general property of eukaryotic gene structure. PLoS One 2008; 3:e3093. [PMID: 18769727 PMCID: PMC2518113 DOI: 10.1371/journal.pone.0003093] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.9] [Received: 06/12/2008] [Accepted: 08/11/2008] [Indexed: 11/19/2022] Open
Abstract
While many properties of eukaryotic gene structure are well characterized, differences in the form and function of introns that occur at different positions within a transcript are less well understood. In particular, the dynamics of intron length variation with respect to intron position has received relatively little attention. This study analyzes all available data on intron lengths in GenBank and finds a significant trend of increased length in first introns throughout a wide range of species. This trend was found to be even stronger when using high-confidence gene annotation data for three model organisms (Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster) which show that the first intron in the 5' UTR is--on average--significantly longer than all downstream introns within a gene. A partial explanation for increased first intron length in A. thaliana is suggested by the increased frequency of certain motifs that are present in first introns. The phenomenon of longer first introns can potentially be used to improve gene prediction software and also to detect errors in existing gene annotations.
Affiliation(s)
- Keith R Bradnam
- Genome Center, University of California Davis, Davis, California, USA.
|
22
|
Mrinal N, Nagaraju J. Intron loss is associated with gain of function in the evolution of the gloverin family of antibacterial genes in Bombyx mori. J Biol Chem 2008; 283:23376-87. [PMID: 18524767 DOI: 10.1074/jbc.m801080200] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Indexed: 11/06/2022] Open
Abstract
Gene duplication is a characteristic feature of eukaryotic genomes. Here we investigated the role of gene duplication in the evolution of the gloverin family of antibacterial genes (Bmglv1, Bmglv2, Bmglv3, and Bmglv4) in Bombyx mori. We observed the following two significant changes during the first duplication event: (i) loss of intronV, located in the 3'-untranslated region (UTR) of the ancestral gene Bmglv1, and (ii) 12-bp deletion in exon3. We show that loss of intronV during Bmglv1 to Bmglv2 duplication was associated with embryonic expression of Bmglv2. Gel mobility shift, chromatin immunoprecipitation, and immunodepletion assays identified chorion factor 2, a zinc finger protein, as the repressor molecule that bound to a 10-bp regulatory motif in intronV of Bmglv1 and repressed its transcription. gloverin paralogs that lacked intronV were independent of chorion factor 2 regulation and expressed in embryo. These results suggest that change in cis-regulation because of intron loss resulted in embryonic expression of Bmglv2-4, a gain of function over Bmglv1. Studies on the significance of intron loss have focused on introns present within the coding sequences for their potential effect on the open reading frame, whereas introns present in the UTRs of the genes were not given due attention. This study emphasizes the regulatory function of the 3'-UTR intron. In addition, we also studied the genomic loss and show that "in-frame" deletion of 12 nucleotides led to loss of amino acids IHDF resulting in the generation of a prepro-processing site in BmGlv2. As a result, the N-terminal pro-part of BmGlv2, but not of BmGlv1, gets processed in an infection-dependent manner suggesting that prepro-processing is an evolved feature in Gloverins.
Affiliation(s)
- Nirotpal Mrinal
- Laboratory of Molecular Genetics, Centre for DNA Fingerprinting and Diagnostics, Hyderabad-500076, India
|
23
|
Basu MK, Makalowski W, Rogozin IB, Koonin EV. U12 intron positions are more strongly conserved between animals and plants than U2 intron positions. Biol Direct 2008; 3:19. [PMID: 18479526 PMCID: PMC2426677 DOI: 10.1186/1745-6150-3-19] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 04/24/2008] [Accepted: 05/14/2008] [Indexed: 11/17/2022] Open
Abstract
We report that the positions of minor, U12 introns are conserved in orthologous genes from human and Arabidopsis to an even greater extent than the positions of the major, U2 introns. The U12 introns, especially conserved ones, are concentrated in the 5'-portions of plant and animal genes, whereas U12 to U2 conversion occurs preferentially in the 3'-portions of genes. These results are compatible with the hypothesis that the high level of conservation of U12 intron positions and their persistence in genomes despite the unidirectional U12 to U2 conversion are explained by the role of the slowly excised U12 introns in down-regulation of gene expression.
Affiliation(s)
- Malay Kumar Basu
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
24
|
O'Toole N, Hattori M, Andres C, Iida K, Lurin C, Schmitz-Linneweber C, Sugita M, Small I. On the expansion of the pentatricopeptide repeat gene family in plants. Mol Biol Evol 2008; 25:1120-8. [PMID: 18343892 DOI: 10.1093/molbev/msn057] [Citation(s) in RCA: 285] [Impact Index Per Article: 17.8] [Indexed: 11/14/2022] Open
Abstract
Pentatricopeptide repeat (PPR) proteins form a huge family in plants (450 members in Arabidopsis and 477 in rice) defined by tandem repetitions of characteristic sequence motifs. Some of these proteins have been shown to play a role in posttranscriptional processes within organelles, and they are thought to be sequence-specific RNA-binding proteins. The origins of this family are obscure as they are lacking from almost all prokaryotes, and the spectacular expansion of the family in land plants is equally enigmatic. In this study, we investigate the growth of the family in plants by undertaking a genome-wide identification and comparison of the PPR genes of 3 organisms: the flowering plants Arabidopsis thaliana and Oryza sativa and the moss Physcomitrella patens. A large majority of the PPR genes in each of the flowering plants are intronless. In contrast, most of the 103 PPR genes in Physcomitrella are intron rich. A phylogenetic comparison of the PPR genes in all 3 species shows similarities between the intron-rich PPR genes in Physcomitrella and the few intron-rich PPR genes in higher plants. Intron-poor PPR genes in all 3 species also display a bias toward a position of their introns at their 5' ends. These results provide compelling evidence that one or more waves of retrotransposition were responsible for the expansion of the PPR gene family in flowering plants. The differing numbers of PPR proteins are highly correlated with differences in organellar RNA editing between the 3 species.
Affiliation(s)
- Nicholas O'Toole
- Centre for Computational Systems Biology, University of Western Australia, Perth, Australia
|
25
|
Zhou H, Lin K. Excess of microRNAs in large and very 5' biased introns. Biochem Biophys Res Commun 2008; 368:709-15. [PMID: 18249189 DOI: 10.1016/j.bbrc.2008.01.117] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 01/13/2008] [Accepted: 01/27/2008] [Indexed: 11/29/2022]
Abstract
Many microRNAs (miRNAs) and small nucleolar RNAs (snoRNAs) are located within the introns of genes in eukaryotes. In contrast to intronic snoRNAs, intronic miRNAs are processed from unspliced intronic regions before the catalysis of splicing in vertebrates. By analyzing the distribution patterns of the length and position of the introns hosting these two groups of small RNA genes, we observed that both human and mouse intronic miRNAs tended to be present in large introns, and miRNA host introns have a more 5'-biased position distribution compared with all other introns in the two genomes. These observations indicate that the negative selection of functional constraints might affect the intron size in both genomes. Interestingly, the very 5'-biased positions of miRNA host introns may be necessary for the transcription and regulation of intronic miRNAs to utilize the regulatory signals within the 5'-UTRs of their host genes.
Affiliation(s)
- Hongjun Zhou
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and College of Life Sciences, Beijing Normal University, No. 19, Xinjiekouwai Street, Beijing 100875, China
|
26
|
Artamonova II, Gelfand MS. Comparative Genomics and Evolution of Alternative Splicing: The Pessimists' Science. Chem Rev 2007; 107:3407-30. [PMID: 17645315 DOI: 10.1021/cr068304c] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Affiliation(s)
- Irena I Artamonova
- Group of Bioinformatics, Vavilov Institute of General Genetics, RAS, Gubkina 3, Moscow 119991, Russia
|
27
|
Gazave E, Marqués-Bonet T, Fernando O, Charlesworth B, Navarro A. Patterns and rates of intron divergence between humans and chimpanzees. Genome Biol 2007; 8:R21. [PMID: 17309804 PMCID: PMC1852421 DOI: 10.1186/gb-2007-8-2-r21] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Received: 08/02/2006] [Revised: 12/08/2006] [Accepted: 02/19/2007] [Indexed: 01/08/2023] Open
Abstract
An analysis of human-chimpanzee intron divergence shows strong correlations between intron length and divergence and GC-content. Background: Introns, which constitute the largest fraction of eukaryotic genes and which had been considered to be neutral sequences, are increasingly acknowledged as having important functions. Several studies have investigated levels of evolutionary constraint along introns and across classes of introns of different length and location within genes. However, thus far these studies have yielded contradictory results. Results: We present the first analysis of human-chimpanzee intron divergence, in which differences in the number of substitutions per intronic site (Ki) can be interpreted as the footprint of different intensities and directions of the pressures of natural selection. Our main findings are as follows: there was a strong positive correlation between intron length and divergence; there was a strong negative correlation between intron length and GC content; and divergence rates vary along introns and depending on their ordinal position within genes (for instance, first introns are more GC rich, longer and more divergent, and divergence is lower at the 3' and 5' ends of all types of introns). Conclusion: We show that the higher divergence of first introns is related to their larger size. Also, the lower divergence of short introns suggests that they may harbor a relatively greater proportion of regulatory elements than long introns. Moreover, our results are consistent with the presence of functionally relevant sequences near the 5' and 3' ends of introns. Finally, our findings suggest that other parts of introns may also be under selective constraints.
Affiliation(s)
- Elodie Gazave
- Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Carrer Dr Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Tomàs Marqués-Bonet
- Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Carrer Dr Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Olga Fernando
- Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Carrer Dr Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Instituto de Tecnologia Química e Biológica (ITQB), Universidade Nova de Lisboa, Av. da República (EAN) 2781-901 Oeiras, Lisboa, Portugal
- Brian Charlesworth
- Institute of Evolutionary Biology, University of Edinburgh, West Mains Road, Edinburgh, Scotland, EH7 3JT, UK
- Arcadi Navarro
- Institucio Catalana de Recerca i Estudis Avancats (ICREA), Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Carrer Dr Aiguader 88, 08003 Barcelona, Catalonia, Spain
|
28
|
Abstract
Research into the origins of introns is at a critical juncture in the resolution of theories on the evolution of early life (which came first, RNA or DNA?), the identity of LUCA (the last universal common ancestor, was it prokaryotic- or eukaryotic-like?), and the significance of noncoding nucleotide variation. One early notion was that introns would have evolved as a component of an efficient mechanism for the origin of genes. But alternative theories emerged as well. From the debate between the "introns-early" and "introns-late" theories came the proposal that introns arose before the origin of genetically encoded proteins and DNA, and the more recent "introns-first" theory, which postulates the presence of introns at that early evolutionary stage from a reconstruction of the "RNA world." Here we review seminal and recent ideas about intron origins. Recent discoveries about the patterns and causes of intron evolution make this one of the most hotly debated and exciting topics in molecular evolutionary biology today.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
|
29
|
Kertész S, Kerényi Z, Mérai Z, Bartos I, Pálfy T, Barta E, Silhavy D. Both introns and long 3'-UTRs operate as cis-acting elements to trigger nonsense-mediated decay in plants. Nucleic Acids Res 2006; 34:6147-57. [PMID: 17088291 PMCID: PMC1693880 DOI: 10.1093/nar/gkl737] [Citation(s) in RCA: 174] [Impact Index Per Article: 9.7] [Indexed: 11/12/2022] Open
Abstract
Nonsense-mediated mRNA decay (NMD) is a eukaryotic quality control mechanism that identifies and eliminates aberrant mRNAs containing a premature termination codon (PTC). Although the key trans-acting NMD factors UPF1, UPF2 and UPF3 are conserved in yeast and mammals, the cis-acting NMD elements are different. In yeast, short specific sequences or long 3'-untranslated regions (3'-UTRs) render an mRNA subject to NMD, while in mammals, 3'-UTR-located introns trigger NMD. Plants also possess an NMD system, although little is known about how it functions. We have developed an agroinfiltration-based transient NMD assay system and defined the cis-acting elements that mediate plant NMD. We show that unusually long 3'-UTRs or the presence of introns in the 3'-UTR can subject mRNAs to NMD. These data suggest that both long 3'-UTR-based and intron-based PTC definition operated in the common ancestors of extant eukaryotes (stem eukaryotes) and support the theory that intron-based NMD facilitated the spreading of introns in stem eukaryotes. We have also identified plant UPF1 and showed that tethering of UPF1 to either the 5'- or 3'-UTR of an mRNA results in reduced transcript accumulation. Thus, plant UPF1 might bind to mRNA in a late, irreversible phase of NMD.
Affiliation(s)
- Zsuzsanna Mérai
- Agricultural Biotechnology Center, Gödöllő, Hungary
- Department of Genetics, Eötvös Loránd University, Budapest, Hungary
- Imre Bartos
- Institute of Physics, Eötvös Loránd University, Budapest, Hungary
- Tamás Pálfy
- Agricultural Biotechnology Center, Gödöllő, Hungary
- Endre Barta
- Agricultural Biotechnology Center, Gödöllő, Hungary
- Dániel Silhavy
- Agricultural Biotechnology Center, Gödöllő, Hungary
- To whom correspondence should be addressed at H-2101 Gödöllő, P.O. Box 411, Hungary. Tel: +36 28 526 194; Fax: +36 28 526 145;
|
30
|
Nielsen H, Wernersson R. An overabundance of phase 0 introns immediately after the start codon in eukaryotic genes. BMC Genomics 2006; 7:256. [PMID: 17034638 PMCID: PMC1626468 DOI: 10.1186/1471-2164-7-256] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 06/07/2006] [Accepted: 10/11/2006] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND A knowledge of the positions of introns in eukaryotic genes is important for understanding the evolution of introns. Despite this, there has been relatively little focus on the distribution of intron positions in genes. RESULTS In proteins with signal peptides, there is an overabundance of phase 1 introns around the region of the signal peptide cleavage site. This has been described before. But in proteins without signal peptides, a novel phenomenon is observed: There is a sharp peak of phase 0 intron positions immediately following the start codon, i.e. between codons 1 and 2. This effect is seen in a wide range of eukaryotes: Vertebrates, arthropods, fungi, and flowering plants. Proteins carrying this start codon intron are found to comprise a special class of relatively short, lysine-rich and conserved proteins with an overrepresentation of ribosomal proteins. In addition, there is a peak of phase 0 introns at position 5 in Drosophila genes with signal peptides, predominantly representing cuticle proteins. CONCLUSION There is an overabundance of phase 0 introns immediately after the start codon in eukaryotic genes, which has been described before only for human ribosomal proteins. We give a detailed description of these start codon introns and the proteins that contain them.
Affiliation(s)
- Henrik Nielsen
- Center for Biological Sequence Analysis, Technical University of Denmark, Building 208, 2800 Lyngby, Denmark
- Rasmus Wernersson
- Center for Biological Sequence Analysis, Technical University of Denmark, Building 208, 2800 Lyngby, Denmark
|
31
|
Lin H, Zhu W, Silva JC, Gu X, Buell CR. Intron gain and loss in segmentally duplicated genes in rice. Genome Biol 2006; 7:R41. [PMID: 16719932 PMCID: PMC1779517 DOI: 10.1186/gb-2006-7-5-r41] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.6] [Received: 01/30/2006] [Revised: 03/21/2006] [Accepted: 04/24/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Introns are under less selection pressure than exons, and consequently, intronic sequences have a higher rate of gain and loss than exons. In a number of plant species, a large portion of the genome has been segmentally duplicated, giving rise to a large set of duplicated genes. The recent completion of the rice genome in which segmental duplication has been documented has allowed us to investigate intron evolution within rice, a diploid monocotyledonous species. RESULTS Analysis of segmental duplication in rice revealed that 159 Mb of the 371 Mb genome and 21,570 of the 43,719 non-transposable element-related genes were contained within a duplicated region. In these duplicated regions, 3,101 collinear paired genes were present. Using this set of segmentally duplicated genes, we investigated intron evolution from full-length cDNA-supported non-transposable element-related gene models of rice. Using gene pairs that have an ortholog in the dicotyledonous model species Arabidopsis thaliana, we identified more intron loss (49 introns within 35 gene pairs) than intron gain (5 introns within 5 gene pairs) following segmental duplication. We were unable to demonstrate preferential intron loss at the 3' end of genes as previously reported in mammalian genomes. However, we did find that the four nucleotides of exons that flank lost introns had less frequently used 4-mers. CONCLUSION We observed that intron evolution within rice following segmental duplication is largely dominated by intron loss. In two of the five cases of intron gain within segmentally duplicated genes, the gained sequences were similar to transposable elements.
Affiliation(s)
- Haining Lin
- The Institute for Genomic Research, Medical Center Drive, Rockville, MD 20850, USA
- Wei Zhu
- The Institute for Genomic Research, Medical Center Drive, Rockville, MD 20850, USA
- Joana C Silva
- The Institute for Genomic Research, Medical Center Drive, Rockville, MD 20850, USA
- Xun Gu
- Department of Genetics, Development, and Cell Biology, Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, IA 50011, USA
- C Robin Buell
- The Institute for Genomic Research, Medical Center Drive, Rockville, MD 20850, USA
32
Chung BYW, Simons C, Firth AE, Brown CM, Hellens RP. Effect of 5'UTR introns on gene expression in Arabidopsis thaliana. BMC Genomics 2006; 7:120. [PMID: 16712733 PMCID: PMC1482700 DOI: 10.1186/1471-2164-7-120] [Citation(s) in RCA: 144] [Impact Index Per Article: 8.0] [Received: 04/18/2006] [Accepted: 05/19/2006] [Indexed: 11/02/2022] Open
Abstract
BACKGROUND: The majority of introns in gene transcripts are found within the coding sequences (CDSs). A small but significant fraction of introns are also found to reside within the untranslated regions (5'UTRs and 3'UTRs) of expressed sequences. Alignment of the whole genome and expressed sequence tags (ESTs) of the model plant Arabidopsis thaliana has identified introns residing in both coding and non-coding regions of the genome.
RESULTS: A bioinformatic analysis revealed some interesting observations: (1) the density of introns in 5'UTRs is similar to that in CDSs but much higher than that in 3'UTRs; (2) the 5'UTR introns are preferentially located close to the initiating ATG codon; (3) introns in the 5'UTRs are, on average, longer than introns in the CDSs and 3'UTRs; and (4) 5'UTR introns have a different nucleotide composition to that of CDS and 3'UTR introns. Furthermore, we show that the 5'UTR intron of the A. thaliana EF1alpha-A3 gene affects the gene expression and the size of the 5'UTR intron influences the level of gene expression.
CONCLUSION: Introns within the 5'UTR show specific features that distinguish them from introns that reside within the coding sequence and the 3'UTR. In the EF1alpha-A3 gene, the presence of a long intron in the 5'UTR is sufficient to enhance gene expression in plants in a size dependent manner.
Affiliation(s)
- Betty YW Chung
- Biochemistry Department, University of Otago, Dunedin, New Zealand
- Bioscience Institute, University College Cork, Cork, Ireland
- Cas Simons
- HortResearch, Auckland, New Zealand
- Institute of Molecular Biosciences, Brisbane, Australia
- Andrew E Firth
- Biochemistry Department, University of Otago, Dunedin, New Zealand
- Chris M Brown
- Biochemistry Department, University of Otago, Dunedin, New Zealand
33
Roy SW, Hartl DL. Very little intron loss/gain in Plasmodium: intron loss/gain mutation rates and intron number. Genome Res 2006; 16:750-6. [PMID: 16702411 PMCID: PMC1473185 DOI: 10.1101/gr.4845406] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
We compared intron positions in conserved regions of 3479 orthologous gene pairs from Plasmodium falciparum and Plasmodium yoelii, which likely diverged ≥100 million years ago (Mya). Only 27 out of 2212 positions were specific to one of the two species. Intron presence in related species shows that at least 19 and possibly 26 of the changes are due to intron loss, depending on phylogeny. The implied intron loss and gain rates are much lower than previously estimated for nematodes, arthropods, fungi, and plants, and are comparable only with the rates in vertebrates. That all observed changes were exact, occurring without loss or gain of flanking coding sequence, suggests intron loss via an mRNA intermediate, as does a nonsignificant trend toward loss of introns at adjacent positions. Many of the intron changes occurred in genes encoding proteins involved in nucleic acid-related processes, as previously found for intron gains in nematodes. Two changes occurred in the chloroquine resistance transporter, suggesting a role for positive selection in intron loss in Plasmodium. The dearth of intron loss and gain could be explained by the lack of known transposable elements in Plasmodium, since transposable elements and/or reverse transcriptase are thought to be necessary for both processes. The observed pattern suggests that the availability of stochastic intron loss and gain mutations can be a major determinant of changes in intron number.
Affiliation(s)
- Scott William Roy
- Department of Organismic and Evolutionary Biology, Harvard, Cambridge, Massachusetts 02138, USA.
34
Abstract
The origins and importance of spliceosomal introns comprise one of the longest-abiding mysteries of molecular evolution. Considerable debate remains over several aspects of the evolution of spliceosomal introns, including the timing of intron origin and proliferation, the mechanisms by which introns are lost and gained, and the forces that have shaped intron evolution. Recent important progress has been made in each of these areas. Patterns of intron-position correspondence between widely diverged eukaryotic species have provided insights into the origins of the vast differences in intron number between eukaryotic species, and studies of specific cases of intron loss and gain have led to progress in understanding the underlying molecular mechanisms and the forces that control intron evolution.
Affiliation(s)
- Scott William Roy
- Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand.
35
Abstract
In this work, 21 completely sequenced eukaryotic genomes were analyzed using an intragene comparison approach. We found that all of these genomes show a significant 5'-biased distribution of introns of protein-coding genes. Our findings are different from previous studies based on the intergene method, where introns are biased towards the 5' end of genes only in intron-poor genomes, but are evenly distributed in intron-rich genomes. In addition, by analyzing the patterns of intron distribution of a set of well-compiled housekeeping genes from human and their respective orthologs identified by a bidirectional best BLAST hit method from the other genomes, we found that the trend of 5'-biased intron positions of the set of housekeeping genes for each genome is much more skewed than that of all genes of the same genome, and rarely if any of the housekeeping genes examined have an extremely 3'-biased position distribution in which all introns of a gene are located only at the 3' portion of the gene. The most parsimonious explanation for our findings may be the model in which intron loss is caused by homologous recombination between the genomic copy of a gene and a reverse transcriptase product of a spliced mRNA.
Affiliation(s)
- Kui Lin
- MOE Key Laboratory for Biodiversity Science and Ecological Engineering and College of Life Sciences, Beijing Normal University, Beijing 100875, China.
36
Szafranski K, Lehmann R, Parra G, Guigo R, Glöckner G. Gene organization features in A/T-rich organisms. J Mol Evol 2005; 60:90-8. [PMID: 15696371 DOI: 10.1007/s00239-004-0201-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 12/19/2003] [Accepted: 08/18/2004] [Indexed: 10/25/2022]
Abstract
Several species have genomes in which the four nucleotides are not equally represented (Glöckner 2000). Interestingly, shifts to very high A/T or G/C levels can occur in several distinct branches of the tree of life. The underlying reasons for these shifts therefore may be of different origin. Now entire chromosome sequences from two different A/T-rich genomes, Dictyostelium discoideum and Plasmodium falciparum, are available (Bowman et al. 1999; Gardner et al. 2002; Glöckner et al. 2002). This gives us the opportunity to investigate how a high A/T content may influence the signals that are the landmarks for gene specification. We found that, in contrast with most known metazoan and plant genomes, splice signals contain little information other than the canonical GT-AG dinucleotides. Intron lengths in A/T-rich organisms, on the other hand, are comparable to those of other lower eukaryotes. Intergenic regions show, dependent on the orientation of adjacent genes, a size pattern with a ratio of 1 (3'-3') to 2 (3'-5') to 3 (5'-5'). Overall, gene organization patterns seem not to be influenced by the A/T bias. Surprisingly, the slightly higher A/T content of the P. falciparum genome compared to that of D. discoideum (80.1 versus 77.4%) is not achieved by increased A/T richness in intergenic regions. Instead both the shift of the nucleotide usage in coding regions to A/T-rich codons and the longer intergenic regions make an equal contribution to the higher A/T content in this organism.
Affiliation(s)
- Karol Szafranski
- Department of Genome Analysis, Institute for Molecular Biotechnology Jena, Beutenbergstr. 11, D-07745 Jena, Germany
37
Roy SW, Gilbert W. Rates of intron loss and gain: implications for early eukaryotic evolution. Proc Natl Acad Sci U S A 2005; 102:5773-8. [PMID: 15827119 PMCID: PMC556292 DOI: 10.1073/pnas.0500383102] [Citation(s) in RCA: 150] [Impact Index Per Article: 7.9] [Indexed: 11/18/2022] Open
Abstract
We study the intron-exon structures of 684 groups of orthologs from seven diverse eukaryotic genomes and provide maximum likelihood estimates for rates and numbers of intron losses and gains in these same genes for a variety of lineages. Rates of intron loss vary from approximately 2 × 10⁻⁹ to 2 × 10⁻¹⁰ per year. Rates of gain vary from 6 × 10⁻¹³ to 4 × 10⁻¹² per possible intron insertion site per year. There is an inverse correspondence between rates of intron loss and gain, leading to a 20-fold variation among lineages in the ratio of the rates of the two processes. The observed rates of intron gain are insufficient to explain the large number of introns estimated to have been present in the plant-animal ancestor, suggesting that introns present in early eukaryotes may have been created by a fundamentally different process than more recently gained introns.
Affiliation(s)
- Scott William Roy
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
38
Niu DK, Hou WR, Li SW. mRNA-mediated intron losses: evidence from extraordinarily large exons. Mol Biol Evol 2005; 22:1475-81. [PMID: 15788745 DOI: 10.1093/molbev/msi138] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Indexed: 01/22/2023] Open
Abstract
Multicellular eukaryotes that have high intron density have their introns almost evenly distributed within genes, but unicellular eukaryotes that are generally intron poor have their introns asymmetrically distributed toward the 5' ends of genes. This was explained by homologous recombination of genomic DNA with the cDNA reverse transcribed from the 3' polyadenylated tail of spliced mRNA. This paper is to study whether mRNA-mediated intron losses have ever occurred in multicellular eukaryotes. If intron losses were mRNA-mediated, adjacent introns should be commonly lost together. A direct result is fusion of several previously adjacent exons and producing a large exon. We found that extraordinarily large exons (ELEs) are common not only in unicellular eukaryotes but also in multicellular eukaryotes. The percentage of genes having ELEs is negatively correlated with intron abundance. In addition, the number of lost introns estimated from the relative lengths of ELEs is negatively correlated with the number of extant introns. These results support mRNA-mediated intron losses in all eukaryotes. Moreover, we found that the ELEs of intron-common eukaryotes (with more than 0.5 intron per gene on average) are not only located at 3' ends but also at 5' ends and the middle of genes. This is contrary to what would be expected if the involved cDNAs were reverse transcribed from the 3' polyadenosine ends. A remarkable difference in intron distribution was revealed between intron-rare eukaryotes and intron-common eukaryotes. The intron-rare eukaryotes show very strong 5'-biased intron distribution, whereas the intron-common eukaryotes display even intron distribution or only weak 5'-biased distribution. We suspected that intron losses from 3' end of genes may be limited in intron-rare eukaryotes. The intron losses from intron-common eukaryotes should have other priming mechanism, like self-primed reverse transcription.
Affiliation(s)
- Deng-Ke Niu
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China.
39
Cavalcanti ARO, Dunn DM, Weiss R, Herrick G, Landweber LF, Doak TG. Sequence features of Oxytricha trifallax (class Spirotrichea) macronuclear telomeric and subtelomeric sequences. Protist 2005; 155:311-22. [PMID: 15552058 DOI: 10.1078/1434461041844196] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022]
Abstract
We sequenced and analyzed the subtelomeric regions of 1356 macronuclear "nanochromosomes" of the spirotrichous ciliate Oxytricha trifallax. We show that the telomeres in this species have a length of 20 nt, with minor deviations; there is no correlation between telomere lengths at the two ends of the molecule. A search for open reading frames revealed that the 3' and 5' untranslated regions are short, with a median length of approximately 130 nt, and that surprisingly there are no detectable differences between sequences upstream and downstream of genes. Our results confirm a previously reported purine bias in the first approximately 80 nucleotides of the subtelomeric regions, but with this larger data set we curiously detected a 10 bp periodicity in the bias; we relate this finding to the possible regulatory and structural functions these regions must serve. Palindromic sequences in opposing subtelomeric regions, although present in most sequences, are not statistically significant.
Affiliation(s)
- Andre R O Cavalcanti
- Department of Ecology and Evolutionary Biology, Princeton University, NJ 08544, USA
40
Vanácová S, Yan W, Carlton JM, Johnson PJ. Spliceosomal introns in the deep-branching eukaryote Trichomonas vaginalis. Proc Natl Acad Sci U S A 2005; 102:4430-5. [PMID: 15764705 PMCID: PMC554003 DOI: 10.1073/pnas.0407500102] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.0] [Indexed: 11/18/2022] Open
Abstract
Eukaryotes have evolved elaborate splicing mechanisms to remove introns that would otherwise destroy the protein-coding capacity of genes. Nuclear pre-mRNA splicing requires sequence motifs in the intron and is mediated by a ribonucleoprotein complex, the spliceosome. Here we demonstrate the presence of a splicing apparatus in the protist Trichomonas vaginalis and show that RNA motifs found in yeast and metazoan introns are required for splicing. We also describe the first introns in this deep-branching lineage. The positions of these introns are often conserved in orthologous genes, indicating they were present in a common ancestor of trichomonads, yeast, and metazoa. All examined T. vaginalis introns have a highly conserved 12-nt 3' splice-site motif that encompasses the branch point and is necessary for splicing. This motif is also found in the only described intron in a gene from another deep-branching eukaryote, Giardia intestinalis. These studies demonstrate the conservation of intron splicing signals across large evolutionary distances, reveal unexpected motif conservation in deep-branching lineages that suggests a simplified mechanism of splicing in primitive unicellular eukaryotes, and support the presence of introns in the earliest eukaryote.
Affiliation(s)
- Stepánka Vanácová
- Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, CA 90095, USA
41
White TW, Wang H, Mui R, Litteral J, Brink PR. Cloning and functional expression of invertebrate connexins from Halocynthia pyriformis. FEBS Lett 2005; 577:42-8. [PMID: 15527759 DOI: 10.1016/j.febslet.2004.09.071] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 09/01/2004] [Accepted: 09/22/2004] [Indexed: 11/13/2022]
Abstract
Unlike many other ion channels, unrelated gene families encode gap junctions in different animal phyla. Connexin and pannexin genes are found in deuterostomes, while protostomal species use innexin genes. Connexins are often described as vertebrate genes, despite the existence of invertebrate deuterostomes. We have cloned connexin sequences from an invertebrate chordate, Halocynthia pyriformis. Invertebrate connexins shared 25-40% sequence identity with human connexins, had extracellular domains containing six invariant cysteine residues, coding regions that were interrupted by introns, and formed functional channels in vitro. These data show that gap junction channels based on connexins are present in animals that predate vertebrate evolution.
Affiliation(s)
- Thomas W White
- Department of Physiology and Biophysics, State University of New York, BST 5-147, Stony Brook, NY 11794, USA.
42
Abstract
We studied intron loss in 684 groups of orthologous genes from seven fully sequenced eukaryotic genomes. We found that introns closer to the 3' ends of genes are preferentially lost, as predicted if introns are lost through gene conversion with a reverse transcriptase product of a spliced mRNA. Adjacent introns tend to be lost in concert, as expected if such events span multiple intron positions. Directly contrary to the expectations of some, introns that do not interrupt codons (phase zero) are more, not less, likely to be lost, an intriguing and previously unappreciated result. Adjacent introns with matching phases are not more likely to be retained, as would be expected if they enjoyed a relative selective advantage. The findings of 3' and phase zero intron loss biases are in direct contradiction to an extremely recent study of fungi intron evolution. All patterns are less pronounced in the lineage leading to Caenorhabditis elegans, suggesting that the process of intron loss may be qualitatively different in nematodes. Our results support a reverse transcriptase-mediated model of intron loss.
Affiliation(s)
- Scott W Roy
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
43
Sverdlov AV, Babenko VN, Rogozin IB, Koonin EV. Preferential loss and gain of introns in 3' portions of genes suggests a reverse-transcription mechanism of intron insertion. Gene 2004; 338:85-91. [PMID: 15302409 DOI: 10.1016/j.gene.2004.05.027] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.1] [Received: 02/12/2004] [Revised: 04/19/2004] [Accepted: 05/17/2004] [Indexed: 11/25/2022]
Abstract
In an attempt to gain insight into the dynamics of intron evolution in eukaryotic protein-coding genes, the distributions of old introns, that are conserved between distant phylogenetic lineages, and new, lineage-specific introns along the gene length, were examined. A significant excess of old introns in 5'-regions of genes was detected. New introns, when analyzed in bulk, showed a nearly flat distribution from the 5'- to the 3'-end. However, analysis of new intron distributions in individual genomes revealed notable lineage-specific features. While in intron-poor genomes, particularly yeast Schizosaccharomyces pombe (Sp), the 5'-portions of genes contain a significantly greater number of new introns than the 3'-portions, the intron-rich genomes of humans and Arabidopsis show the opposite trend. These observations seem to be compatible with the view that introns are both lost and inserted in 3'-terminal portions of genes more often than in 5'-portions. Overrepresentation of 3'-terminal sequences among cDNAs that mediate intron loss appears to be the most likely explanation for the apparent preferential loss of introns in the distal parts of genes. Preferential insertion of introns in the 3'-portions suggests that introns might be inserted via a reverse-transcription-mediated pathway similar to that implicated in intron loss. This mechanism could involve duplication of a portion of the coding region during reverse transcription followed by homologous recombination and subsequent rapid sequence divergence in the copy that becomes a new intron.
Affiliation(s)
- Alexander V Sverdlov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Building 38A, Bethesda, MD 20894, USA
44
Kitamura-Abe S, Itoh H, Washio T, Tsutsumi A, Tomita M. Characterization of the splice sites in GT-AG and GC-AG introns in higher eukaryotes using full-length cDNAs. J Bioinform Comput Biol 2004; 2:309-31. [PMID: 15297984 DOI: 10.1142/s0219720004000570] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 09/13/2003] [Revised: 11/17/2003] [Accepted: 11/17/2003] [Indexed: 11/18/2022]
Abstract
For the purpose of analyzing the relation between the splice sites and the order of introns, we conducted the following analysis for the GT-AG and GC-AG splice site groups. First, the pre-mRNAs of H. sapiens, M. musculus, D. melanogaster, A. thaliana and O. sativa were sampled by mapping the full-length cDNA to the genomes. Next, the consensus sequences at different regions of pre-mRNAs were analyzed in the five species. We also investigated the mononucleotide and dinucleotide frequencies in the extensive regions around the 5' splice sites (5'ss) and 3' splice sites (3'ss). As a result, differential frequencies of nucleotides at the first 5'ss in both the GT-AG and GC-AG splice site groups were observed in A. thaliana and O. sativa pre-mRNAs. The trend, which indicates that GC 5'ss possess strong consensus sequences, was observed not only in mammalian pre-mRNAs but also in the pre-mRNAs of D. melanogaster, A. thaliana and O. sativa. Furthermore, we examined the consensus sequences of the constitutive and alternative splice sites. It was suggested that in the case of the alternative GC-AG introns, the tendency to have a weak consensus sequence at 5'ss is different between H. sapiens and M. musculus pre-mRNAs.
Affiliation(s)
- Sumie Kitamura-Abe
- Laboratory for Bioinformatics, Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa 252-8520, Japan.
45
Chamary JV, Hurst LD. Similar rates but different modes of sequence evolution in introns and at exonic silent sites in rodents: evidence for selectively driven codon usage. Mol Biol Evol 2004; 21:1014-23. [PMID: 15014158 DOI: 10.1093/molbev/msh087] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.8] [Indexed: 11/13/2022] Open
Abstract
In mammals divergence at fourfold degenerate sites in codons (K₄) and intronic sequence (Kᵢ) are both used to estimate the mutation rate, under the supposition that both evolve neutrally. Does it matter which of these we use? Using either class of sequence can be defended because (1) K₄ is the same as Kᵢ (at least in rodents) and (2) there is no selectively driven codon usage (hence no systematic selection on third sites). Here we re-examine these findings using 560 introns (for 136 genes) in the mouse-rat comparison, aligned by eye and using a new maximum likelihood protocol. We find that the rate of evolution at fourfold sites and at intronic sites is similar in magnitude, but only after eliminating putatively constrained sites from introns (first introns and sites flanking intron-exon junctions). Any approximate congruence between the two rates is not, however, owing to an underlying similarity in the mode of sequence evolution. Some dinucleotides are hypermutable and differently abundant in exons and introns (e.g., CpGs). More importantly, after controlling for relative abundance, all dinucleotides starting with A or T are more prevalent in mismatches in exons than in introns, whereas C-starting dinucleotides (except CG) are more common in introns. Although C content at intronic sites is lower than at flanking fourfold sites, G content is similar, demonstrating that there exists a strong strand-specific preference for C nucleotides that is unique to exons. Transcription-coupled mutational processes and biased gene conversion cannot explain this, as they should affect introns and flanking exons equally. Therefore, by elimination, we propose this to be strong evidence for selectively driven codon usage in mammals.
Affiliation(s)
- Jean-Vincent Chamary
- Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
46
Sakabe NJ, de Souza JES, Galante PAF, de Oliveira PSL, Passetti F, Brentani H, Osório EC, Zaiats AC, Leerkes MR, Kitajima JP, Brentani RR, Strausberg RL, Simpson AJG, de Souza SJ. ORESTES are enriched in rare exon usage variants affecting the encoded proteins. C R Biol 2003; 326:979-85. [PMID: 14744104 DOI: 10.1016/j.crvi.2003.09.027] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/24/2022]
Abstract
A significant fraction of the variability found in the human transcriptome is due to alternative splicing, including alternative exon usage (AEU), intron retention and use of cryptic splice sites. We present a comparison of a large-scale analysis of AEU in the human transcriptome through genome mapping of Open Reading Frame ESTs (ORESTES) and conventional ESTs. It is shown here that ORESTES probe low abundant messages more efficiently. In addition, most of the variants detected by ORESTES affect the structure of the corresponding proteins.
Affiliation(s)
- Noboru Jo Sakabe
- Ludwig Institute for Cancer Research, Sao Paulo Branch, Rua Prof Antonio Prudente 109, 4º andar, 01509-010, Sao Paulo, Brazil
47
Affiliation(s)
- Tobias Mourier
- Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark.