1
Patil G. Evolution of fibrinogen domain related proteins in Aedes aegypti: Their expression during Arbovirus infections. Gene Reports 2021. [DOI: 10.1016/j.genrep.2021.101030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
2
Moyer DC, Larue GE, Hershberger CE, Roy SW, Padgett RA. Comprehensive database and evolutionary dynamics of U12-type introns. Nucleic Acids Res 2020; 48:7066-7078. [PMID: 32484558 PMCID: PMC7367187 DOI: 10.1093/nar/gkaa464] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/07/2020] [Revised: 05/19/2020] [Accepted: 05/20/2020] [Indexed: 12/16/2022]
Abstract
During nuclear maturation of most eukaryotic pre-messenger RNAs and long non-coding RNAs, introns are removed through the process of RNA splicing. Different classes of introns are excised by the U2-type or the U12-type spliceosomes, large complexes of small nuclear ribonucleoprotein particles and associated proteins. We created intronIC, a program for assigning intron class to all introns in a given genome, and used it on 24 eukaryotic genomes to create the Intron Annotation and Orthology Database (IAOD). We then used the data in the IAOD to revisit several hypotheses concerning the evolution of the two classes of spliceosomal introns, finding support for the class conversion model explaining the low abundance of U12-type introns in modern genomes.
Affiliation(s)
- Devlin C Moyer
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Lerner College of Medicine, Cleveland Clinic and Department of Molecular Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Graham E Larue
- Department of Molecular and Cell Biology, University of California, Merced, Merced, CA 95343, USA
- Courtney E Hershberger
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Lerner College of Medicine, Cleveland Clinic and Department of Molecular Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Scott W Roy
- Department of Biology, San Francisco State University, San Francisco, CA 94132, USA
- Richard A Padgett
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic Lerner College of Medicine, Cleveland Clinic and Department of Molecular Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
3
Comprehensive genomic analyses with 115 plastomes from algae to seed plants: structure, gene contents, GC contents, and introns. Genes Genomics 2020; 42:553-570. [PMID: 32200544 DOI: 10.1007/s13258-020-00923-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/28/2020] [Accepted: 03/09/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND Chloroplasts are a common character in plants. The chloroplasts in each plant lineage have shaped their own genomes, plastomes, through structural changes and the transfer of many genes to nuclear genomes during plant evolution. Some plastid genes have introns, most of which are group II introns. OBJECTIVE This study aimed to gain genomic and evolutionary insights into the plastomes from green algae to flowering plants. METHODS Plastomes of 115 species from green algae, bryophytes, pteridophytes (spore-bearing vascular plants), gymnosperms, and angiosperms were mined from the NCBI organelle genome database. Plastome structure, gene contents, and GC contents were analyzed with in-house Python code. Intronic features, including presence/absence, length, and intron phase, were analyzed manually from the annotation information in NCBI. RESULTS The canonical quadripartite structure was retained in most plastomes, except for a few that had lost an inverted repeat (IR). Expansion, reduction, or deletion of IRs resulted in length variation among the plastomes. The number of protein-coding genes ranged from 40 to 92, with an average of 79.43 ± 5.84 per plastome, and gene losses were apparent in specific lineages. The number of trn genes ranged from 13 to 33, with an average of 21.19 ± 2.42 per plastome. Ribosomal RNA genes, rrn, were located in the IRs, so they were present in duplicate except in species that had lost one of the IRs. GC contents varied from 24.9 to 51.0%, with an average of 38.21 ± 3.27%, indicating a bias toward high AT content. Plastid introns were present in 18 protein-coding genes, six trn genes, and one rrn gene. Intron losses occurred among orthologous genes in different plant lineages. The plastid introns were long compared with nuclear introns, which might be related to the difference between spliceosomal nuclear introns and self-splicing group II plastid introns.
The trnK-UUU intron contained the maturase-encoding matK gene, except in the chlorophyte algae and monilophyte ferns, in which trnK-UUU was lost but matK was retained. There were many annotation artefacts in the intron positions in the NCBI database. In the analysis of intron phases, phase 0 introns were more frequent than phase 1 and phase 2 introns. Phase polymorphism was observed in the introns of clpP, which was derived from nucleotide insertion. Plastid trn introns were long compared to archaeal or eukaryotic nuclear tRNA introns. Of the six plastid trn introns, one was at the D loop and the other five were at the anticodon loop. The insertion sites were conserved among archaeal, eukaryotic nuclear, and plastid tRNA genes. CONCLUSIONS The current study revisited previous findings on structural variations, gene contents, and GC contents of the chloroplast genomes from green algae to flowering plants. The study also included some novel findings and discussions on the plastome introns, including their length variation and phase variation. We also presented and corrected some false annotations of introns in protein-coding and tRNA genes in the genome database, which might be confirmed by chloroplast transcriptome analysis in the future.
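The abstract reports GC contents computed with in-house code that is not shown. As a minimal sketch of the kind of calculation involved, and not the authors' implementation, the hypothetical helper `gc_content` below returns the GC percentage of a sequence, counting only unambiguous bases (an assumption about how N and gap characters should be handled).

```python
def gc_content(sequence: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage."""
    seq = sequence.upper()
    # Count only A, C, G, T so that ambiguous bases (e.g. N) or gaps
    # do not enter the denominator. This is an assumption of the sketch.
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total
```

Applied to a whole plastome sequence, a value such as the reported average of 38.21% would correspond to roughly 62% AT.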
4
Wang L, Stein LD. Modeling the evolution dynamics of exon-intron structure with a general random fragmentation process. BMC Evol Biol 2013; 13:57. [PMID: 23448166 PMCID: PMC3732091 DOI: 10.1186/1471-2148-13-57] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/20/2012] [Accepted: 02/22/2013] [Indexed: 12/02/2022]
Abstract
Background Most eukaryotic genes are interrupted by spliceosomal introns. The evolution of exon-intron structure remains mysterious despite rapid advances in genome sequencing techniques. In this work, a novel approach is taken based on the assumptions that the evolution of exon-intron structure is a stochastic process, and that the characteristics of this process can be understood by examining its historical outcome, the present-day size distribution of internal translated exons (exon). By combining simulation with modeling of the size distribution of exons in different species, we propose a general random fragmentation process (GRFP) to characterize the evolution dynamics of exon-intron structure. This model accurately predicts the probability that an exon will be split by a new intron and the distribution of novel insertions along the length of the exon. Results As the first observation from this model, we show that the chance for an exon to obtain an intron is proportional to its size to the 3rd power. We also show that such size dependence is nearly constant across genes, with the exception of the exons adjacent to the 5′ UTR. As the second conclusion from the model, we show that intron insertion loci follow a normal distribution with a mean of 0.5 (center of the exon) and a standard deviation of 0.11. Finally, we show that intron insertions within a gene are independent of each other for vertebrates, but are more negatively correlated for non-vertebrates. We use simulation to demonstrate that the negative correlation might result from significant intron loss during evolution, which could be explained by selection against multi-intron genes in these organisms. Conclusions The GRFP model suggests that intron gain is dynamic, with a higher chance for longer exons; introns are inserted into exons randomly, with the highest probability at the center of the exon.
GRFP estimates that there are 78 introns in every 10 kb of coding sequence for vertebrate genomes, agreeing with empirical observations. GRFP also estimates that there are significant intron losses in the evolution of non-vertebrate genomes, with extreme cases of around 57% intron loss in Drosophila melanogaster, 28% in Caenorhabditis elegans, and 24% in Oryza sativa.
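The two quantitative rules stated in this abstract (splitting chance proportional to exon size cubed, and relative insertion position drawn from a normal distribution with mean 0.5 and standard deviation 0.11) can be illustrated with a single simulation step. This is a rough sketch under those stated assumptions only, not the published GRFP implementation; the function name `insert_intron`, the rounding of the cut point, and the redraw rule for out-of-range positions are all choices made for the illustration.

```python
import random

def insert_intron(exon_lengths, rng):
    """Split one exon by a new intron; return the new list of exon lengths."""
    exons = list(exon_lengths)
    # Rule 1 (from the abstract): chance of gaining an intron scales with length cubed.
    weights = [length ** 3 for length in exons]
    idx = rng.choices(range(len(exons)), weights=weights, k=1)[0]
    length = exons[idx]
    if length < 2:
        return exons  # too short to split into two non-empty exons
    # Rule 2 (from the abstract): relative position ~ Normal(0.5, 0.11).
    # Redrawing until the cut falls strictly inside the exon is an assumption.
    while True:
        cut = round(rng.gauss(0.5, 0.11) * length)
        if 1 <= cut <= length - 1:
            break
    return exons[:idx] + [cut, length - cut] + exons[idx + 1:]
```

For example, `insert_intron([300, 120, 90], random.Random(0))` returns four exon lengths that still sum to 510, and the 300-nt exon is by far the most likely one to be split.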
Affiliation(s)
- Liya Wang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
5
Rogozin IB, Carmel L, Csuros M, Koonin EV. Origin and evolution of spliceosomal introns. Biol Direct 2012; 7:11. [PMID: 22507701 PMCID: PMC3488318 DOI: 10.1186/1745-6150-7-11] [Citation(s) in RCA: 224] [Impact Index Per Article: 18.7] [Received: 12/08/2011] [Accepted: 03/15/2012] [Indexed: 12/31/2022]
Abstract
Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information NLM/NIH, 8600 Rockville Pike, Bldg, 38A, Bethesda, MD 20894, USA
6
Da Lage JL, Maczkowiak F, Cariou ML. Phylogenetic distribution of intron positions in alpha-amylase genes of bilateria suggests numerous gains and losses. PLoS One 2011; 6:e19673. [PMID: 21611157 PMCID: PMC3096672 DOI: 10.1371/journal.pone.0019673] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 12/24/2010] [Accepted: 04/03/2011] [Indexed: 11/19/2022]
Abstract
Most eukaryotes have at least some genes interrupted by introns. While it is well accepted that introns were already present at moderate density in the last eukaryote common ancestor, the conspicuous diversity of intron density among genomes suggests a complex evolutionary history, with marked differences between phyla. The question of the rates of intron gains and loss in the course of evolution and factors influencing them remains controversial. We have investigated a single gene family, alpha-amylase, in 55 species covering a variety of animal phyla. Comparison of intron positions across phyla suggests a complex history, with a likely ancestral intronless gene undergoing frequent intron loss and gain, leading to extant intron/exon structures that are highly variable, even among species from the same phylum. Because introns are known to play no regulatory role in this gene and there is no alternative splicing, the structural differences may be interpreted more easily: intron positions, sizes, losses or gains may be more likely related to factors linked to splicing mechanisms and requirements, and to recognition of introns and exons, or to more extrinsic factors, such as life cycle and population size. We have shown that intron losses outnumbered gains in recent periods, but that "resets" of intron positions occurred at the origin of several phyla, including vertebrates. Rates of gain and loss appear to be positively correlated. No phase preference was found. We also found evidence for parallel gains and for intron sliding. Presence of introns at given positions was correlated to a strong protosplice consensus sequence AG/G, which was much weaker in the absence of intron. In contrast, recent intron insertions were not associated with a specific sequence. In animal Amy genes, population size and generation time seem to have played only minor roles in shaping gene structures.
Affiliation(s)
- Jean-Luc Da Lage
- Laboratoire Evolution, génomes et spéciation, UPR 9034 CNRS, Gif sur Yvette, France.
7
Koduri PKH, Gordon GS, Barker EI, Colpitts CC, Ashton NW, Suh DY. Genome-wide analysis of the chalcone synthase superfamily genes of Physcomitrella patens. Plant Molecular Biology 2010; 72:247-63. [PMID: 19876746 DOI: 10.1007/s11103-009-9565-z] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 07/05/2009] [Accepted: 10/19/2009] [Indexed: 05/08/2023]
Abstract
Enzymes of the chalcone synthase (CHS) superfamily catalyze the production of a variety of secondary metabolites in bacteria, fungi and plants. Some of these metabolites have played important roles during the early evolution of land plants by providing protection from various environmental assaults including UV irradiation. The genome of the moss, Physcomitrella patens, contains at least 17 putative CHS superfamily genes. Three of these genes (PpCHS2b, PpCHS3 and PpCHS5) exist in multiple copies and all have corresponding ESTs. PpCHS11 and probably also PpCHS9 encode non-CHS enzymes, while PpCHS10 appears to be an ortholog of plant genes encoding anther-specific CHS-like enzymes. It was inferred from the genomic locations of genes comprising it that the moss CHS superfamily expanded through tandem and segmental duplication events. Inferred exon-intron architectures and results from phylogenetic analysis of representative CHS superfamily genes of P. patens and other plants showed that intron gain and loss occurred several times during evolution of this gene superfamily. A high proportion of P. patens CHS genes (7 of 14 genes for which the full sequence is known and probably 3 additional genes) are intronless, prompting speculation that CHS gene duplication via retrotransposition has occurred at least twice in the moss lineage. Analyses of sequence similarities, catalytic motifs and EST data indicated that a surprisingly large number (as many as 13) of the moss CHS superfamily genes probably encode active CHS. EST distribution data and different light responsiveness observed with selected genes provide evidence for their differential regulation. Observed diversity within the moss CHS superfamily and amenability to gene manipulation make Physcomitrella a highly suitable model system for studying expansion and functional diversification of the plant CHS superfamily of genes.
Affiliation(s)
- P K Harshavardhan Koduri
- Department of Chemistry and Biochemistry, University of Regina, 3737 Wascana Parkway, Regina, SK, S4S 0A2, Canada
8
Regulapati R, Bhasi A, Singh CK, Senapathy P. Origination of the split structure of spliceosomal genes from random genetic sequences. PLoS One 2008; 3:e3456. [PMID: 18941625 PMCID: PMC2565106 DOI: 10.1371/journal.pone.0003456] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/21/2008] [Accepted: 08/01/2008] [Indexed: 11/18/2022]
Abstract
The mechanism by which protein-coding portions of eukaryotic genes came to be separated by long non-coding stretches of DNA, and the purpose for this perplexing arrangement, have remained unresolved fundamental biological problems for three decades. We report here a plausible solution to this problem based on analysis of open reading frame (ORF) length constraints in the genomes of nine diverse species. If primordial nucleic acid sequences were random in sequence, functional proteins that are innately long would not be encoded due to the frequent occurrence of stop codons. The best possible way that a long protein-coding sequence could have been derived was by evolving a split structure from the random DNA (or RNA) sequence. Results of the systematic analyses of nine complete genome sequences presented here suggest that perhaps the major underlying structural features of split genes have evolved due to the indigenous occurrence of split protein-coding genes in primordial random nucleotide sequence. The results also suggest that intron-rich genes containing short exons may have been the original form of genes intrinsically occurring in random DNA, and that intron-poor genes containing long exons were perhaps derived from the original intron-rich genes.
Affiliation(s)
- Rahul Regulapati
- Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India
- Ashwini Bhasi
- Department of Human Genetics, Genome International Corporation, Madison, Wisconsin, United States of America
- Chandan Kumar Singh
- Department of Bioinformatics, International Center for Advanced Genomics and Proteomics, Nehru Nagar, Chennai, India
- Periannan Senapathy
- Department of Human Genetics, Genome International Corporation, Madison, Wisconsin, United States of America
- Department of Bioinformatics, International Center for Advanced Genomics and Proteomics, Nehru Nagar, Chennai, India
9
Abstract
Although introns were first discovered almost 30 years ago, their evolutionary origin remains elusive. In this work, we used multispecies whole-genome alignments to map Drosophila melanogaster introns onto 10 other fully sequenced Drosophila genomes. We were able to find 1,944 sites where an intron was missing in one or more species. We show that for most (>80%) of these cases, there is no leftover intronic sequence or any missing exonic sequence, indicating exact intron loss or gain events. We used parsimony to classify these differences as 1,754 intron loss events and 213 gain events. We show that lost and gained introns are significantly shorter than average and flanked by longer than average exons. They also display quite distinct phase distributions and show greater than average similarity between the 5' splice site and its 3' partner splice site. Introns that have been lost in one or more species evolve faster than other introns, occur in slowly evolving genes, and are found adjacent to each other more often than would be expected for independent single losses. Our results support the cDNA recombination mechanism of intron loss, suggest that selective pressures affect site-specific loss rates, and show conclusively that intron gain has occurred within the Drosophila lineage, solidifying the "introns-middle" hypothesis and providing some hints about the gain mechanism.
10
Artamonova II, Gelfand MS. Comparative Genomics and Evolution of Alternative Splicing: The Pessimists' Science. Chem Rev 2007; 107:3407-30. [PMID: 17645315 DOI: 10.1021/cr068304c] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Affiliation(s)
- Irena I Artamonova
- Group of Bioinformatics, Vavilov Institute of General Genetics, RAS, Gubkina 3, Moscow 119991, Russia
11
Maczkowiak F, Da Lage JL. Origin and evolution of the Amyrel gene in the alpha-amylase multigene family of Diptera. Genetica 2007; 128:145-58. [PMID: 17028947 DOI: 10.1007/s10709-005-5578-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 05/09/2005] [Accepted: 11/30/2005] [Indexed: 10/24/2022]
Abstract
Alpha-amylase genes often form multigene families in living organisms. In Diptera, a remote paralog, Amyrel, had been discovered in Drosophila, where this gene is currently used as a population and phylogenetic marker. The putative encoded protein shows about 40% divergence from the classical amylases. We have searched for the presence of the paralog in other families of Diptera to track its origin and understand its evolution. Amyrel was detected in a number of families of Muscomorpha (Brachycera-Cyclorrhapha), suggesting an origin much older than previously thought. It has not been found elsewhere to date, and it is absent from the Anopheles gambiae genome. The intron-exon structures of the genes found so far suggest that the ancestral gene (before the duplication which gave rise to Amyrel) had two introns, and that subsequent, repeated and independent loss of one or both introns occurred in some Muscomorpha families. It seems that the Amyrel protein has experienced specific amino acid substitutions in regions generally well conserved in amylases, raising the possibility of peculiar, functional adaptations of this protein.
Affiliation(s)
- Frédérique Maczkowiak
- Populations, génétique et évolution, UPR 9034, CNRS, Gif sur Yvette, 91198, Cedex, France
12
Abstract
Research into the origins of introns is at a critical juncture in the resolution of theories on the evolution of early life (which came first, RNA or DNA?), the identity of LUCA (the last universal common ancestor, was it prokaryotic- or eukaryotic-like?), and the significance of noncoding nucleotide variation. One early notion was that introns would have evolved as a component of an efficient mechanism for the origin of genes. But alternative theories emerged as well. From the debate between the "introns-early" and "introns-late" theories came the proposal that introns arose before the origin of genetically encoded proteins and DNA, and the more recent "introns-first" theory, which postulates the presence of introns at that early evolutionary stage from a reconstruction of the "RNA world." Here we review seminal and recent ideas about intron origins. Recent discoveries about the patterns and causes of intron evolution make this one of the most hotly debated and exciting topics in molecular evolutionary biology today.
Affiliation(s)
- Francisco Rodríguez-Trelles
- Department of Ecology and Evolutionary Biology, University of California, Irvine, California 92697-2525, USA.
13
Fridmanis D, Fredriksson R, Kapa I, Schiöth HB, Klovins J. Formation of new genes explains lower intron density in mammalian Rhodopsin G protein-coupled receptors. Mol Phylogenet Evol 2006; 43:864-80. [PMID: 17188520 DOI: 10.1016/j.ympev.2006.11.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 05/19/2006] [Revised: 10/06/2006] [Accepted: 11/02/2006] [Indexed: 10/23/2022]
Abstract
Mammalian G protein-coupled receptor (GPCR) genes are characterised by a large proportion of intronless genes or a lower density of introns when compared with GPCRs of invertebrates. It is unclear which mechanisms have influenced intron density in this protein family, which is one of the largest in the mammalian genomes. We used a combination of Hidden Markov Models (HMM) and BLAST searches to establish the comprehensive repertoire of Rhodopsin GPCRs from seven species and performed overall alignments and phylogenetic analysis using the maximum parsimony method for over 1400 receptors in 12 subgroups. We identified 14 different Ancestral Receptor Groups (ARGs) that have members in both vertebrate and invertebrate species. We found that there exists a remarkable difference in the intron density among ancestral and new Rhodopsin GPCRs. The intron density among ARGs members was more than 3.5-fold higher than that within non-ARG members and more than 2-fold higher when considering only the 7TM region. This suggests that the new GPCR genes have been predominantly formed intronless while the ancestral receptors likely accumulated introns during their evolution. Many of the intron positions found in mammalian ARG receptor sequences were found to be present in orthologue invertebrate receptors suggesting that these intron positions are ancient. This analysis also revealed that one intron position is much more frequent than any other position and it is common for a number of phylogenetically different Rhodopsin GPCR groups. This intron position lies within a functionally important, conserved, DRY motif which may form a proto-splice site that could contribute to positional intron insertion. Moreover, we have found that other receptor motifs, similar to DRY, also contain introns between the second and third nucleotide of the arginine codon which also forms a proto-splice site. 
Our analysis presents compelling evidence that there was not a major loss of introns in mammalian GPCRs and formation of new GPCRs among mammals explains why these have fewer introns compared to invertebrate GPCRs. We also discuss and speculate about the possible role of different RNA- and DNA-based mechanisms of intron insertion and loss.
Affiliation(s)
- Davids Fridmanis
- Biomedical Research and Study Centre, University of Latvia, Ratsupites 1, Riga, Latvia
14
Kim H, Sung S, Klein R. Expansion of symmetric exon-bordering domains does not explain evolution of lineage specific genes in mammals. Genetica 2006; 131:59-68. [PMID: 17082903 DOI: 10.1007/s10709-006-9113-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/20/2006] [Accepted: 09/26/2006] [Indexed: 10/24/2022]
Abstract
In order to examine the evolution of lineage specific genes, we analyzed intron phase distributions and exon-bordering domains in primate and rodent specific genes. We found that the expansion of symmetric exon-bordering domains could not explain the evolution of lineage specific genes. Rather internal intron loss of a domain can partially explain the excess of class 1-1 intron phases in the lineage specific genes. We suggest the event that led to excess of symmetric exons in lineage specific genes had little bearing on shaping the phenotypes specific to the individual lineage. Instead, Kruppel-associated box (KRAB) proteins associated with zinc finger C2H2 (zf-C2H2) type are likely to be responsible for the lineage specific function.
Affiliation(s)
- Heebal Kim
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, San 56-1, Sillim-dong, Gwanak-gu, Seoul 151-742, Korea.
15
Ruvinsky A, Ward W. A gradient in the distribution of introns in eukaryotic genes. J Mol Evol 2006; 63:136-41. [PMID: 16736103 DOI: 10.1007/s00239-005-0261-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/03/2005] [Accepted: 02/13/2006] [Indexed: 10/24/2022]
Abstract
The majority of eukaryotic genes consist of exons and introns. Introns can be inserted either between codons (phase 0) or within codons, after the first nucleotide (phase 1) and after the second (phase 2). We report here that the frequency of phase 0 increases and phase 1 declines from the 5' region to the 3' end of genes. This trend is particularly noticeable in genomes of Homo sapiens and Arabidopsis thaliana, in which gains of novel introns in the 3' portion of genes were probably a dominant process. Similar but more moderate gradients exist in Drosophila melanogaster and Caenorhabditis elegans genomes, where the accumulation of novel introns was not a prevailing factor. There are nine types of exons, three symmetric (0,0; 1,1; 2,2) and six asymmetric (0,1; 1,0; 1,2; 2,1; 2,0; 0,2). Assuming random distribution of different types of introns along genes, one can expect the frequencies of asymmetric exons such as 0,1 and 1,0 or 1,2 and 2,1 to be approximately equal, allowing for some variation caused by randomness. The gradient in intron distribution leads to a small but consistent and statistically significant bias: phase 1 introns are more likely at the 5' ends and phase 0 introns are more likely at the 3' ends of asymmetric exons. For the same reason, the frequency of 0,0 exons increases and the frequency of 1,1 exons decreases in the 3' direction, at least in H. sapiens and A. thaliana. The number of introns per gene also affects the distribution and frequency of phase 0 and 1 introns. The gradient provides an insight into the evolution of intron-exon structures of eukaryotic genes.
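The phase definitions and the nine exon types described in this abstract can be made concrete with a short sketch. It assumes a hypothetical input: a list of coding-exon lengths for one gene, counted from the first base of the start codon, so that the phase of each intron is the number of coding bases upstream of it modulo 3. The function names are illustrative and not taken from the paper.

```python
from itertools import accumulate, product

# All nine (5' intron phase, 3' intron phase) combinations for an internal exon.
EXON_TYPES = list(product(range(3), repeat=2))
SYMMETRIC = [t for t in EXON_TYPES if t[0] == t[1]]    # (0,0), (1,1), (2,2)
ASYMMETRIC = [t for t in EXON_TYPES if t[0] != t[1]]   # the remaining six

def intron_phases(coding_exon_lengths):
    """Phase of each intron: 0 between codons, 1 after the first base, 2 after the second."""
    cumulative = list(accumulate(coding_exon_lengths))
    # An intron follows every exon except the last one.
    return [n % 3 for n in cumulative[:-1]]

def internal_exon_types(coding_exon_lengths):
    """Pair the flanking intron phases for every internal exon."""
    phases = intron_phases(coding_exon_lengths)
    return list(zip(phases[:-1], phases[1:]))
```

For coding-exon lengths `[100, 50, 31, 80]`, the introns fall after 100, 150, and 181 coding bases, giving phases 1, 0, and 1, so the two internal exons are of the asymmetric types (1,0) and (0,1).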
Affiliation(s)
- A Ruvinsky
- The Institute for Genetics and Bioinformatics, University of New England, Armidale, 2351, NSW, Australia.
16
Abstract
The origins and importance of spliceosomal introns comprise one of the longest-abiding mysteries of molecular evolution. Considerable debate remains over several aspects of the evolution of spliceosomal introns, including the timing of intron origin and proliferation, the mechanisms by which introns are lost and gained, and the forces that have shaped intron evolution. Recent important progress has been made in each of these areas. Patterns of intron-position correspondence between widely diverged eukaryotic species have provided insights into the origins of the vast differences in intron number between eukaryotic species, and studies of specific cases of intron loss and gain have led to progress in understanding the underlying molecular mechanisms and the forces that control intron evolution.
Affiliation(s)
- Scott William Roy
- Allan Wilson Centre for Molecular Ecology and Evolution, Massey University, Palmerston North, New Zealand.
|
17
|
Rommens CM, Bougri O, Yan H, Humara JM, Owen J, Swords K, Ye J. Plant-derived transfer DNAs. Plant Physiol 2005; 139:1338-49. [PMID: 16244143 PMCID: PMC1283770 DOI: 10.1104/pp.105.068692] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 05/05/2023]
Abstract
The transfer of DNA from Agrobacterium to plant cell nuclei is initiated by a cleavage reaction within the 25-bp right border of Ti plasmids. In an effort to develop all-native DNA transformation vectors, 50 putative right border alternatives were identified in both plant expressed sequence tags and genomic DNA. Efficacy tests in a tobacco (Nicotiana tabacum) model system demonstrated that 14 of these elements displayed at least 50% of the activity of conventional Agrobacterium transfer DNA borders. Four of the most effective plant-derived right border alternatives were found to be associated with intron-exon junctions. Additional elements were embedded within introns, exons, untranslated trailers, and intergenic DNA. Based on the identification of a single right border alternative in Arabidopsis and three in rice (Oryza sativa), the occurrence of this motif was estimated at a frequency of at least 0.8 × 10⁻⁸. Modification of plasmid DNA sequences flanking the alternative borders demonstrated that both upstream and downstream sequences play an important role in initiating DNA transfer. Optimal DNA transfer required the elements to be preceded by pyrimidine residues interspaced by AC-rich trinucleotides. Alteration of this organization lowered transformation frequencies by 46% to 93%. Despite their weaker resemblance with left borders, right border alternatives also functioned effectively in terminating DNA transfer, if both associated with an upstream A[C/T]T[C/G]A[A/T]T[G/T][C/T][G/T][C/G]A[C/T][C/T][A/T] domain and tightly linked cytosine clusters at their junctions with downstream DNA. New insights in border region requirements were used to construct an all-native alfalfa (Medicago sativa) transfer DNA vector that can be used for the production of intragenic plants.
Affiliation(s)
- Caius M Rommens
- J.R. Simplot Company, Simplot Plant Sciences, Boise, ID 83706, USA.
|
18
|
Ruvinsky A, Eskesen ST, Eskesen FN, Hurst LD. Can codon usage bias explain intron phase distributions and exon symmetry? J Mol Evol 2005; 60:99-104. [PMID: 15696372 DOI: 10.1007/s00239-004-0032-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 01/28/2004] [Accepted: 08/31/2004] [Indexed: 10/25/2022]
Abstract
More introns exist between codons (phase 0) than between the first and the second bases (phase 1) or between the second and the third base (phase 2) within the codon. Many explanations have been suggested for this excess of phase 0. It has, for example, been argued to reflect an ancient utility for introns in separating exons that code for separate protein modules. There may, however, be a simple, alternative explanation. Introns typically require, for correct splicing, particular nucleotides immediately 5' in exons (typically a G) and immediately 3' in the following exon (also often a G). Introns therefore tend to be found between particular nucleotide pairs (e.g., G|G pairs) in the coding sequence. If, owing to bias in usage of different codons, these pairs are especially common at phase 0, then intron phase biases may have a trivial explanation. Here we take codon usage frequencies for a variety of eukaryotes and use these to generate random sequences. We then ask about the phase of putative intron insertion sites. Importantly, in all simulated data sets intron phase distribution is biased in favor of phase 0. In many cases the bias is of the magnitude observed in real data and can be attributed to codon usage bias. It is also known that exons may carry either the same phase (symmetric) or different phases (asymmetric) at the opposite ends. We simulated a distribution of different types of exons using frequencies of introns observed in real genes assuming random combination of intron phases at the opposite sides of exons. Surprisingly the simulated pattern was quite similar to that observed. In the simulants we typically observe a prevalence of symmetric exons carrying phase 0 at both ends, which is common for eukaryotic genes. However, at least in some species, the extent of the bias in favor of symmetric (0,0) exons is not as great in simulants as in real genes. 
These results emphasize the need to construct a biologically relevant null model of successful intron insertion.
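The simulation idea described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it treats every G|G dinucleotide as a putative insertion site (the abstract gives G|G as one example), and the function name, parameters, and toy codon tables are hypothetical rather than the authors' actual procedure.

```python
import random
from collections import Counter


def simulate_phase_counts(codon_freqs, n_codons=10000, site=("G", "G"), seed=0):
    """Count putative intron sites per phase in a random coding sequence.

    `codon_freqs` maps codons to relative usage frequencies (assumed input).
    A site lies between two adjacent nucleotides matching `site`.
    """
    rng = random.Random(seed)
    codons = list(codon_freqs)
    weights = [codon_freqs[c] for c in codons]
    # Draw codons independently according to their usage frequencies.
    seq = "".join(rng.choices(codons, weights=weights, k=n_codons))
    counts = Counter()
    for i in range(1, len(seq)):
        if seq[i - 1] == site[0] and seq[i] == site[1]:
            # i nucleotides precede the site, so i % 3 is its phase.
            counts[i % 3] += 1
    return counts


# Toy usage table: G|G can only arise at the AAG|GAA codon boundary (phase 0).
toy_counts = simulate_phase_counts({"AAG": 1.0, "GAA": 1.0}, n_codons=200)
```

With a real codon usage table in place of the toy one, the relative sizes of the three counts give a null expectation against which observed phase frequencies could be compared.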
Affiliation(s)
- A Ruvinsky
- Institute for Genetics and Bioinformatics, University of New England, Armidale 2351, NSW, Australia.
|
19
|
Abstract
We studied intron loss in 684 groups of orthologous genes from seven fully sequenced eukaryotic genomes. We found that introns closer to the 3' ends of genes are preferentially lost, as predicted if introns are lost through gene conversion with a reverse transcriptase product of a spliced mRNA. Adjacent introns tend to be lost in concert, as expected if such events span multiple intron positions. Directly contrary to the expectations of some, introns that do not interrupt codons (phase zero) are more, not less, likely to be lost, an intriguing and previously unappreciated result. Adjacent introns with matching phases are not more likely to be retained, as would be expected if they enjoyed a relative selective advantage. The findings of 3' and phase zero intron loss biases are in direct contradiction to an extremely recent study of fungi intron evolution. All patterns are less pronounced in the lineage leading to Caenorhabditis elegans, suggesting that the process of intron loss may be qualitatively different in nematodes. Our results support a reverse transcriptase-mediated model of intron loss.
Affiliation(s)
- Scott W Roy
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
|
20
|
Sok AJ, Czajewska K, Ozyhar A, Kochman M. The structure of the juvenile hormone binding protein gene from Galleria mellonella. Biol Chem 2005; 386:1-10. [PMID: 15843141 DOI: 10.1515/bc.2005.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
Abstract
Juvenile hormone (JH) and ecdysone are the key hormones controlling insect growth and development. The juvenile hormone binding protein (JHBP) is the first member in the array of proteins participating in JH signal transmission. In the present report a whole jhbp gene sequence (9790 bp) is described. The jhbp gene contains four introns (A–D). All the introns have common flanking sequences: GT at the 5′ and AG at the 3′ end. The first intron is in phase 1, the second in phase 2, and the third and fourth in phase 1. An analysis of these sequences suggests that U2-class spliceosomes are involved in intron excision from pre-mRNA. Several horizontally transmitted elements from other genes were found in the introns. All jhbp exons are positioned in local AT-rich regions of the gene. A search for core promoter regulatory elements revealed that the TATA box starts 29 bp preceding the start of transcription; the sequence TCAGTA representing a putative initiator sequence (Inr) starts at position +14. Eight characteristic sequences for binding Broad-Complex gene products, which coordinate the ecdysone temporal response, are present in the non-coding sequence of the jhbp gene. An analysis of exon locations and intron phases indicates that jhbp gene organization is related to the retinol binding protein gene, a member of the lipocalin family.
Affiliation(s)
- Agnieszka J Sok
- Division of Biochemistry, Institute of Organic Chemistry, Biochemistry and Biotechnology, Wrocław University of Technology, Wybrzeze Wyspiańskiego 27, 50-370 Wrocław, Poland
|
21
|
Sverdlov AV, Rogozin IB, Babenko VN, Koonin EV. Reconstruction of ancestral protosplice sites. Curr Biol 2004; 14:1505-8. [PMID: 15324669 DOI: 10.1016/j.cub.2004.08.027] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Received: 05/17/2004] [Revised: 06/24/2004] [Accepted: 06/24/2004] [Indexed: 11/20/2022]
Abstract
Most of the eukaryotic protein-coding genes are interrupted by multiple introns. A substantial fraction of introns occupy the same position in orthologous genes from distant eukaryotes, such as plants and animals, and consequently are inferred to have been inherited from the common ancestor of these organisms. In contrast to these conserved introns, many other introns appear to have been gained during evolution of each major eukaryotic lineage. The mechanism(s) of insertion of new introns into genes remains unknown. Because the nucleotides that flank splice junctions are nonrandom, it has been proposed that introns are preferentially inserted into specific target sequences termed protosplice sites. However, it remains unclear whether the consensus nucleotides flanking the splice junctions are remnants of the original protosplice sites or if they evolved convergently after intron insertion. Here, we directly address the existence of protosplice sites by examining the context of introns inserted within codons that encode amino acids conserved in all eukaryotes and accordingly are not subject to selection for splicing efficiency. We show that introns are either predominantly inserted into specific protosplice sites, which have the consensus sequence (A/C)AG/Gt, or that they are inserted randomly but are preferentially fixed at such sites.
Affiliation(s)
- Alexander V Sverdlov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
22
|
Abstract
The evolutionary origin of spliceosomal introns remains elusive. The startling success of a new way of predicting intron sites suggests that the splicing machinery determines where introns are added to genes.
Affiliation(s)
- Arlin Stoltzfus
- Center for Advanced Research in Biotechnology (CARB), 9600 Gudelsky Drive, Rockville, Maryland 20850, USA.
|
23
|
Sverdlov AV, Rogozin IB, Babenko VN, Koonin EV. Evidence of splice signal migration from exon to intron during intron evolution. Curr Biol 2003; 13:2170-4. [PMID: 14680632 DOI: 10.1016/j.cub.2003.12.003] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 10/26/2022]
Abstract
A comparison of the nucleotide sequences around the splice junctions that flank old (shared by two or more major lineages of eukaryotes) and new (lineage-specific) introns in eukaryotic genes reveals substantial differences in the distribution of information between introns and exons. Old introns have a lower information content in the exon regions adjacent to the splice sites than new introns but have a corresponding higher information content in the intron itself. This suggests that introns insert into nonrandom (proto-splice) sites but, during the evolution of an intron after insertion, the splice signal shifts from the flanking exon regions to the ends of the intron itself. Accumulation of information inside the intron during evolution suggests that new introns largely emerge de novo rather than through propagation and migration of old introns.
Affiliation(s)
- Alexander V Sverdlov
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
24
|
Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol 2003; 13:1512-7. [PMID: 12956953 DOI: 10.1016/s0960-9822(03)00558-x] [Citation(s) in RCA: 301] [Impact Index Per Article: 14.3] [Indexed: 10/27/2022]
Abstract
Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and protists and constructed parsimonious scenarios of evolution of the exon-intron structure for the respective genes. Approximately one-third of the introns in the malaria parasite Plasmodium falciparum are shared with at least one crown group eukaryote; this number indicates that these introns have been conserved through >1.5 billion years of evolution that separate Plasmodium from the crown group. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with the fly or nematode. The inferred evolutionary scenario holds that the common ancestor of Plasmodium and the crown group and, especially, the common ancestor of animals, plants, and fungi had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, arthropods, and probably Plasmodium. In addition, numerous introns have been inserted into vertebrate and plant genes, whereas, in other lineages, intron gain was much less prominent.
Affiliation(s)
- Igor B Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
25
|
Fedorova L, Fedorov A. Introns in gene evolution. Contemporary Issues in Genetics and Evolution 2003. [DOI: 10.1007/978-94-010-0229-5_3] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022]
|
26
|
Fedorov A, Merican AF, Gilbert W. Large-scale comparison of intron positions among animal, plant, and fungal genes. Proc Natl Acad Sci U S A 2002; 99:16128-33. [PMID: 12444254 PMCID: PMC138576 DOI: 10.1073/pnas.242624899] [Citation(s) in RCA: 142] [Impact Index Per Article: 6.5] [Indexed: 11/18/2022] Open
Abstract
We purge large databases of animal, plant, and fungal intron-containing genes to a 20% similarity level and then identify the most similar animal-plant, animal-fungal, and plant-fungal protein pairs. We identify the introns in each BLAST 2.0 alignment and score matched intron positions and slid (near-matched, within six nucleotides) intron positions automatically. Overall we find that 10% of the animal introns match plant positions, and a further 7% are "slides." Fifteen percent of fungal introns match animal positions, and 13% match plant positions. Furthermore, the number of alignments with high numbers of matches deviates greatly from the Poisson expectation. The 30 animal-plant alignments with the highest matches (for which 44% of animal introns match plant positions) when aligned with fungal genes are also highly enriched for triple matches: 39% of the fungal introns match both animal and plant positions. This is strong evidence for ancestral introns predating the animal-plant-fungal divergence, and in complete opposition to any expectations based on random insertion. In examining the slid introns, we show that at least half are caused by imperfections in the alignments, and are most likely to be actual matches at common positions. Thus, our final estimates are that approximately 14% of animal introns match plant positions, and that approximately 17-18% of fungal introns match animal or plant positions, all of these being likely to be ancestral in the eukaryotes.
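The match and slide scoring described in this abstract might look roughly like the sketch below. It assumes intron positions are already expressed as nucleotide coordinates on a shared alignment; the function name, labels, and example coordinates are hypothetical, and only the six-nucleotide window comes from the abstract.

```python
def classify_introns(query_positions, target_positions, slide_window=6):
    """Label each query intron as 'match', 'slide', or 'unmatched'.

    Positions are nucleotide coordinates on a shared alignment (assumed).
    A 'slide' is a near match within `slide_window` nucleotides.
    """
    targets = set(target_positions)
    labels = {}
    for pos in query_positions:
        if pos in targets:
            labels[pos] = "match"
        elif any(abs(pos - t) <= slide_window for t in targets):
            labels[pos] = "slide"
        else:
            labels[pos] = "unmatched"
    return labels


# Hypothetical animal and plant intron coordinates on one alignment.
labels = classify_introns([10, 50, 100], [10, 54, 200])
```

Summing the labels over many alignments would give percentages comparable in kind to the match and slide fractions quoted above.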
Affiliation(s)
- Alexei Fedorov
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
|
27
|
Altenhein B, Markl J, Lieb B. Gene structure and hemocyanin isoform HtH2 from the mollusc Haliotis tuberculata indicate early and late intron hot spots. Gene 2002; 301:53-60. [PMID: 12490323 DOI: 10.1016/s0378-1119(02)01081-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022]
Abstract
We have cloned and sequenced cDNAs coding for the complete primary structure of HtH2, the second hemocyanin isoform of the marine gastropod Haliotis tuberculata. The deduced protein sequence comprises 3399 amino acids, corresponding to a molecular mass of 392 kDa. It shares only 66% of structural identity with the previously analysed first isoform HtH1, and according to a molecular clock, the two isoforms of Haliotis hemocyanin separated ca. 320 million years ago. By genomic polymerase chain reaction and 5' RACE, we have also sequenced the complete gene of HtH2 (18,598 bp), except for the 5' region in front of the secreted protein. It encompasses 15 exons and 14 introns and shows several microsatellite-rich regions. It mirrors the modular structure of the encoded hemocyanin subunit, with a linear arrangement of eight different functional units separated and bordered by seven phase 1 'linker introns'. In addition, within regions encoding three of the functional units, the HtH2 gene contains six 'internal introns'. Comparison to previously sequenced genes of Octopus dofleini hemocyanin and Haliotis hemocyanin isoform (HtH1) suggests Precambrian and Palaeozoic hot spots of intron gains, followed by 320 million years of absolute stasis.
Affiliation(s)
- Benjamin Altenhein
- Institute of Zoology, Johannes Gutenberg University, 55099, Mainz, Germany
|
28
|
Abstract
In this paper we critically review the 'classical' model for the emergence of the three domains (Archaea, Bacteria, Eucarya), which presents hyperthermophilic procaryotes as the ancestors of all life on this planet. We come to the conclusion that our last common ancestor is likely to have been rather a non-hyperthermophilic protoeucaryote endowed with sn-1,2 glycerol ester lipids (as in modern Bacteria and Eucarya), from which Archaea emerged by streamlining under pressure for adapting to heat, a process which involved an important molecular innovation: the advent of sn-2,3 glycerol ether lipids. The nature of the primeval bacterial lines of descent is less clear; it would appear, nevertheless, that the first extreme- and hyperthermophilic Bacteria emerged by converging mechanisms; lateral gene transfer from Archaea may have played a role in this adaptation.
Affiliation(s)
- Ying Xu
- Microbiology, Free University of Brussels (VUB), JM Wiame Institute for Microbiology, 1 avenue Emile Gryzon, B-1070, Brussels, Belgium.
|
29
|
Abstract
Much progress in understanding the evolution of new genes has been accomplished in the past few years. Molecular mechanisms such as illegitimate recombination and LINE element mediated 3' transduction underlying exon shuffling, a major process for generating new genes, are better understood. The identification of young genes in invertebrates and vertebrates has revealed a significant role of adaptive evolution acting on initially rudimentary gene structures created as if by evolutionary tinkers. New genes in humans and our primate relatives add a new component to the understanding of genetic divergence between humans and non-humans.
Affiliation(s)
- M Long
- Department of Ecology and Evolution, The University of Chicago, 1101 East 57th Street, Chicago Illinois 60637, USA.
|
30
|
Fedorov A, Cao X, Saxonov S, de Souza SJ, Roy SW, Gilbert W. Intron distribution difference for 276 ancient and 131 modern genes suggests the existence of ancient introns. Proc Natl Acad Sci U S A 2001; 98:13177-82. [PMID: 11687643 PMCID: PMC60844 DOI: 10.1073/pnas.231491498] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022] Open
Abstract
Do introns delineate elements of protein tertiary structure? This issue is crucial to the debate about the role and origin of introns. We present an analysis of the full set of proteins with known three-dimensional structures that have homologs with intron positions recorded in GenBank. A computer program was generated that maps on a reference sequence the positions of all introns in homologous genes. We have applied this program to a set of 665 nonredundant protein sequences with defined three-dimensional structures in the Protein Data Bank (PDB), which yielded 8,217 introns in 407 proteins. For the subset of proteins corresponding to ancient conserved regions (ACR), we find that there is a correlation of phase-zero introns with the boundary regions of modules and no correlation for the phase-one and phase-two positions. However, for a subset of proteins without prokaryotic counterparts (131 non-ACR proteins), a set of presumably modern proteins (or proteins that have diverged extremely far from any ancestral form), we do not find any correlation of phase-zero intron positions with three-dimensional structure. Furthermore, we find an anticorrelation of phase-one intron positions with module boundaries: they actually have a preference for the interior of modules. This finding is explicable as a preference for phase-one introns to lie in glycines, between G/G sequences, the preference for glycines being anticorrelated with the three-dimensional modules. We interpret this anticorrelation as a sign that a number of phase-one introns, and hence many modern introns, have been inserted into G/G "protosplice" sequences.
Affiliation(s)
- A Fedorov
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
|
31
|
Jean L, Long M, Young J, Péry P, Tomley F. Aspartyl proteinase genes from apicomplexan parasites: evidence for evolution of the gene structure. Trends Parasitol 2001; 17:491-8. [PMID: 11587964 DOI: 10.1016/s1471-4922(01)02030-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023]
Abstract
Aspartyl proteinases are a widely distributed family of enzymes. All vertebrate aspartyl proteinases share a conserved nine-exon gene structure, but in other organisms the structure of aspartyl proteinase genes varies considerably. The exon-intron patterns generally reflect phylogeny based on amino acid sequences. However, close comparison of these gene structures reveals some striking features, such as the conservation of intron positions and intron phases between aspartyl proteinases from nematodes and apicomplexans. Here, we discuss the implications of gene structure for the possible evolution of the aspartyl proteinase family, with particular reference to the plasmepsins of Plasmodium falciparum and eimepsin from Eimeria tenella.
Affiliation(s)
- L Jean
- National Institute for Medical Research, Division of Parasitology, The Ridgeway, Mill Hill, London, UK NW7 1AA.
|