1
Chen B, Shao J, Zhuang H, Wen J. Evolutionary dynamics of triosephosphate isomerase gene intron location pattern in Metazoa: A new perspective on intron evolution in animals. Gene 2017; 602:24-32. [PMID: 27864009 DOI: 10.1016/j.gene.2016.11.027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/30/2016] [Revised: 11/13/2016] [Accepted: 11/14/2016] [Indexed: 11/17/2022]
Abstract
Intron evolution, including its dynamics in the evolutionary transitions and diversification of eukaryotes, remains elusive. Inadequate taxon sampling due to data shortage, unclear phylogenetic framework, and inappropriate outgroup application might be among the causes. Besides, the integrity of all the introns within a gene was often neglected previously. Taking advantage of the ancient conserved triosephosphate isomerase gene (tim), the relatively robust phylogeny of Metazoa, and choanoflagellates as outgroup, the evolutionary dynamics of tim intron location pattern (ILP) in Metazoa was investigated. From 133 representative species of ten phyla, 30 types of ILPs were identified. A most common one, which harbors the maximum six intron positions, is deduced to be the common ancestral tim ILP of Metazoa, which almost had formed in their protozoan ancestor and was surprisingly retained and passed down till to each ancestors of metazoan phyla. In the subsequent animal diversification, it underwent different evolutionary trajectories: within Deuterostomia, it was almost completely retained only with changes in a few species with relatively recently fast-evolving histories, while within the rapidly radiating Protostomia, besides few but remarkable retention, it usually displayed extensive intron losses and a few gains. Therefore, a common ancestral exon-intron arrangement pattern of an animal gene is definitely discovered; besides the 'intron-rich view' of early animal genes being confirmed, the novel insight that high exon-intron re-arrangements of genes seem to be associated with the relatively recently rapid evolution of lineages/species/genomes but have no correlation with the ancient major evolutionary transitions in animal evolution, is revealed.
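The comparison of an observed intron location pattern (ILP) against a presumed ancestral one can be sketched as simple set arithmetic. This is only an illustration: the position labels (`p1` to `p7`) and the function name `compare_ilp` are hypothetical, and the ancestral pattern is taken as given here, whereas the study deduces it from the phylogeny.

```python
def compare_ilp(ancestral, observed):
    """Classify intron positions relative to an assumed ancestral ILP.

    Returns three sorted lists: positions retained, lost, and gained.
    """
    ancestral, observed = set(ancestral), set(observed)
    retained = sorted(ancestral & observed)  # present in both patterns
    lost = sorted(ancestral - observed)      # ancestral positions now absent
    gained = sorted(observed - ancestral)    # positions absent from the ancestor
    return retained, lost, gained


# Hypothetical labels: six ancestral positions, one derived species pattern.
ancestral_ilp = {"p1", "p2", "p3", "p4", "p5", "p6"}
species_ilp = {"p1", "p4", "p7"}

retained, lost, gained = compare_ilp(ancestral_ilp, species_ilp)
print(retained, lost, gained)
```

A lineage with "extensive intron losses and a few gains" would show a long `lost` list and a short `gained` list under this representation.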
Affiliation(s)
- Bing Chen
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China.
- Jingru Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China.
- Huifu Zhuang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China.
- Jianfan Wen
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China.
2
De Kee DW, Gopalan V, Stoltzfus A. A Sequence-Based Model Accounts Largely for the Relationship of Intron Positions to Protein Structural Features. Mol Biol Evol 2007; 24:2158-68. [PMID: 17646255 DOI: 10.1093/molbev/msm151] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
Claims of intron-structure correlations have played a major role in debates surrounding split gene origins. In the formative (as opposed to disruptive or "insertional") model of split gene origins, introns represent the scars of chimaeric gene assembly. When analyzed retrospectively, formative introns should tend to fall between modular units, if such units exist, or at least to exhibit a preference for sites favorable to chimaera formation. However, there is another possible source of preferences: under a disruptive model of split gene origins, fortuitous intron-structure correlations may arise because the gain of introns is biased with respect to flanking nucleotide sequences. To investigate the extent to which a sequence-biased intron gain model may account for the present-day distribution of introns, data on over 10,000 introns in eukaryotic protein-coding genes were integrated with structural data from a set of 1,851 nonredundant protein chains. The positions of introns with respect to secondary structures, solvent accessibility, and so-called "modules" were evaluated relative to the expectations of a null model, a disruptive model based on amino acid frequencies at splice junctions, and a formative model defined relative to these. The null model can be excluded for most structural features and is highly improbable when intron sites are grouped by reading frame phase. Phase-dependent correlations with secondary structure and side-chain surface accessibility are particularly strong. However, these phase-dependent correlations are explained largely by the sequence-based disruptive model.
Affiliation(s)
- Danny W De Kee
- Center for Advanced Research in Biotechnology, Rockville, MD, USA
3
Roy SW, Gilbert W. Rates of intron loss and gain: implications for early eukaryotic evolution. Proc Natl Acad Sci U S A 2005; 102:5773-8. [PMID: 15827119 PMCID: PMC556292 DOI: 10.1073/pnas.0500383102] [Citation(s) in RCA: 150] [Impact Index Per Article: 7.9] [Indexed: 11/18/2022]
Abstract
We study the intron-exon structures of 684 groups of orthologs from seven diverse eukaryotic genomes and provide maximum likelihood estimates for rates and numbers of intron losses and gains in these same genes for a variety of lineages. Rates of intron loss vary from approximately 2 × 10⁻⁹ to 2 × 10⁻¹⁰ per year. Rates of gain vary from 6 × 10⁻¹³ to 4 × 10⁻¹² per possible intron insertion site per year. There is an inverse correspondence between rates of intron loss and gain, leading to a 20-fold variation among lineages in the ratio of the rates of the two processes. The observed rates of intron gain are insufficient to explain the large number of introns estimated to have been present in the plant-animal ancestor, suggesting that introns present in early eukaryotes may have been created by a fundamentally different process than more recently gained introns.
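To give a feel for what rates of this magnitude imply, the following minimal sketch converts a per-year rate into a survival probability and an expected number of gains. It assumes a constant-rate (exponential) process, which is a simplification introduced here for illustration and not the maximum likelihood model of the paper; the function names, the time span of 10⁹ years, and the count of 1,000 candidate sites are likewise hypothetical.

```python
import math


def survival_probability(loss_rate_per_year, years):
    """Probability that one intron is still present after `years`,
    assuming a constant loss rate (exponential decay)."""
    return math.exp(-loss_rate_per_year * years)


def expected_gains(gain_rate_per_site_per_year, sites, years):
    """Expected number of intron gains across `sites` candidate
    insertion sites, assuming a constant per-site gain rate."""
    return gain_rate_per_site_per_year * sites * years


# Illustrative time span and site count (assumptions, not from the paper).
years = 1e9
slow_loss = survival_probability(2e-10, years)  # slowest reported loss rate
fast_loss = survival_probability(2e-9, years)   # fastest reported loss rate
gains = expected_gains(4e-12, 1000, years)      # fastest reported gain rate

print(f"survival at slow loss rate: {slow_loss:.3f}")
print(f"survival at fast loss rate: {fast_loss:.3f}")
print(f"expected gains over 1000 sites: {gains:.1f}")
```

Under these assumptions even the highest gain rate yields only a handful of new introns per thousand sites per billion years, which is the kind of arithmetic behind the statement that observed gain rates are insufficient.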
Affiliation(s)
- Scott William Roy
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA.
4
Barik S. When proteome meets genome: the alpha helix and the beta strand of proteins are eschewed by mRNA splice junctions and may define the minimal indivisible modules of protein architecture. J Biosci 2005; 29:261-73. [PMID: 15381847 PMCID: PMC2367099 DOI: 10.1007/bf02702608] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 10/22/2022]
Abstract
The significance of the intron-exon structure of genes is a mystery. As eukaryotic proteins are made up of modular functional domains, each exon was suspected to encode some form of module; however, the definition of a module remained vague. Comparison of pre-mRNA splice junctions with the three-dimensional architecture of its protein product from different eukaryotes revealed that the junctions were far less likely to occur inside the alpha-helices and beta-strands of proteins than within the more flexible linker regions ('turns' and 'loops') connecting them. The splice junctions were equally distributed in the different types of linkers and throughout the linker sequence, although a slight preference for the central region of the linker was observed. The avoidance of the alpha-helix and the beta-strand by splice junctions suggests the existence of a selection pressure against their disruption, perhaps underscoring the investment made by nature in building these intricate secondary structures. A corollary is that the helix and the strand are the smallest integral architectural units of a protein and represent the minimal modules in the evolution of protein structure. These results should find use in comparative genomics, designing of cloning strategies, and in the mutual verification of genome sequences with protein structures.
Affiliation(s)
- Sailen Barik
- Department of Biochemistry and Molecular Biology (MSB 2370), University of South Alabama, College of Medicine, 307 University Blvd., Mobile 36688-0002, USA.
5
Roy SW, Fedorov A, Gilbert W. The signal of ancient introns is obscured by intron density and homolog number. Proc Natl Acad Sci U S A 2002; 99:15513-7. [PMID: 12432089 PMCID: PMC137748 DOI: 10.1073/pnas.242600199] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
In ancient genes whose products have known 3-dimensional structures, an excess of phase zero introns (those that lie between the codons) appear in the boundaries of modules, compact regions of the polypeptide chain. These excesses are highly significant and could support the hypothesis that ancient genes were assembled by exon shuffling involving compact modules. (Phase one and two introns, and many phase zero introns, appear to arise later.) However, as more genes, with larger numbers of homologs and intron positions, were examined, the effects became smaller, dropping from a 40% excess to an 8% excess as the number of intron positions increased from 570 to 3,328, even though the statistical significance remained strong. An interpretation of this behavior is that novel inserted positions appearing in homologs washed out the signal from a finite number of ancient positions. Here we show that this is likely to be the case. Analyses of intron positions restricted to those in genes for which relatively few intron positions from homologs are known, or to those in genes with a small number of known homologous gene structures, show a significant correlation of phase zero intron positions with the module structure, which weakens as the density of attributed intron positions or the number of homologs increases. These effects do not appear for phase one and phase two introns. This finding matches the expectation of the mixed model of intron origin, in which a fraction of phase zero introns are left from the assembly of the first genes, while other introns have been added in the course of evolution.
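Intron phase, as used above, can be computed from the number of coding nucleotides that precede the intron: a remainder of zero means the intron lies between two codons. The sketch below illustrates that definition; the function names and the example offsets are hypothetical and not taken from the paper.

```python
from collections import Counter


def intron_phase(coding_nt_before_intron):
    """Return the phase (0, 1 or 2) of an intron.

    Phase 0: the intron lies between two codons.
    Phase 1: it follows the first nucleotide of a codon.
    Phase 2: it follows the second nucleotide of a codon.
    """
    if coding_nt_before_intron < 0:
        raise ValueError("offset must be non-negative")
    return coding_nt_before_intron % 3


def phase_counts(offsets):
    """Tally how many introns fall into each phase."""
    return dict(Counter(intron_phase(n) for n in offsets))


# Hypothetical coding-sequence offsets of four introns in one gene.
print(phase_counts([300, 301, 302, 600]))
```

Grouping intron positions by phase in this way is the first step before testing each group separately for correlation with module boundaries.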
Affiliation(s)
- Scott William Roy
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
6
Morinaga Y, Nomura N, Sako Y. Population Dynamics of Archaeal Mobile Introns in Natural Environments: A Shrewd Invasion Strategy of the Latent Parasitic DNA. Microbes Environ 2002. [DOI: 10.1264/jsme2.17.153] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Affiliation(s)
- Yayoi Morinaga
- Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University
- Norimichi Nomura
- Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University
- Yoshihiko Sako
- Division of Applied Biosciences, Graduate School of Agriculture, Kyoto University
7
Fedorov A, Cao X, Saxonov S, de Souza SJ, Roy SW, Gilbert W. Intron distribution difference for 276 ancient and 131 modern genes suggests the existence of ancient introns. Proc Natl Acad Sci U S A 2001; 98:13177-82. [PMID: 11687643 PMCID: PMC60844 DOI: 10.1073/pnas.231491498] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022]
Abstract
Do introns delineate elements of protein tertiary structure? This issue is crucial to the debate about the role and origin of introns. We present an analysis of the full set of proteins with known three-dimensional structures that have homologs with intron positions recorded in GenBank. A computer program was generated that maps on a reference sequence the positions of all introns in homologous genes. We have applied this program to a set of 665 nonredundant protein sequences with defined three-dimensional structures in the Protein Data Bank (PDB), which yielded 8,217 introns in 407 proteins. For the subset of proteins corresponding to ancient conserved regions (ACR), we find that there is a correlation of phase-zero introns with the boundary regions of modules and no correlation for the phase-one and phase-two positions. However, for a subset of proteins without prokaryotic counterparts (131 non-ACR proteins), a set of presumably modern proteins (or proteins that have diverged extremely far from any ancestral form), we do not find any correlation of phase-zero intron positions with three-dimensional structure. Furthermore, we find an anticorrelation of phase-one intron positions with module boundaries: they actually have a preference for the interior of modules. This finding is explicable as a preference for phase-one introns to lie in glycines, between G/G sequences, the preference for glycines being anticorrelated with the three-dimensional modules. We interpret this anticorrelation as a sign that a number of phase-one introns, and hence many modern introns, have been inserted into G/G "protosplice" sequences.
Affiliation(s)
- A Fedorov
- Department of Molecular and Cellular Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
8
Takahashi KI, Noguti T, Hojo H, Yamauchi K, Kinoshita M, Aimoto S, Ohkubo T, Gō M. A mini-protein designed by removing a module from barnase: molecular modeling and NMR measurements of the conformation. Protein Eng 1999; 12:673-80. [PMID: 10469828 DOI: 10.1093/protein/12.8.673] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract
A globular domain can be decomposed into compact modules consisting of contiguous 10-30 amino acid residues. The correlation between modules and exons observed in different proteins suggests that each module was encoded by an ancestral exon and that modules were combined into globular domains by exon fusion. Barnase is a single domain RNase consisting of 110 amino acid residues and was decomposed into six modules. We designed a mini-protein by removing the second module, M2, from barnase in order to gain an insight into the structural and functional roles of the module. In the molecular modeling of the mini-protein, we evaluated thermodynamic stability and aqueous solubility together with mechanical stability of the model. We chemically synthesized a mini-barnase with ¹⁵N-labeling at 10 residues, whose corresponding residues in barnase are all found in the region around the hydrophobic core. Circular dichroism and NMR measurements revealed that mini-barnase takes a non-random specific conformation that has a similar hydrophobic core structure to that of barnase. This result, that a module could be deleted without altering the structure of core region of barnase, supports the view that modules act as the building blocks of protein design.
Affiliation(s)
- K i Takahashi
- Division of Biological Science, Graduate School of Science, Nagoya University, Furo-cho, Chikusa, Nagoya 464-8602, Japan
9
Tsuji T, Yoshida K, Satoh A, Kohno T, Kobayashi K, Yanagawa H. Foldability of barnase mutants obtained by permutation of modules or secondary structure units. J Mol Biol 1999; 286:1581-96. [PMID: 10064693 DOI: 10.1006/jmbi.1998.2558] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
Modules, defined as stable, compact structure units in a globular protein, are good candidates for the construction of novel foldable proteins by permutation. Here we decomposed barnase into six modules (M1-M6) and constructed 23 barnase mutants containing permutations of the internal four (M2-M5) out of six modules. Globular proteins can also be subdivided into secondary structure units based on the extended structures that control the mutual relationships of the modules. We also decomposed barnase into six secondary structure units (S1-S6) and constructed 21 barnase mutants containing permutations of the internal four (S2-S5) out of six secondary structure units. Foldability of these two types of mutants was assessed by means of circular dichroism, fluorescence, and 1H-NMR measurements. A total of 15 of 23 module mutants and 15 of 21 secondary structure unit mutants formed definite secondary structures, such as alpha-helix and beta-sheet, at 20 microM owing to intermolecular interactions, but most of them converted to random coil structures at a lower concentration (1 microM). Of the 44 mutants, only two, M3245 and S2543, gave distinct near-UV CD spectra. S2543 especially showed definite signal dispersion in the amide and methyl regions of the 1H-NMR spectrum, though M3245 did not. Furthermore, urea-induced unfolding of S2543 monitored by far-UV CD and fluorescence measurements showed a distinct cooperative transition. These results strongly suggest that S2543 takes partially folded conformations in aqueous solution. Our results also suggest that building blocks such as secondary structure units capable of taking different stable conformations by adapting themselves to the surrounding environment, rather than building blocks such as modules having a specified stable conformation, are required for the formation of foldable proteins. Therefore, the use of secondary structure units for the construction of novel globular proteins is likely to be an effective approach.
Affiliation(s)
- T Tsuji
- Department of Chemistry and Biotechnology, Yokohama National University, Tokiwadai Hodogaya-ku, Yokohama, 240, Japan
10
Abstract
Does the intron/exon structure of eukaryotic genes belie their ancient assembly by exon-shuffling or have introns been inserted into preformed genes during eukaryotic evolution? These are the central questions in the ongoing 'introns-early' versus 'introns-late' controversy. The phylogenetic distribution of spliceosomal introns continues to strongly favor the introns-late theory. The introns-early theory, however, has claimed support from intron phase and protein structure correlations.
Affiliation(s)
- J M Logsdon
- Department of Biochemistry, Dalhousie University, Halifax, Nova Scotia, B3H 4H7, Canada.
11
Tyshenko MG, Walker VK. Towards a reconciliation of the introns early or late views: triosephosphate isomerase genes from insects. Biochim Biophys Acta 1997; 1353:131-6. [PMID: 9294007 DOI: 10.1016/s0167-4781(97)00065-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
The gene encoding the glycolytic enzyme, triosephosphate isomerase (TPI; EC 5.3.1.1), is a favourite model for molecular evolutionists who either subscribe to the theory that introns co-evolved with the ancestral gene, the introns early view, or alternatively, that introns are more recent immigrants. The discovery of an intron in the TPI gene of Culex mosquitoes at a site which was predicted by proponents of the intron early school supported that theory. More recently, the discovery of additional intron sites in several eukaryotes was presented as evidence supporting the introns late school. We have found the 'Culex intron' in two closely related mosquitoes, but not in two more evolutionary primitive Dipterans, suggesting that, if it is an 'ancient intron', loss may be more frequent than that supposed by the intron late school. In addition, we have found that three introns punctuating the TPI gene from the Lepidopteran, Heliothis, appear to be ancestrally related and may be the result of transposable element insertion, 50-90 million years ago. It is argued that both opposing schools in the intron debate be reconciled -- some introns may have been early and certainly others have arrived subsequent to the appearance of the TPI gene.
Affiliation(s)
- M G Tyshenko
- Department of Biology, Queen's University, Kingston, Ont., Canada
12
Rzhetsky A, Ayala FJ, Hsu LC, Chang C, Yoshida A. Exon/intron structure of aldehyde dehydrogenase genes supports the "introns-late" theory. Proc Natl Acad Sci U S A 1997; 94:6820-5. [PMID: 9192649 PMCID: PMC21242 DOI: 10.1073/pnas.94.13.6820] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.0] [Indexed: 02/04/2023]
Abstract
Whether or not nuclear introns predate the divergence of bacteria and eukaryotes is the central argument between the proponents of the "introns-early" and "introns-late" theories. In this study we compared the goodness-of-fit of each theory with a probabilistic model of exon/intron evolution and multiple nonallelic genes encoding human aldehyde dehydrogenases (ALDHs). Using a reconstructed phylogenetic tree of ALDH genes, we computed the likelihood of obtaining the present-day ALDH sequences under the assumptions of each competing theory. Although on the grounds of its own assumptions each theory accounted for the ALDH data significantly better than its rival, the introns-early model required frequent intron slippage, and the estimated slippage rates were too high to be consistent with reported correlations between the boundaries of ancient protein modules and the ends of ancient exons. Because the molecular mechanisms proposed to explain intron slippage are incapable of providing such high rates and are incompatible with the observed distribution of introns in higher eukaryotes, the ALDH data support the introns-late theory.
Affiliation(s)
- A Rzhetsky
- Institute of Molecular Evolutionary Genetics and Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
13
Keeling PJ, Doolittle WF. Evidence that eukaryotic triosephosphate isomerase is of alpha-proteobacterial origin. Proc Natl Acad Sci U S A 1997; 94:1270-5. [PMID: 9037042 PMCID: PMC19780 DOI: 10.1073/pnas.94.4.1270] [Citation(s) in RCA: 87] [Impact Index Per Article: 3.2] [Indexed: 02/03/2023]
Abstract
We have cloned and sequenced genes for triosephosphate isomerase (TPI) from the gamma-proteobacterium Francisella tularensis, the green non-sulfur bacterium Chloroflexus aurantiacus, and the alpha-proteobacterium Rhizobium etli and used these in phylogenetic analysis with TPI sequences from other members of the Bacteria, Archaea, and Eukarya. These analyses show that eukaryotic TPI genes are most closely related to the homologue from the alpha-proteobacterium and most distantly related to archaebacterial homologues. This relationship suggests that the TPI genes present in modern eukaryotic genomes were derived from an alpha-proteobacterial genome (possibly that of the protomitochondrial endosymbiont) after the divergence of Archaea and Eukarya. Among these eukaryotic genes are some from deeply branching, amitochondrial eukaryotes (namely Giardia), which further suggests that this event took place quite early in eukaryotic evolution.
Affiliation(s)
- P J Keeling
- Department of Biochemistry, Dalhousie University, Halifax, NS Canada
14
Gilson PR, McFadden GI. The miniaturized nuclear genome of eukaryotic endosymbiont contains genes that overlap, genes that are cotranscribed, and the smallest known spliceosomal introns. Proc Natl Acad Sci U S A 1996; 93:7737-42. [PMID: 8755545 PMCID: PMC38817 DOI: 10.1073/pnas.93.15.7737] [Citation(s) in RCA: 98] [Impact Index Per Article: 3.5] [Indexed: 02/02/2023]
Abstract
Chlorarachniophyte algae contain a complex, multi-membraned chloroplast derived from the endosymbiosis of a eukaryotic alga. The vestigial nucleus of the endosymbiont, called the nucleomorph, contains only three small linear chromosomes with a haploid genome size of 380 kb and is the smallest known eukaryotic genome. Nucleotide sequence data from a subtelomeric fragment of chromosome III were analyzed as a preliminary investigation of the coding capacity of this vestigial genome. Several housekeeping genes including U6 small nuclear RNA (snRNA), ribosomal proteins S4 and S13, a core protein of the spliceosome [small nuclear ribonucleoprotein (snRNP) E], and a clp-like protease (clpP) were identified. Expression of these genes was confirmed by combinations of Northern blot analysis, in situ hybridization, immunocytochemistry, and cDNA analysis. The protein-encoding genes are typically eukaryotic in overall structure and their messenger RNAs are polyadenylylated. A novel feature is the abundance of 18-, 19-, or 20-nucleotide introns; the smallest spliceosomal introns known. Two of the genes, U6 and S13, overlap while another two genes, snRNP E and clpP, are cotranscribed in a single mRNA. The overall gene organization is extraordinarily compact, making the nucleomorph a unique model for eukaryotic genomics.
Affiliation(s)
- P R Gilson
- Plant Cell Biology Research Centre, School of Botany, University of Melbourne, Australia
15
Abstract
Close analysis of intron phase - the position of introns within codons - is claimed to provide novel evidence supporting the view that introns predate the divergence of bacteria and eukaryotes and, via 'exon shuffling', played a crucial role in protein evolution. But just how compelling is this evidence?
Affiliation(s)
- L D Hurst
- Department of Genetics, Downing Street, Cambridge, CB2 3EH, UK
16
Proudhon D, Wei J, Briat J, Theil EC. Ferritin gene organization: differences between plants and animals suggest possible kingdom-specific selective constraints. J Mol Evol 1996; 42:325-36. [PMID: 8661994 DOI: 10.1007/bf02337543] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.5] [Indexed: 02/01/2023]
Abstract
Ferritin, a protein widespread in nature, concentrates iron approximately 10¹¹-10¹²-fold above the solubility within a spherical shell of 24 subunits; it derives in plants and animals from a common ancestor (based on sequence) but displays a cytoplasmic location in animals compared to the plastid in contemporary plants. Ferritin gene regulation in plants and animals is altered by development, hormones, and excess iron; iron signals target DNA in plants but mRNA in animals. Evolution has thus conserved the two end points of ferritin gene expression, the physiological signals and the protein structure, while allowing some divergence of the genetic mechanisms. Comparison of ferritin gene organization in plants and animals, made possible by the cloning of a dicot (soybean) ferritin gene presented here and the recent cloning of two monocot (maize) ferritin genes, shows evolutionary divergence in ferritin gene organization between plants and animals but conservation among plants or among animals; divergence in the genetic mechanism for iron regulation is reflected by the absence in all three plant genes of the IRE, a highly conserved, noncoding sequence in vertebrate animal ferritin mRNA. In plant ferritin genes, the number of introns (n = 7) is higher than in animals (n = 3). Second, no intron positions are conserved when ferritin genes of plants and animals are compared, although all ferritin gene introns are in the coding region; within kingdoms, the intron positions in ferritin genes are conserved. Finally, secondary protein structure has no apparent relationship to intron/exon boundaries in plant ferritin genes, whereas in animal ferritin genes the correspondence is high. The structural differences in introns/exons among phylogenetically related ferritin coding sequences and the high conservation of the gene structure within plant or animal kingdoms suggest that kingdom-specific functional constraints may exist to maintain a particular intron/exon pattern within ferritin genes. In the case of plants, where ferritin gene intron placement is unrelated to triplet codons or protein structure, and where ferritin is targeted to the plastid, the selection pressure on gene organization may relate to RNA function and plastid/nuclear signaling.
Affiliation(s)
- D Proudhon
- Department of Biochemistry, North Carolina State University, NCSU Box 7622, Raleigh, NC 27695-7622, USA
17
Delboni LF, Mande SC, Rentier-Delrue F, Mainfroid V, Turley S, Vellieux FM, Martial JA, Hol WG. Crystal structure of recombinant triosephosphate isomerase from Bacillus stearothermophilus. An analysis of potential thermostability factors in six isomerases with known three-dimensional structures points to the importance of hydrophobic interactions. Protein Sci 1995; 4:2594-604. [PMID: 8580851 PMCID: PMC2143043 DOI: 10.1002/pro.5560041217] [Citation(s) in RCA: 80] [Impact Index Per Article: 2.8] [Indexed: 01/31/2023]
Abstract
The structure of the thermostable triosephosphate isomerase (TIM) from Bacillus stearothermophilus complexed with the competitive inhibitor 2-phosphoglycolate was determined by X-ray crystallography to a resolution of 2.8 Å. The structure was solved by molecular replacement using XPLOR. Twofold averaging and solvent flattening was applied to improve the quality of the map. Active sites in both the subunits are occupied by the inhibitor and the flexible loop adopts the "closed" conformation in either subunit. The crystallographic R-factor is 17.6% with good geometry. The two subunits have an RMS deviation of 0.29 Å for 248 Cα atoms and have average temperature factors of 18.9 and 15.9 Å², respectively. In both subunits, the active site Lys 10 adopts an unusual phi, psi combination. A comparison between the six known thermophilic and mesophilic TIM structures was conducted in order to understand the higher stability of B. stearothermophilus TIM. Although the ratio Arg/(Arg+Lys) is higher in B. stearothermophilus TIM, the structure comparisons do not directly correlate this higher ratio to the better stability of the B. stearothermophilus enzyme. A higher number of prolines contributes to the higher stability of B. stearothermophilus TIM. Analysis of the known TIM sequences points out that the replacement of a structurally crucial asparagine by a histidine at the interface of monomers, thus avoiding the risk of deamidation and thereby introducing a negative charge at the interface, may be one of the factors for adaptability at higher temperatures in the TIM family. Analysis of buried cavities and the areas lining these cavities also contributes to the greater thermal stability of the B. stearothermophilus enzyme. However, the most outstanding result of the structure comparisons appears to point to the hydrophobic stabilization of dimer formation by burying the largest amount of hydrophobic surface area in B. stearothermophilus TIM compared to all five other known TIM structures.
Affiliation(s)
- L F Delboni
- Department of Biological Structure, School of Medicine, University of Washington, Seattle 98195, USA
|
18
|
Gómez-Puyou A, Saavedra-Lira E, Becker I, Zubillaga RA, Rojo-Domínguez A, Pérez-Montfort R. Using evolutionary changes to achieve species-specific inhibition of enzyme action--studies with triosephosphate isomerase. Chemistry & Biology 1995; 2:847-55. [PMID: 8807818 DOI: 10.1016/1074-5521(95)90091-8] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.1] [Indexed: 02/02/2023]
Abstract
BACKGROUND: Many studies that attempt to design species-specific drugs focus on differences in the three-dimensional structures of homologous enzymes. The structures of homologous enzymes are generally well conserved especially at the active site, but the amino-acid sequences are often very different. We reasoned that if a non-conserved amino acid is fundamental to the function or stability of an enzyme from one particular species, one should be able to inhibit only the enzyme from that species by using an inhibitor targeted to that residue. We set out to test this hypothesis in a model system. RESULTS: We first identified a non-conserved amino acid (Cys14) whose integrity is important for catalysis in triosephosphate isomerase (TIM) from Trypanosoma brucei. The equivalent residues in rabbit and yeast TIM are Met and Leu, respectively. A Cys14Leu mutant of trypanosomal TIM had a tendency to aggregate, reduced stability and altered kinetics. To model the effects of a molecule targeted to Cys14, we used methyl methanethiosulfonate (MMTS) to derivatize Cys14 to a methyl sulfide. This treatment dramatically inhibited TIMs with a Cys residue at a position equivalent to Cys14, but not rabbit TIM (20% inhibition) or yeast TIM (negligible inhibition), which lack this residue. CONCLUSIONS: Cys14 of trypanosomal TIM is a non-conserved amino acid whose alteration leads to loss of enzyme structure and function. TIMs that have a cysteine residue at position 14 could be selectively inhibited by MMTS. This approach may offer an alternative route to species-specific enzyme inhibition.
Affiliation(s)
- A Gómez-Puyou
- Departamento de Bioenergética, Universidad Nacional Autónoma de México, México DF
|
19
|
Logsdon JM, Tyshenko MG, Dixon C, D-Jafari J, Walker VK, Palmer JD. Seven newly discovered intron positions in the triose-phosphate isomerase gene: evidence for the introns-late theory. Proc Natl Acad Sci U S A 1995; 92:8507-11. [PMID: 7667320 PMCID: PMC41186 DOI: 10.1073/pnas.92.18.8507] [Citation(s) in RCA: 99] [Impact Index Per Article: 3.4] [Indexed: 01/26/2023]
Abstract
The gene encoding the glycolytic enzyme triose-phosphate isomerase (TPI; EC 5.3.1.1) has been central to the long-standing controversy on the origin and evolutionary significance of spliceosomal introns by virtue of its pivotal support for the introns-early view, or exon theory of genes. Putative correlations between intron positions and TPI protein structure have led to the conjecture that the gene was assembled by exon shuffling, and five TPI intron positions are old by the criterion of being conserved between animals and plants. We have sequenced TPI genes from three diverse eukaryotes (the basidiomycete Coprinus cinereus, the nematode Caenorhabditis elegans, and the insect Heliothis virescens) and have found introns at seven novel positions that disrupt previously recognized gene/protein structure correlations. The set of 21 TPI introns now known is consistent with a random model of intron insertion. Twelve of the 21 TPI introns appear to be of recent origin since each is present in but a single examined species. These results, together with their implication that as more TPI genes are sequenced more intron positions will be found, render TPI untenable as a paradigm for the introns-early theory and, instead, support the introns-late view that spliceosomal introns have been inserted into preexisting genes during eukaryotic evolution.
Affiliation(s)
- J M Logsdon
- Department of Biology, Indiana University, Bloomington 47405, USA
|
20
|
Kwiatowski J, Krawczyk M, Kornacki M, Bailey K, Ayala FJ. Evidence against the exon theory of genes derived from the triose-phosphate isomerase gene. Proc Natl Acad Sci U S A 1995; 92:8503-6. [PMID: 7667319 PMCID: PMC41185 DOI: 10.1073/pnas.92.18.8503] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.1] [Indexed: 01/26/2023]
Abstract
The exon theory of genes proposes that the introns of protein-encoding nuclear genes are remnants of the DNA spacers between ancient minigenes. The discovery of an intron at a predicted position in the triose-phosphate isomerase (EC 5.3.1.1) gene of Culex mosquitoes has been hailed as an evidential pillar of the theory. We have found that that intron is also present in Aedes mosquitoes, which are closely related to Culex, but not in the phylogenetically more distant Anopheles, nor in the fly Calliphora vicina, nor in the moth Spodoptera littoralis. The presence of this intron in Culex and Aedes is parsimoniously explained as the result of an insertion in a recent common ancestor of these two species rather than as the remnant of an ancient intron. The absence of the intron in 19 species of very diverse organisms requires at least 10 independent evolutionary losses in order to be consistent with the exon theory.
|
21
|
Schmidt M, Svendsen I, Feierabend J. Analysis of the primary structure of the chloroplast isozyme of triosephosphate isomerase from rye leaves by protein and cDNA sequencing indicates a eukaryotic origin of its gene. Biochimica et Biophysica Acta 1995; 1261:257-64. [PMID: 7711069 DOI: 10.1016/0167-4781(95)00015-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Indexed: 01/26/2023]
Abstract
The primary structure of the chloroplast isozyme of triosephosphate isomerase from rye leaves was identified by protein and cDNA sequencing and compared to the deduced amino acid sequence of a cDNA for the cytosolic isozyme. The mature cytosolic and chloroplast isozyme proteins share 64% amino acid sequence identity. The cDNA for the chloroplast isozyme codes for a precursor protein consisting of an N-terminal transit peptide of Mr 4351 and a mature subunit of Mr 27,282. Southern blot analysis indicates that the two rye isozymes are encoded by two independent single genes. Amino acid residues or sequence regions of basic functional relevance in known triosephosphate isomerases are strictly conserved in the chloroplast isozyme. The chloroplast isozyme contains 6 cysteine residues, instead of 4 in the cytosolic isozyme. A cysteine at position 143 of the chloroplast isozyme appears to be modified. Phylogenetic trees constructed on the basis of sequence comparisons for triosephosphate isomerases from different species of all major taxonomic groups indicate that the chloroplast isozyme is much more closely related to eukaryotic cytosolic enzymes than to eubacterial enzymes. The results indicate that the nuclear gene for the chloroplast isozyme originated with that for the cytosolic isozyme through duplication of an ancestral eukaryotic gene, rather than through gene transfer from a prokaryotic endosymbiont.
Affiliation(s)
- M Schmidt
- Botanisches Institut, J.W. Goethe-Universität, Frankfurt am Main, Germany
|
22
|
Benevolenskaya EV, Nurminsky DI, Gvozdev VA. Structure of the Drosophila melanogaster annexin X gene. DNA Cell Biol 1995; 14:349-57. [PMID: 7710691 DOI: 10.1089/dna.1995.14.349] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
Abstract
The annexin X gene was cloned in the P1 recombinant phage carrying a genomic sequence approximately 70 kb long. This DNA fragment encompasses at least two annexin X copies and several 7.8-kb tandem units represented by an anonymous sequence fused to the 3' truncated part of the annexin X gene. The proteins of the annexin family contain a variable amino-terminal domain and a core domain; the latter includes four structurally conserved repeats that presumably arose as a result of duplications. The annexin X gene of Drosophila is about 2 kb long and contains four exons. Exon 1 encodes four amino-terminal amino acids, exon 2 encodes the remaining part of the amino-terminal domain and the three conserved repeats, and exon 3 and exon 4 encode the fourth repeat. The positions of introns 2 and 3 are strictly conserved with respect to both the amino acid position and codon phase as compared to introns 10 and 12 of the fourth repeat in vertebrate annexin genes. We propose the existence of a primordial annexin coding structure comprising at least two introns whose duplications during evolution have been followed by the loss of ancient introns in the first three repeats of Drosophila and vertebrates. Acquisition of new introns in vertebrates is inferred from the fact that exon borders are not found at homologous locations in the four repeats of a given vertebrate annexin. Transcription of the annexin gene was detected in embryonic cell cultures. No profound effects of ecdysterone on the annexin X message content in cell cultures were observed.
|
23
|
Henze K, Schnarrenberger C, Kellermann J, Martin W. Chloroplast and cytosolic triosephosphate isomerases from spinach: purification, microsequencing and cDNA cloning of the chloroplast enzyme. Plant Molecular Biology 1994; 26:1961-73. [PMID: 7858230 DOI: 10.1007/bf00019506] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.7] [Indexed: 05/11/2023]
Abstract
Chloroplast and cytosolic triosephosphate isomerases from spinach were separated and purified to homogeneity. Both enzymes were partially sequenced by Edman degradation. Using degenerate primers designed against the amino acid sequences, a homologous probe for the chloroplast enzyme was amplified and used to isolate several full-size cDNA clones. Chloroplast triosephosphate isomerase is encoded by a single gene in spinach. Analysis of the chloroplast cDNA sequence in the context of its homologues from eukaryotes and eubacteria reveals that the gene arose through duplication of its preexisting nuclear counterpart for the cytosolic enzyme during plant evolution.
Affiliation(s)
- K Henze
- Institut für Genetik, Technische Universität Braunschweig, FRG, Germany
|
24
|
|
25
|
Stoltzfus A, Spencer DF, Zuker M, Logsdon JM, Doolittle WF. Testing the exon theory of genes: the evidence from protein structure. Science 1994; 265:202-7. [PMID: 8023140 DOI: 10.1126/science.8023140] [Citation(s) in RCA: 158] [Impact Index Per Article: 5.3] [Indexed: 01/28/2023]
Abstract
A tendency for exons to correspond to discrete units of protein structure in protein-coding genes of ancient origin would provide clear evidence in favor of the exon theory of genes, which proposes that split genes arose not by insertion of introns into unsplit genes, but from combinations of primordial mini-genes (exons) separated by spacers (introns). Although putative examples of such correspondence have strongly influenced previous debate on the origin of introns, a general correspondence has not been rigorously proved. Objective methods for detecting correspondences were developed and applied to four examples that have been cited previously as evidence of the exon theory of genes. No significant correspondence between exons and units of protein structure was detected, suggesting that the putative correspondence does not exist and that the exon theory of genes is untenable.
Affiliation(s)
- A Stoltzfus
- Department of Biochemistry, Dalhousie University, Halifax, Nova Scotia, Canada
|
26
|
Mande SC, Mainfroid V, Kalk KH, Goraj K, Martial JA, Hol WG. Crystal structure of recombinant human triosephosphate isomerase at 2.8 Å resolution. Triosephosphate isomerase-related human genetic disorders and comparison with the trypanosomal enzyme. Protein Sci 1994; 3:810-21. [PMID: 8061610 PMCID: PMC2142725 DOI: 10.1002/pro.5560030510] [Citation(s) in RCA: 99] [Impact Index Per Article: 3.3] [Indexed: 01/28/2023]
Abstract
The crystal structure of recombinant human triosephosphate isomerase (hTIM) has been determined complexed with the transition-state analogue 2-phosphoglycolate at a resolution of 2.8 Å. After refinement, the R-factor is 16.7% with good geometry. The asymmetric unit contains 1 complete dimer of 53,000 Da, with only 1 of the subunits binding the inhibitor. The so-called flexible loop, comprising residues 168-174, is in its "closed" conformation in the subunit that binds the inhibitor, and in the "open" conformation in the other subunit. The tips of the loop in these 2 conformations differ by up to 7 Å in position. The RMS difference between hTIM and the enzyme of Trypanosoma brucei, the causative agent of sleeping sickness, is 1.12 Å for 487 Cα positions with 53% sequence identity. Significant sequence differences between the human and parasite enzymes occur at about 13 Å from the phosphate binding site. The chicken and human enzymes have an RMS difference of 0.69 Å for 484 equivalent residues and about 90% sequence identity. Complementary mutations ensure a great similarity in the packing of side chains in the core of the beta-barrels of these 2 enzymes. Three point mutations in hTIM have been correlated with severe genetic disorders ranging from hemolytic disorder to neuromuscular impairment. Knowledge of the structure of the human enzyme provides insight into the probable effect of 2 of these mutations, Glu 104 to Asp and Phe 240 to Ile, on the enzyme. The third mutation reported to be responsible for a genetic disorder, Gly 122 to Arg, is however difficult to explain. This residue is far away from both catalytic centers in the dimer, as well as from the dimer interface, and seems unlikely to affect stability or activity. Inspection of the 3-dimensional structure of trypanosomal triosephosphate isomerase, which has a methionine at position 122, only increased the mystery of the effects of the Gly to Arg mutation in the human enzyme.
Affiliation(s)
- S C Mande
- Department of Biological Structure, School of Medicine, University of Washington, Seattle 98195
|
27
|
Kersanach R, Brinkmann H, Liaud MF, Zhang DX, Martin W, Cerff R. Five identical intron positions in ancient duplicated genes of eubacterial origin. Nature 1994; 367:387-9. [PMID: 8114942 DOI: 10.1038/367387a0] [Citation(s) in RCA: 88] [Impact Index Per Article: 2.9] [Indexed: 01/28/2023]
Abstract
In 1985 Cornish-Bowden wrote "although there is now much to suggest that introns are an ancient relic of primordial genes, convincing proof must await the discovery of clearly corresponding intron arrangements in genes that arose by duplication before the separation of prokaryotes and eukaryotes". Genes for chloroplast and cytosolic glyceraldehyde-3-phosphate dehydrogenases of eukaryotes are descendants of an ancient gene family that existed in the common ancestor of extant eubacteria. During eukaryotic evolution, both genes were transferred to the nucleus from the antecedents of present-day chloroplasts and mitochondria, respectively. Here we report the discovery of five spliceosomal introns at positions that are precisely conserved between nuclear genes for this chloroplast/cytosol enzyme pair. These data provide strong evidence in favour of the 'introns early' hypothesis, which proposes that introns were present in the earliest cells, consistent with the idea that introns facilitated the assembly of primordial genes by accelerating the rate of exon shuffling.
Affiliation(s)
- R Kersanach
- Institut für Genetik, Universität Braunschweig, Germany
|
28
|
Abstract
We discuss some of the arguments for introns arising early or late in evolution. We outline the exon theory of genes and discuss the series of discoveries of introns in the gene (TPI) encoding triosephosphate isomerase (TPI) that have filled out a series of better fits to the Go plot, culminating in the 1986 prediction of an intron position that was finally discovered in 1992. We present a statistical argument that the 11-intron structure of TPI (based on attributing all of the introns to an ancestral gene and interpreting three cases of very close intron positions as examples of sliding) has a clear relationship to the protein structure. The exons of this 11-intron TPI are a better approximation to Mitiko Go's modules (Go, 1981) than are 99.9% of all alternative exon patterns corresponding to 11 introns placed randomly in the gene, and better than 96% of all alternative patterns in which the lengths of the exons are preserved while the introns are moved. We combine four tests relating exons to protein structure: (i) whether the exons are compact modules, (ii) whether the exons contain most of the close contacts in the protein, (iii) whether the exon configuration maximizes buried surface area along the backbone, and (iv) whether the exons maximize their content of hydrogen bonds. On a joint measure for these tests, the native exon structure with 11 introns fits these tests better than 99.4% of all alternative structures obtained by permuting the exon lengths and intron positions. (ABSTRACT TRUNCATED AT 250 WORDS)
Affiliation(s)
- W Gilbert
- Biological Laboratories, Harvard University, Cambridge, MA 02138
|
29
|
|
30
|
Hallick RB, Hong L, Drager RG, Favreau MR, Monfort A, Orsat B, Spielmann A, Stutz E. Complete sequence of Euglena gracilis chloroplast DNA. Nucleic Acids Res 1993; 21:3537-44. [PMID: 8346031 PMCID: PMC331456 DOI: 10.1093/nar/21.15.3537] [Citation(s) in RCA: 288] [Impact Index Per Article: 9.3] [Indexed: 01/30/2023]
Abstract
We report the complete DNA sequence of the Euglena gracilis, Pringsheim strain Z chloroplast genome. This circular DNA is 143,170 bp, counting only one copy of a 54 bp tandem repeat sequence that is present in variable copy number within a single culture. The overall organization of the genome involves a tandem array of three complete and one partial ribosomal RNA operons, and a large single copy region. There are genes for the 16S, 5S, and 23S rRNAs of the 70S chloroplast ribosomes, 27 different tRNA species, 21 ribosomal proteins plus the gene for elongation factor EF-Tu, three RNA polymerase subunits, and 27 known photosynthesis-related polypeptides. Several putative genes of unknown function have also been identified, including five within large introns, and five with amino acid sequence similarity to genes in other organisms. This genome contains at least 149 introns. There are 72 individual group II introns, 46 individual group III introns, 10 group II introns and 18 group III introns that are components of twintrons (introns-within-introns), and three additional introns suspected to be twintrons composed of multiple group II and/or group III introns, but not yet characterized. At least 54,804 bp, or 38.3% of the total DNA content is represented by introns.
Affiliation(s)
- R B Hallick
- Department of Biochemistry, University of Arizona, Tucson 85721
|
31
|
|
32
|
Affiliation(s)
- N J Dibb
- Department of Haematology, Royal Postgraduate Medical School, Hammersmith Hospital, London, UK
|
33
|
|
34
|
Affiliation(s)
- D J Murphy
- Department of Brassica and Oilseeds Research, John Innes Centre, Norwich, UK
|