1
Patthy L. Exon Shuffling Played a Decisive Role in the Evolution of the Genetic Toolkit for the Multicellular Body Plan of Metazoa. Genes (Basel) 2021; 12:382. [PMID: 33800339 PMCID: PMC8001218 DOI: 10.3390/genes12030382] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/24/2021] [Revised: 03/01/2021] [Accepted: 03/04/2021] [Indexed: 11/30/2022] Open
Abstract
Division of labor and establishment of the spatial pattern of different cell types of multicellular organisms require cell type-specific transcription factor modules that control cellular phenotypes and proteins that mediate the interactions of cells with other cells. Recent studies indicate that, although constituent protein domains of numerous components of the genetic toolkit of the multicellular body plan of Metazoa were present in the unicellular ancestor of animals, the repertoire of multidomain proteins that are indispensable for the arrangement of distinct body parts in a reproducible manner evolved only in Metazoa. We have shown that the majority of the multidomain proteins involved in cell-cell and cell-matrix interactions of Metazoa have been assembled by exon shuffling, but there is no evidence for a similar role of exon shuffling in the evolution of proteins of metazoan transcription factor modules. A possible explanation for this difference in the intracellular and intercellular toolkits is that evolution of the transcription factor modules preceded the burst of exon shuffling that led to the creation of the proteins controlling spatial patterning in Metazoa. This explanation is in harmony with the temporal-to-spatial transition hypothesis of multicellularity that proposes that cell differentiation may have predated spatial segregation of cell types in animal ancestors.
Affiliation(s)
- Laszlo Patthy
- Institute of Enzymology, Research Centre for Natural Sciences, H-1117 Budapest, Hungary
3
Patthy L. Exon skipping-rich transcriptomes of animals reflect the significance of exon-shuffling in metazoan proteome evolution. Biol Direct 2019; 14:2. [PMID: 30651122 PMCID: PMC6335736 DOI: 10.1186/s13062-019-0231-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/06/2018] [Accepted: 01/04/2019] [Indexed: 12/31/2022] Open
Abstract
Animals are known to have higher rates of exon skipping than other eukaryotes. In a recent study, Grau-Bové et al. (Genome Biology 19:135, 2018) have used RNA-seq data across 65 eukaryotic species to investigate when and how this high prevalence of exon skipping evolved. They have found that bilaterian Metazoa have significantly increased exon skipping frequencies compared to all other eukaryotic groups and that exon skipping in nearly all animals, including non-bilaterians, is strongly enriched for frame-preserving events. The authors have hypothesized that “the increase of exon skipping rates in animals followed a two-step process. First, exon skipping in early animals became enriched for frame-preserving events. Second, bilaterian ancestors dramatically increased their exon skipping frequencies, likely driven by the interplay between a shift in their genome architectures towards more exon definition and recruitment of frame-preserving exon skipping events to functionally diversify their cell-specific proteomes.” Here we offer a different explanation for the higher frequency of frame-preserving exon skipping in Metazoa than in all other eukaryotes. In our view these observations reflect the fact that the majority of multidomain proteins unique to Metazoa and indispensable for metazoan-type multicellularity were assembled by exon-shuffling from ‘symmetrical’ modules (i.e. modules flanked by introns of the same phase), whereas this type of protein evolution played a minor role in other groups of eukaryotes, including plants. The higher frequency of ‘symmetrical’ exons in metazoan genomes provides an explanation for the enrichment for frame-preserving events since skipping or inclusion of ‘symmetrical’ modules during alternative splicing does not result in a reading-frame shift. Reviewers: This article was reviewed by Manuel Irimia, Ashish Lal and Erez Levanon. The reviewers were nominated by the Editorial Board.
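The frame-preservation argument in this abstract can be illustrated with a minimal Python sketch. It assumes exon boundaries are given as 0-based, end-exclusive offsets into the coding sequence; the function names and the example coordinates are hypothetical and are not taken from the paper.

```python
def intron_phase(cds_offset: int) -> int:
    """Phase of an intron that interrupts the coding sequence after
    `cds_offset` nucleotides: 0 = between codons, 1 = after the first
    nucleotide of a codon, 2 = after the second."""
    return cds_offset % 3


def is_symmetrical(exon_start: int, exon_end: int) -> bool:
    """True if the exon is flanked by introns of the same phase.

    `exon_start` and `exon_end` are hypothetical CDS offsets
    (0-based, end-exclusive) of the exon.
    """
    return intron_phase(exon_start) == intron_phase(exon_end)


def skipping_preserves_frame(exon_start: int, exon_end: int) -> bool:
    """Skipping an exon keeps the downstream reading frame only if the
    exon length is a multiple of three, which is the same condition as
    the exon being symmetrical."""
    return (exon_end - exon_start) % 3 == 0


# Illustrative coordinates only.
print(is_symmetrical(10, 130), skipping_preserves_frame(10, 130))  # True True
print(is_symmetrical(10, 131), skipping_preserves_frame(10, 131))  # False False
```

The two predicates always agree, which is why an excess of symmetrical exons would be expected to show up as an excess of frame-preserving skipping events.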
Affiliation(s)
- Laszlo Patthy
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, H-1117, Hungary.
4
Harish A, Kurland CG. Akaryotes and Eukaryotes are independent descendants of a universal common ancestor. Biochimie 2017; 138:168-183. [PMID: 28461155 DOI: 10.1016/j.biochi.2017.04.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 01/03/2017] [Accepted: 04/25/2017] [Indexed: 11/29/2022]
Abstract
We reconstructed a global tree of life (ToL) with non-reversible and non-stationary models of genome evolution that root trees intrinsically. We implemented Bayesian model selection tests and compared the statistical support for four conflicting ToL hypotheses. We show that reconstructions obtained with a Bayesian implementation (Klopfstein et al., 2015) are consistent with reconstructions obtained with an empirical Sankoff parsimony (ESP) implementation (Harish et al., 2013). Both are based on the genome contents of coding sequences for protein domains (superfamilies) from hundreds of genomes. Thus, we conclude that the independent descent of Eukaryotes and Akaryotes (archaea and bacteria) from the universal common ancestor (UCA) is the most probable as well as the most parsimonious hypothesis for the evolutionary origins of extant genomes. Reconstructions of ancestral proteomes by both Bayesian and ESP methods suggest that at least 70% of unique domain-superfamilies known in extant species were present in the UCA. In addition, identification of a vast majority (96%) of the mitochondrial superfamilies in the UCA proteome precludes a symbiotic hypothesis for the origin of eukaryotes. Accordingly, neither the archaeal origin of eukaryotes nor the bacterial origin of mitochondria is supported by the data. The proteomic complexity of the UCA suggests that the evolution of cellular phenotypes in the two primordial lineages, Akaryotes and Eukaryotes, was driven largely by duplication of common superfamilies as well as by loss of unique superfamilies. Finally, innovation of novel superfamilies has played a surprisingly small role in the evolution of Akaryotes and only a marginal role in the evolution of Eukaryotes.
Affiliation(s)
- Ajith Harish
- Department of Cell and Molecular Biology, Structural and Molecular Biology Program, Uppsala University, Uppsala, Sweden.
- Charles G Kurland
- Department of Biology, Microbial Ecology Program, Lund University, Lund, Sweden.
5
Abstract
Receptor tyrosine kinases (RTKs) are transmembrane proteins involved in the control of fundamental cellular processes in metazoans. RTKs possess a general structure that includes an extracellular domain, a transmembrane domain and a highly conserved tyrosine kinase domain. RTKs are classified according to their variable extracellular ligand-binding domain. Studies of human RTK members have yielded a wealth of information elucidating their importance. Improper functioning of these enzymes due to mutations, mainly in the kinase domain, is often manifested in various human diseases and is known to be involved in several types of cancer. Here we summarize most human RTKs, their cognate ligands, and related diseases, and discuss the possible use of certain RTKs as new therapeutic targets.
Affiliation(s)
- Mouna Choura
- Molecular and Cellular Diagnosis Processes, Centre of Biotechnology of Sfax, University of Sfax, Route Sidi Mansour, Sfax, Tunisia
6
Nagy A, Patthy L. Reassessing domain architecture evolution of metazoan proteins: the contribution of different evolutionary mechanisms. Genes (Basel) 2011; 2:578-98. [PMID: 24710211 PMCID: PMC3927616 DOI: 10.3390/genes2030578] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 06/30/2011] [Revised: 07/13/2011] [Accepted: 08/02/2011] [Indexed: 11/16/2022] Open
Abstract
In the accompanying papers we have shown that sequence errors of public databases and confusion of paralogs and epaktologs (proteins that are related only through the independent acquisition of the same domain types) significantly distort the picture that emerges from comparison of the domain architecture (DA) of multidomain Metazoan proteins since they introduce a strong bias in favor of terminal over internal DA change. The issue of whether terminal or internal DA changes occur with greater probability has very important implications for the DA evolution of multidomain proteins since gene fusion can add domains only at terminal positions, whereas domain-shuffling is capable of inserting domains both at internal and terminal positions. As a corollary, overestimation of terminal DA changes may be misinterpreted as evidence for a dominant role of gene fusion in DA evolution. In this manuscript we show that in several recent studies of DA evolution of Metazoa the authors used databases that are significantly contaminated with incomplete, abnormal and mispredicted sequences (e.g., UniProtKB/TrEMBL, EnsEMBL) and/or the authors failed to separate paralogs and epaktologs, explaining why these studies concluded that the major mechanism for gains of new domains in metazoan proteins is gene fusion. In contrast with the latter conclusion, our studies on high quality orthologous and paralogous Swiss-Prot sequences confirm that shuffling of mobile domains had a major role in the evolution of multidomain proteins of Metazoa and especially those formed in early vertebrates.
Affiliation(s)
- Alinda Nagy
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest H-1113, Hungary.
- Laszlo Patthy
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest H-1113, Hungary.
7
Sawada R, Mitaku S. How are exons encoding transmembrane sequences distributed in the exon-intron structure of genes? Genes Cells 2010; 16:115-21. [PMID: 21143351 DOI: 10.1111/j.1365-2443.2010.01468.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
The exon-intron structure of eukaryotic genes raises a question about the distribution of transmembrane regions in membrane proteins. Were exons that encode transmembrane regions formed simply by inserting introns into preexisting genes or by some kind of exon shuffling? To answer this question, the exon-per-gene distribution was analyzed for all genes in 40 eukaryotic genomes with a particular focus on exons encoding transmembrane segments. In 21 higher multicellular eukaryotes, the percentage of multi-exon genes (those containing at least one intron) within all genes in a genome was high (>70%) and with a mean of 87%. When genes were grouped by the number of exons per gene in higher eukaryotes, good exponential distributions were obtained not only for all genes but also for the exons encoding transmembrane segments, leading to a constant ratio of membrane proteins independent of the exon-per-gene number. The positional distribution of transmembrane regions in single-pass membrane proteins showed that they are generally located in the amino or carboxyl terminal regions. This nonrandom distribution of transmembrane regions explains the constant ratio of membrane proteins to the exon-per-gene numbers because there are always two terminal (i.e., the amino and carboxyl) regions - independent of the length of sequences.
Affiliation(s)
- Ryusuke Sawada
- Department of Computational Science and Engineering, Graduate School of Engineering, Nagoya University, Furocho, Chikusa-ku, Nagoya 464-8606, Japan.
8
Egesten A, Frick IM, Mörgelin M, Olin AI, Björck L. Binding of albumin promotes bacterial survival at the epithelial surface. J Biol Chem 2010; 286:2469-76. [PMID: 21098039 DOI: 10.1074/jbc.m110.148171] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/06/2022] Open
Abstract
Human serum albumin (HSA) is the dominating protein in human plasma. Many bacterial species, especially streptococci, express surface proteins that bind HSA with high specificity and affinity, but the biological consequences of these protein-protein interactions are poorly understood. Group G streptococci (GGS), carrying the HSA-binding protein G, colonize the skin and the mucosa of the upper respiratory tract, mostly without causing disease. In the case of bacterial invasion, pro-inflammatory cytokines are released that activate the epithelium to produce antibacterial peptides, in particular the chemokine MIG/CXCL9. In addition, the inflammation causes capillary leakage and extravasation of HSA and other plasma proteins, environmental changes at the epithelial surface to which the bacteria need to respond. In this study, we found that GGS adsorbed HSA from both saliva and plasma via binding to protein G and that HSA bound to protein G bound and inactivated the antibacterial MIG/CXCL9 peptide. Another surface protein of GGS, FOG, was found to mediate adherence of the bacteria to pharyngeal epithelial cells through interaction with glycosaminoglycans. This adherence was not affected by activation of the epithelium with a combination of IFN-γ and TNF-α, leading to the production of MIG/CXCL9. However, at the activated epithelial surface, adherent GGS were protected against killing by MIG/CXCL9 through protein G-dependent HSA coating. The findings identify a previously unknown bacterial survival strategy that helps to explain the evolution of HSA-binding proteins among bacterial species of the normal human microbiota.
Affiliation(s)
- Arne Egesten
- Section for Respiratory Medicine and Allergology, Department of Clinical Sciences, Lund University and Lund University Hospital, SE-221 85 Lund, Sweden.
9
Betat H, Rammelt C, Mörl M. tRNA nucleotidyltransferases: ancient catalysts with an unusual mechanism of polymerization. Cell Mol Life Sci 2010; 67:1447-63. [PMID: 20155482 PMCID: PMC11115931 DOI: 10.1007/s00018-010-0271-4] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 11/16/2009] [Revised: 12/14/2009] [Accepted: 01/14/2010] [Indexed: 10/19/2022]
Abstract
RNA polymerases are important enzymes involved in the realization of the genetic information encoded in the genome. Thereby, DNA sequences are used as templates to synthesize all types of RNA. Besides these classical polymerases, there exists another group of RNA polymerizing enzymes that do not depend on nucleic acid templates. Among those, tRNA nucleotidyltransferases show remarkable and unique features. These enzymes add the nucleotide triplet C-C-A to the 3'-end of tRNAs with astonishing fidelity and are described as "CCA-adding enzymes". During this incorporation of exactly three nucleotides, the enzymes have to switch from CTP to ATP specificity. How these tasks are fulfilled by rather simple and small enzymes without the help of a nucleic acid template is a fascinating research area. Surprising results of biochemical and structural studies allow scientists to understand at least some of the mechanistic principles of the unique polymerization mode of these highly unusual enzymes.
Affiliation(s)
- Heike Betat
- Institute for Biochemistry, University of Leipzig, Brüderstr. 34, 04103 Leipzig, Germany
- Christiane Rammelt
- Institute for Biochemistry, Martin Luther University Halle-Wittenberg, Kurt-Mothes-Str. 3, 06120 Halle, Germany
- Mario Mörl
- Institute for Biochemistry, University of Leipzig, Brüderstr. 34, 04103 Leipzig, Germany
10
De Kee DW, Gopalan V, Stoltzfus A. A Sequence-Based Model Accounts Largely for the Relationship of Intron Positions to Protein Structural Features. Mol Biol Evol 2007; 24:2158-68. [PMID: 17646255 DOI: 10.1093/molbev/msm151] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022] Open
Abstract
Claims of intron-structure correlations have played a major role in debates surrounding split gene origins. In the formative (as opposed to disruptive or "insertional") model of split gene origins, introns represent the scars of chimaeric gene assembly. When analyzed retrospectively, formative introns should tend to fall between modular units, if such units exist, or at least to exhibit a preference for sites favorable to chimaera formation. However, there is another possible source of preferences: under a disruptive model of split gene origins, fortuitous intron-structure correlations may arise because the gain of introns is biased with respect to flanking nucleotide sequences. To investigate the extent to which a sequence-biased intron gain model may account for the present-day distribution of introns, data on over 10,000 introns in eukaryotic protein-coding genes were integrated with structural data from a set of 1,851 nonredundant protein chains. The positions of introns with respect to secondary structures, solvent accessibility, and so-called "modules" were evaluated relative to the expectations of a null model, a disruptive model based on amino acid frequencies at splice junctions, and a formative model defined relative to these. The null model can be excluded for most structural features and is highly improbable when intron sites are grouped by reading frame phase. Phase-dependent correlations with secondary structure and side-chain surface accessibility are particularly strong. However, these phase-dependent correlations are explained largely by the sequence-based disruptive model.
Affiliation(s)
- Danny W De Kee
- Center for Advanced Research in Biotechnology, Rockville, MD, USA
11
Ledger TN, Jaubert S, Bosselut N, Abad P, Rosso MN. Characterization of a new β-1,4-endoglucanase gene from the root-knot nematode Meloidogyne incognita and evolutionary scheme for phytonematode family 5 glycosyl hydrolases. Gene 2006; 382:121-8. [PMID: 16962258 DOI: 10.1016/j.gene.2006.06.023] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Received: 05/02/2006] [Revised: 06/15/2006] [Accepted: 06/19/2006] [Indexed: 11/28/2022]
Abstract
Cellulases from plant parasitic nematodes are encoded by multiple gene families and are thought to originate from horizontal gene transfer. Unraveling the evolution of these genes in the phylum will help in understanding the evolution of plant parasitism in nematodes. Here we describe a new gene, named MI-eng-2, that encodes a family 5 glycosyl hydrolase (GHF5) with a predicted signal peptide and devoid of linker domain and cellulose-binding domain. The beta-1,4-endoglucanase activity of the protein MI-ENG-2 was confirmed in vitro and the transcription of the gene was localized in the secretory oesophageal glands of infective juveniles, suggesting that MI-ENG-2 is involved in plant cell wall degradation during parasitism. Phylogenetic and exon/intron structure analyses of beta-1,4-endoglucanase genes in the order Tylenchida strengthen the hypothesis that nematode GHF5 genes result from horizontal gene transfer of a bacterial gene with a cellulose-binding domain. GHF5 gene families in Tylenchida result from gene duplications associated with occasional loss of the cellulose-binding domain and the linker domain during their evolution.
Affiliation(s)
- Terence Neil Ledger
- INRA-CNRS-UNSA, Plant-Microbe Interactions and Plant Health, 400, route des Chappes, BP 167, Sophia Antipolis 06903, cedex, France
12
Vibranovski MD, Sakabe NJ, de Oliveira RS, de Souza SJ. Signs of ancient and modern exon-shuffling are correlated to the distribution of ancient and modern domains along proteins. J Mol Evol 2005; 61:341-50. [PMID: 16034650 DOI: 10.1007/s00239-004-0318-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 11/10/2004] [Accepted: 03/11/2005] [Indexed: 11/24/2022]
Abstract
Exon-shuffling is an important mechanism accounting for the origin of many new proteins in eukaryotes. However, its role in the creation of proteins in the ancestor of prokaryotes and eukaryotes is still debatable. Excess of symmetric exons is thought to represent evidence for exon-shuffling since the exchange of exons flanked by introns of the same phase does not disrupt the reading frame of the host gene. In this report, we found that there is a significant correlation between symmetric units of shuffling and the age of protein domains. Ancient domains, present in both prokaryotes and eukaryotes, are more frequently bounded by phase 0 introns and their distribution is biased towards the central part of proteins. Modern domains are more frequently bounded by phase 1 introns and are present predominantly at the ends of proteins. We propose a model in which shuffling of ancient domains mainly flanked by phase 0 introns was important in the ancestor of eukaryotes and prokaryotes, during the creation of the central part of proteins. Shuffling of modern domains, predominantly flanked by phase 1 introns, accounted for the origin of the extremities of proteins during eukaryotic evolution.
13
Haenisch C, Diekmann H, Klinger M, Gennarini G, Kuwada JY, Stuermer CAO. The neuronal growth and regeneration associated Cntn1 (F3/F11/Contactin) gene is duplicated in fish: expression during development and retinal axon regeneration. Mol Cell Neurosci 2005; 28:361-74. [PMID: 15691716 DOI: 10.1016/j.mcn.2004.04.013] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 01/15/2004] [Revised: 04/05/2004] [Accepted: 04/08/2004] [Indexed: 01/06/2023] Open
Abstract
The Cntn1 (Contactin/F3/F11) cell adhesion molecule is involved in axon growth and guidance, fasciculation, synapse formation, and myelination in birds and mammals. We identified Cntn1 genes in goldfish, zebrafish, and fugu, and provide evidence for a fish-specific duplication leading to Cntn1a and Cntn1b. Our analyses suggest a subfunctionalization for the Cntn1 paralogs in zebrafish compared to other vertebrates which have a single Cntn1 gene. Similar to Cntn1a, Cntn1b transcripts are found in subsets of sensory and motor neurons. However, Cntn1b is detected later and more restricted than Cntn1a. This spatio-temporal expression pattern of the two zebrafish Cntn1 paralogs suggests functions related to those of mammalian Cntn1. In adult goldfish, Cntn1b is expressed in oligodendrocytes and is upregulated in retinal ganglion cells after optic nerve transection, which is consistent with an additional role during regeneration.
15
Bányai L, Patthy L. Evidence that human genes of modular proteins have retained significantly more ancestral introns than their fly or worm orthologues. FEBS Lett 2004; 565:127-32. [PMID: 15135065 DOI: 10.1016/j.febslet.2004.03.088] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 02/13/2004] [Revised: 03/25/2004] [Accepted: 03/26/2004] [Indexed: 11/19/2022]
Abstract
Comparison of the exon-intron structures of human, fly and worm orthologues of mosaic genes assembled from class 1-1 modules by exon-shuffling has revealed that human genes retained significantly more of the original inter-module introns than their protostome orthologues. It is suggested that the much higher rate of intron loss in the worm and insect lineages than in the chordate lineage reflects their greater tendency for genome compaction.
Affiliation(s)
- László Bányai
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, P.O. Box 7, H-1518 Budapest, Hungary
16
Nagy I, Trexler M, Patthy L. Expression and characterization of the olfactomedin domain of human myocilin. Biochem Biophys Res Commun 2003; 302:554-61. [PMID: 12615070 DOI: 10.1016/s0006-291x(03)00198-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Abstract
The olfactomedin-domain was first identified in olfactomedin, an extracellular matrix protein of the olfactory neuroepithelium. Members of this extracellular domain-family have since been shown to be present in several metazoan proteins, such as latrophilins, myocilins, and noelins, but their biological function is unknown. The olfactomedin-domain of myocilin is of considerable interest, since mutations affecting this domain are associated with primary open angle glaucoma. In order to define structural features of this domain-type we have expressed the olfactomedin-domain of human myocilin in Pichia pastoris. The olfactomedin-domain contains a single disulphide-bond connecting Cys-245 and Cys-433 residues; secondary structure predictions and circular dichroism studies indicate that it consists primarily of beta-strands. It is noteworthy that the majority of mutations associated with severe forms of glaucoma affect residues that reside in conserved secondary structural elements of the olfactomedin-domain or are otherwise critical for the integrity of this protein-fold.
Affiliation(s)
- Ildikó Nagy
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, P.O. Box 7, H-1518 Budapest, Hungary
18
Johansson MU, Frick IM, Nilsson H, Kraulis PJ, Hober S, Jonasson P, Linhult M, Nygren PA, Uhlén M, Björck L, Drakenberg T, Forsén S, Wikström M. Structure, specificity, and mode of interaction for bacterial albumin-binding modules. J Biol Chem 2002; 277:8114-20. [PMID: 11751858 DOI: 10.1074/jbc.m109943200] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.5] [Indexed: 11/06/2022] Open
Abstract
We have determined the solution structure of an albumin binding domain of protein G, a surface protein of group C and G streptococci. We find that it folds into a left handed three-helix bundle similar to the albumin binding domain of protein PAB from Peptostreptococcus magnus. The two domains share 59% sequence identity, are thermally very stable, and bind to the same site on human serum albumin. The albumin binding site, the first determined for this structural motif known as the GA module, comprises residues spanning the first loop to the beginning of the third helix and includes the most conserved region of GA modules. The two GA modules have different affinities for albumin from different species, and their albumin binding patterns correspond directly to the host specificity of C/G streptococci and P. magnus, respectively. These studies of the evolution, structure, and binding properties of the GA module emphasize the power of bacterial adaptation and underline ecological and medical problems connected with the use of antibiotics.
Affiliation(s)
- Maria U Johansson
- Department of Biophysical Chemistry, Lund University, P.O. Box 124, SE-221 00 Lund, Sweden.
19
Abstract
Much progress in understanding the evolution of new genes has been accomplished in the past few years. Molecular mechanisms such as illegitimate recombination and LINE element mediated 3' transduction underlying exon shuffling, a major process for generating new genes, are better understood. The identification of young genes in invertebrates and vertebrates has revealed a significant role of adaptive evolution acting on initially rudimentary gene structures created as if by evolutionary tinkers. New genes in humans and our primate relatives add a new component to the understanding of genetic divergence between humans and non-humans.
Affiliation(s)
- M Long
- Department of Ecology and Evolution, The University of Chicago, 1101 East 57th Street, Chicago, Illinois 60637, USA.
20
Abstract
Evolution of eukaryotes is mediated by sexual recombination of parental genomes. Crossovers occur in random, but homologous, positions at a frequency that depends on DNA length. As exons occupy only 1% of the human genome and introns about 24%, by far most of the crossovers occur between exons, rather than inside. The natural process of creating new combinations of exons by intronic recombination is called exon shuffling. Our group is developing in vitro formats for exon shuffling and applying these to the directed evolution of proteins. Based on the splice frame junctions, nine classes of exons and three classes of introns can be distinguished. Splice frame diagrams of natural genes show how the splice frame rules govern exon shuffling. Here, we review various approaches to constructing libraries of exon-shuffled genes. For example, exon shuffling of human pharmaceutical proteins can generate libraries in which all of the sequences are fully human, without the point mutations that raise concerns about immunogenicity.
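The count of nine exon classes and three intron classes follows from the three possible intron phases. A minimal Python sketch, assuming that an exon class is the pair of phases of its 5' and 3' flanking introns; the names below are hypothetical and the insertion rule is the standard reading-frame argument, not a procedure quoted from the review:

```python
from itertools import product

# Three intron classes: phase 0, 1 or 2 (position of the intron within a codon).
INTRON_PHASES = (0, 1, 2)


def exon_classes():
    """All (5' phase, 3' phase) pairs: 3 x 3 = 9 exon classes."""
    return list(product(INTRON_PHASES, repeat=2))


def symmetric_classes():
    """Exon classes flanked by introns of the same phase."""
    return [c for c in exon_classes() if c[0] == c[1]]


def can_insert(exon_class, recipient_intron_phase):
    """An exon can be inserted into an intron without shifting the reading
    frame only if both of its flanking phases match the recipient intron."""
    five_prime, three_prime = exon_class
    return five_prime == three_prime == recipient_intron_phase


print(len(exon_classes()))      # 9
print(symmetric_classes())      # [(0, 0), (1, 1), (2, 2)]
print(can_insert((1, 1), 1))    # True
print(can_insert((1, 1), 0))    # False
```

Under this rule only the three symmetric classes can be shuffled into an intron, which is why libraries of exon-shuffled genes have to respect the splice frame of each junction.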
Affiliation(s)
- J A Kolkman
- Maxygen Inc., 515 Galveston Drive, Redwood City, CA 94063, USA
21
Long M, Rosenberg C. Testing the "proto-splice sites" model of intron origin: evidence from analysis of intron phase correlations. Mol Biol Evol 2000; 17:1789-96. [PMID: 11110894 DOI: 10.1093/oxfordjournals.molbev.a026279] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022] Open
Abstract
A few nucleotide sites of nuclear exons that flank introns are often conserved. A hypothesis has suggested that these sites, called "proto-splice sites," are remnants of recognition signals for the insertion of introns in the early evolution of eukaryotic genes. This notion of proto-splice sites has been an important basis for the insertional theory of introns. This hypothesis predicts that the distribution of proto-splice sites would determine the distribution of intron phases, because the positions of introns are just a subset of the proto-splice sites. We previously tested this prediction by examining the proportions of the phases of proto-splice sites, revealing nothing in these proportion distributions similar to observed proportions of intron phases. Here, we provide a second independent test of the proto-splice site hypothesis, with regard to its prediction that the proto-splice sites would mimic intron phase correlations, using a CDS database we created from GenBank. We tested four hypothetical proto-splice sites G / G, AG / G, AG / GT, and C/AAG / R. Interestingly, while G / G and AG / GT site phase distributions are not consistent with actual introns, we observed that AG / G and C/AAG / R sites have a symmetric phase excess. However, the patterns of the excess are quite different from the actual intron phase distribution. In addition, particular amino acid repeats in proteins were found to partially contribute to the excess of symmetry at these two types of sites. The phase associations of all four sites are significantly different from those of intron phases. Furthermore, a general model of intron insertion into proto-splice sites was simulated by Monte Carlo simulation to investigate the probability that the random insertion of introns into AG / G and C/AAG / R sites could generate the observed intron phase distribution. 
The simulation showed that (1) no observed correlation of intron phases was statistically consistent with the phase distribution of proto-splice sites in the simulated virtual genes; (2) most conservatively, no simulation in 10,000 Monte Carlo experiments gave a pattern with an excess of symmetric (1, 1) exons larger than those of (0, 0) and (2, 2), a major statistical feature of intron phase distribution that is consistent with the directly observed cases of exon shuffling. Thus, these results reject the null hypothesis that introns are randomly inserted into preexisting proto-splice sites, as suggested by the insertional theory of introns.
Affiliation(s)
- M Long
- Department of Ecology and Evolution, University of Chicago, Illinois 60637, USA.
22
Bowles J, Schepers G, Koopman P. Phylogeny of the SOX family of developmental transcription factors based on sequence and structural indicators. Dev Biol 2000; 227:239-55. [PMID: 11071752 DOI: 10.1006/dbio.2000.9883] [Citation(s) in RCA: 694] [Impact Index Per Article: 28.9] [Indexed: 12/23/2022]
Abstract
Members of the SOX family of transcription factors are found throughout the animal kingdom, are characterized by the presence of a DNA-binding HMG domain, and are involved in a diverse range of developmental processes. Previous attempts to group SOX genes and deduce their structural, functional, and evolutionary relationships have relied largely on complete or partial HMG box sequence of a limited number of genes. In this study, we have used complete HMG domain sequence, full-length protein structure, and gene organization data to study the pattern of evolution within the family. For the first time, a substantial number of invertebrate SOX sequences have been included in the analysis. We find support for subdivision of the family into groups A-H, as has been suggested in some previous studies, and for the assignment of two new groups, I and J. For vertebrate genes, it appears that relatedness as suggested by HMG domain sequence is congruent with relatedness as indicated by overall structure of the full-length protein and intron-exon structure of the genes. Most of the SOX groups identified in vertebrates were represented by a single SOX sequence in each invertebrate species studied. We have named anonymous sequences and, where appropriate, have suggested systematic names for some previously identified sequences. In addition, we identify an HMG domain signature motif which may be considered representative of the SOX family. Based on our data, we propose a robust phylogeny of SOX genes that reflects their evolutionary history in metazoans.
Affiliation(s)
- J Bowles
- Institute for Molecular Bioscience, University of Queensland, Brisbane, 4072, Australia
23
Robson P, Wright GM, Youson JH, Keeley FW. The structure and organization of lamprin genes: multiple-copy genes with alternative splicing and convergent evolution with insect structural proteins. Mol Biol Evol 2000; 17:1739-52. [PMID: 11070061 DOI: 10.1093/oxfordjournals.molbev.a026272] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022] Open
Abstract
Lamprin is a unique structural protein which forms the extracellular matrix of several cartilaginous structures found in the lamprey. Lamprin is noncollagenous in nature but shows sequence similarities to elastins and to insect structural proteins. Here, we characterize the structure and organization of lamprin genes, demonstrating the presence of multiple similar but not identical copies of the lamprin gene in the genome of the lamprey. In at least one species of lamprey, Lampetra richardsoni, the multiple gene copies are arranged in tandem in the genome in a head-to-tail orientation. Lamprin genes from Petromyzon marinus contain either seven or eight exons, with exon 4 being alternatively spliced in all genes, resulting in a total of six different lamprin transcripts. All exon junctions are of class 1,1. An unusual feature of the lamprin gene structure is the distribution of the 3' untranslated region sequence among multiple exons. A TATA box and cap sequence have been identified in upstream sequences in close proximity to the transcription start site, but no CAAT box could be identified. Sequence and gene structure comparisons between lamprins, elastins, and insect structural proteins suggest that the regions of sequence similarity are the result of a process of convergent evolution.
Affiliation(s)
- P Robson
- Division of Cardiovascular Research, Hospital for Sick Children and Department of Biochemistry, University of Toronto, Toronto, Canada
24
Stenflo J, Stenberg Y, Muranyi A. Calcium-binding EGF-like modules in coagulation proteinases: function of the calcium ion in module interactions. Biochim Biophys Acta 2000; 1477:51-63. [PMID: 10708848 DOI: 10.1016/s0167-4838(99)00262-9] [Citation(s) in RCA: 110] [Impact Index Per Article: 4.6] [Indexed: 11/19/2022]
Abstract
Epidermal growth factor (EGF)-like modules are involved in protein-protein interactions and are found in numerous extracellular proteins and membrane proteins. Among these proteins are enzymes involved in blood coagulation, fibrinolysis and the complement system as well as matrix proteins and cell surface receptors such as the EGF precursor, the low density lipoprotein receptor and the developmentally important receptor, Notch. The coagulation enzymes, factors VII, IX and X and protein C, all have two EGF-like modules, whereas the cofactor of activated protein C, protein S, has four EGF-like modules in tandem. Certain of the cell surface receptors have numerous EGF modules in tandem. A subset of EGF modules bind one Ca2+. The Ca2+-binding sequence motif is coupled to a sequence motif that brings about beta-hydroxylation of a particular Asp/Asn residue. Ca2+-binding to an EGF module is important to orient neighboring modules relative to each other in a manner that is required for biological activity. The Ca2+ affinity of an EGF module is often influenced by its N-terminal neighbor, be it another EGF module or a module of another type. This can result in an increase in Ca2+ affinity of several orders of magnitude. Point mutations in EGF modules that involve amino acids which are Ca2+ ligands result in the biosynthesis of biologically inactive proteins. Such mutations have been identified, for instance, in factor IX, causing hemophilia B, in fibrillin, causing Marfan syndrome, and in the low density lipoprotein receptor, causing hypercholesterolemia. In this review the emphasis will be on the coagulation factors.
Affiliation(s)
- J Stenflo
- Department of Clinical Chemistry, University of Lund, University Hospital, Malmö, SE-205 02, Malmö, Sweden.
25
Abstract
We present here HOBACGEN, a database system devoted to comparative genomics in bacteria. HOBACGEN contains all available protein genes from bacteria, archaea, and yeast, taken from SWISS-PROT/TrEMBL and classified into families. It also includes multiple alignments and phylogenetic trees built from these families. The database is organized under a client/server architecture with a client written in Java, which may run on any platform. This client integrates a graphical interface allowing users to select families according to various criteria and notably to select homologs common to a given set of taxa. This interface also allows users to visualize multiple alignments and trees associated to families. In tree displays, protein gene names are colored according to the taxonomy of the corresponding organisms. Users may access all information associated to sequences and multiple alignments by clicking on genes. This graphic tool thus gives a rapid and simple access to all data required to interpret homology relationships between genes and distinguish orthologs from paralogs. Instructions for installation of the client or the server are available at http://pbil.univ-lyon1.fr/databases/hobacgen.html.
Affiliation(s)
- G Perrière
- Laboratoire de Biométrie et Biologie Evolutive, Unité Mixte de Recherche Centre National de la Recherche Scientifique (UMR CNRS) n° 5558, Université Claude Bernard-Lyon 1, 69622 Villeurbanne Cedex, France.
26
Nishizawa K, Nishizawa M, Kim KS. Tendency for local repetitiveness in amino acid usages in modern proteins. J Mol Biol 1999; 294:937-53. [PMID: 10588898 DOI: 10.1006/jmbi.1999.3275] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
Systematic analyses of human proteins show that neural and immune system-specific, and therefore, relatively "modern" proteins have a tendency for repetitive use of amino acids at a local scale (approximately 1-20 residues), while ancient proteins (human homologues of Escherichia coli proteins) do not. Those protein subsegments which are unique based on homology search account for the repetitiveness. Simulation shows that such repetitiveness can be maintained by frequent duplication on a very short scale (one to two codons) in the presence of substitutive point mutation, while the latter tends to mitigate the repetitiveness. DNA analyses also show the presence of cryptic (i.e. "out of the codon frame") repetitiveness, which cannot fully be explained by features in protein sequences. Simulative modification of the amino acid sequences of immune system-specific proteins estimates that 2.4 duplication events occur during the period equivalent to ten events of substitution mutation. It is also suggested that the repetitiveness leads to longitudinal unevenness within a given peptide domain. Those peptide motifs which contain similarly charged residues are likely to be generated more frequently in the presence of the tendency for repetitiveness than in its absence. Therefore, the neutral propensity of DNA for duplication, which can also tend to generate repetitiveness in amino acid sequences, seems to be manifested primarily when the constraints on amino acid sequences are relatively weak, and yet may be positively contributing to generation of unevenness in modern proteins.
Affiliation(s)
- K Nishizawa
- Department of Biochemistry, Teikyo University School of Medicine, Kaga, Itabashi, Tokyo, 173, Japan.
27
Abstract
SPARC (secreted protein, acidic and rich in cysteine) is a unique matricellular glycoprotein that is expressed by many different types of cells and is associated with development, remodeling, cell turnover, and tissue repair. Its principal functions in vitro are counteradhesion and antiproliferation, which proceed via different signaling pathways. SPARC consists of three domains, each of which has independent activity and unique properties. The extracellular calcium binding module and the follistatin-like module have been recently crystallized. Specific interactions between SPARC and growth factors, extracellular matrix proteins, and cell surface proteins contribute to the diverse activities described for SPARC in vivo and in vitro. The location of SPARC in the nuclear matrix of certain proliferating cells, but only in the cytosol of postmitotic neurons, indicates potential functions of SPARC as a nuclear protein, which might be involved in the regulation of cell cycle progression and mitosis. High levels of SPARC have been found in adult eye, and SPARC-null mice exhibit cataracts at 1-2 months of age. This animal model provides an excellent opportunity to confirm and explore some of the properties of SPARC, to investigate cataractogenesis, and to study SPARC-related family proteins, e.g., SC1/hevin, a counteradhesive matricellular protein that might functionally compensate for SPARC in certain tissues. (J Histochem Cytochem 47:1495-1505, 1999)
Affiliation(s)
- Q Yan
- Department of Vascular Biology, Hope Heart Institute, Seattle, Washington 98122, USA
28
Nishizawa M, Nishizawa K. Local-scale repetitiveness in amino acid use in eukaryote protein sequences: A genomic factor in protein evolution. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(19991101)37:2<284::aid-prot13>3.0.co;2-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 11/07/2022]
29
Abstract
Recent studies on the genomes of protists, plants, fungi and animals confirm that the increase in genome size and gene number in different eukaryotic lineages is paralleled by a general decrease in genome compactness and an increase in the number and size of introns. It may thus be predicted that exon-shuffling has become increasingly significant with the evolution of larger, less compact genomes. To test the validity of this prediction, we have analyzed the evolutionary distribution of modular proteins that have clearly evolved by intronic recombination. The results of this analysis indicate that modular multidomain proteins produced by exon-shuffling are restricted in their evolutionary distribution. Although such proteins are present in all major groups of metazoa from sponges to chordates, there is practically no evidence for the presence of related modular proteins in other groups of eukaryotes. The biological significance of this difference in the composition of the proteomes of animals, fungi, plants and protists is best appreciated when these modular proteins are classified with respect to their biological function. The majority of these proteins can be assigned to functional categories that are inextricably linked to multicellularity of animals, and are of absolute importance in permitting animals to function in an integrated fashion: constituents of the extracellular matrix, proteases involved in tissue remodelling processes, various proteins of body fluids, membrane-associated proteins mediating cell-cell and cell-matrix interactions, membrane-associated receptor proteins regulating cell-cell communications, etc. Although some basic types of modular proteins seem to be shared by all major groups of metazoa, there are also groups of modular proteins that appear to be restricted to certain evolutionary lineages. In summary, the results suggest that exon-shuffling acquired major significance at the time of metazoan radiation.
It is interesting to note that the rise of exon-shuffling coincides with a spectacular burst of evolutionary creativity: the Big Bang of metazoan radiation. It seems probable that modular protein evolution by exon-shuffling has contributed significantly to this accelerated evolution of metazoa, since it facilitated the rapid construction of multidomain extracellular and cell surface proteins that are indispensable for multicellularity.
Affiliation(s)
- L Patthy
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest.
30
Bányai L, Patthy L. The NTR module: domains of netrins, secreted frizzled related proteins, and type I procollagen C-proteinase enhancer protein are homologous with tissue inhibitors of metalloproteases. Protein Sci 1999; 8:1636-42. [PMID: 10452607 PMCID: PMC2144412 DOI: 10.1110/ps.8.8.1636] [Citation(s) in RCA: 137] [Impact Index Per Article: 5.5] [Indexed: 10/21/2022]
Abstract
Using homology search, structure prediction, and structural characterization methods we show that the C-terminal domains of (1) netrins, (2) complement proteins C3, C4, C5, (3) secreted frizzled-related proteins, and (4) type I procollagen C-proteinase enhancer proteins (PCOLCEs) are homologous with the N-terminal domains of (5) tissue inhibitors of metalloproteinases (TIMPs). The proteins harboring this netrin module (NTR module) fulfill diverse biological roles ranging from axon guidance, regulation of Wnt signaling, to the control of the activity of metalloproteases. With the exception of TIMPs, it is not known at present what role the NTR modules play in these processes. In view of the fact that the NTR modules of TIMPs are involved in the inhibition of matrixin-type metalloproteases and that the NTR module of PCOLCEs is involved in the control of the activity of the astacin-type metalloprotease BMP1, it seems possible that interaction with metzincins could be a shared property of NTR modules and could be critical for the biological roles of the host proteins.
Affiliation(s)
- L Bányai
- Institute of Enzymology, Biological Research Center, Hungarian Academy of Sciences, Budapest
31
Fürst DO, Obermann WM, van der Ven PF. Structure and assembly of the sarcomeric M band. Rev Physiol Biochem Pharmacol 1999; 138:163-202. [PMID: 10396141 DOI: 10.1007/bfb0119627] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 02/01/2023]
Affiliation(s)
- D O Fürst
- Department of Cell Biology, University of Potsdam, Germany
32
Affiliation(s)
- D R Brigstock
- Department of Surgery, Ohio State University, Columbus 43210, USA.
33
34
Abstract
Sponges are the lowest extant metazoan phylum and for about a century they have been used as a model system to study cell adhesion. There are three classes of molecules in the extracellular matrix of vertebrates: collagens, proteoglycans, and adhesive glycoproteins, all of them have been identified in sponges. Species-specific cell recognition in sponges is mediated by supramolecular proteoglycan-like complexes termed aggregation factors, still to be identified in higher animals. Polyvalent glycosaminoglycan interactions are involved in the species-specificity, representing one of the few known examples of a regulatory role for carbohydrates. Aggregation factors mediate cell adhesion via a bifunctional activity that combines a calcium-dependent self-interaction of aggregation factor molecules plus a calcium-independent heterophilic interaction with cell surface receptors. Important cases of cell adhesion are the phenomena involved in histocompatibility reactions. A long-standing prediction has been that the evolutionary ancestors of histocompatibility systems might be found among primitive cell-cell interaction molecules. A surprising characteristic of sponges, considering their low phylogenetic position, is that they possess an exquisitely sophisticated histocompatibility system. Any grafting between two different sponge individuals (allograft) is almost invariably incompatible in the many species investigated, exhibiting a variety of transitive qualitatively and quantitatively different responses, which can only be explained by the existence of a highly polymorphic gene system. Individual variability of protein and glycan components in the aggregation factor of the red beard sponge, Microciona prolifera, matches the elevated sponge alloincompatibility, suggesting an involvement of the cell adhesion system in sponge allogeneic reactions and, therefore, an evolutionary relationship between cell adhesion and histocompatibility systems.
35
Valcarce C, Björk I, Stenflo J. The epidermal growth factor precursor. A calcium-binding, beta-hydroxyasparagine containing modular protein present on the surface of platelets. Eur J Biochem 1999; 260:200-7. [PMID: 10091600 DOI: 10.1046/j.1432-1327.1999.00156.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
Abstract
Various human body fluids and secretions contain a soluble form of the epidermal growth factor (EGF) precursor. The EGF precursor molecule contains eight EGF modules in addition to EGF itself. Using monoclonal antibodies specific for the EGF modules 7 and 8, we have purified the soluble form of the EGF precursor from human urine to homogeneity. The protein was shown to have a molecular mass of about 160 kDa and the N-terminal sequence SAPNHWSXPE. EGF modules 2, 7 and 8 of the precursor have the consensus sequence for post-translational beta-hydroxylation of Asp/Asn residues. We identified the presence of erythro-beta-hydroxy-aspartic acid (Hya) in acid hydrolysates of the EGF precursor (2.4 M.M protein-1). As the DNA sequence encodes Asn in the corresponding position, the Hya represents erythro-beta-hydroxyasparagine (Hyn). The Hyn-containing modules have a consensus calcium-binding motif immediately N-terminal of the first Cys residue. The synthetic EGF module 2 (residues 356-395) of the EGF precursor was found to bind calcium with low affinity, Kd approximately 3.5 mM, i.e. similar to the affinity of other isolated calcium-binding EGF modules. EGF module 7, when part of the intact protein, was found to bind Ca2+ with a Kd of approximately 0.2 µM, i.e. approximately 10⁴-fold higher than that of isolated EGF modules, presumably due to the influence of neighboring modules. We have detected EGF precursor in platelet-rich plasma and demonstrated it to be associated to platelets. The platelets were found to have 30-160 EGF molecules each.
Affiliation(s)
- C Valcarce
- Department of Clinical Chemistry, University of Lund, University Hospital, Malmö, Sweden
36
Fürst DO, Obermann WMJ, Ven PFM. Structure and assembly of the sarcomeric M Band. Rev Physiol Biochem Pharmacol 1999. [DOI: 10.1007/bf02346663] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
37
Teichmann SA, Park J, Chothia C. Structural assignments to the Mycoplasma genitalium proteins show extensive gene duplications and domain rearrangements. Proc Natl Acad Sci U S A 1998; 95:14658-63. [PMID: 9843945 PMCID: PMC24505 DOI: 10.1073/pnas.95.25.14658] [Citation(s) in RCA: 112] [Impact Index Per Article: 4.3] [Indexed: 11/18/2022] Open
Abstract
The parasitic bacterium Mycoplasma genitalium has a small, reduced genome with close to a basic set of genes. As a first step toward determining the families of protein domains that form the products of these genes, we have used the multiple sequence programs PSI-BLAST and GEANFAMMER to match the sequences of the 467 gene products of M. genitalium to the sequences of the domains that form proteins of known structure [Protein Data Bank (PDB) sequences]. PDB sequences (274) match all of 106 M. genitalium sequences and some parts of another 85; thus, 41% of its total sequences are matched in all or part. The evolutionary relationships of the PDB domains that match M. genitalium are described in the structural classification of proteins (SCOP) database. Using this information, we show that the domains in the matched M. genitalium sequences come from 114 superfamilies and that 58% of them have arisen by gene duplication. This level of duplication is more than twice that found by using pairwise sequence comparisons. The PDB domain matches also describe the domain structure of the matched sequences: just over a quarter contain one domain and the rest have combinations of two or more domains.
Affiliation(s)
- S A Teichmann
- Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 2QH, United Kingdom.
38
Abstract
Does the intron/exon structure of eukaryotic genes belie their ancient assembly by exon-shuffling or have introns been inserted into preformed genes during eukaryotic evolution? These are the central questions in the ongoing 'introns-early' versus 'introns-late' controversy. The phylogenetic distribution of spliceosomal introns continues to strongly favor the introns-late theory. The introns-early theory, however, has claimed support from intron phase and protein structure correlations.
Affiliation(s)
- J M Logsdon
- Department of Biochemistry, Dalhousie University, Halifax, Nova Scotia, B3H 4H7, Canada.
39
Affiliation(s)
- G D Schuler
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
40
Abstract
Since natural proteins are the products of a long evolutionary process, the structural properties of present-day proteins should depend not only on physico-chemical constraints, but also on evolutionary constraints. Here we propose a model for protein evolution, in which membranes play a key role as a scaffold for supporting the gradual evolution from flexible polypeptides to well-folded proteins. We suggest that the folding process of present-day globular proteins is a relic of this putative evolutionary process. To test the hypothesis that membranes once acted as a cradle for the folding of globular proteins, extensive research on membrane proteins and the interactions of globular proteins with membranes will be required.
Affiliation(s)
- N Doi
- Mitsubishi Kasei Institute of Life Sciences, Machida, Tokyo, Japan
41
Steiner F, Weber K, Fürst DO. Structure and expression of the gene encoding murine M-protein, a sarcomere-specific member of the immunoglobulin superfamily. Genomics 1998; 49:83-95. [PMID: 9570952 DOI: 10.1006/geno.1998.5220] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
The complete exon-intron organization of the murine gene encoding M-protein, a structural protein of sarcomeric myofibrils, was determined. The gene is composed of 37 exons and 36 introns, spanning approximately 75 kb of DNA. Intron positions are related to the modular structure of M-protein, which is composed essentially of immunoglobulin and fibronectin type III domains. Almost all repeats follow a two exon-one domain structure. The beginning and end of each domain are defined by introns in phase I; internal introns are more divergent in position and very rarely use phase I. A single transcriptional start point was detected in both skeletal and cardiac muscle. Analysis of the prospective promoter region revealed several potential regulatory elements. CAT expression assays using promoter deletion constructs identified three regions that seem to be most important for the muscle-specific transcription activation of the M-protein gene. These results provide the first complete characterization of a gene for a member of the intracellular branch of the immunoglobulin superfamily.
Affiliation(s)
- F Steiner
- Department of Biochemistry, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany
42
Intron-exon structures. ACTA ACUST UNITED AC 1998. [DOI: 10.1016/s1067-5701(98)80020-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
43
Plagge A, Brümmendorf T. The gene of the neural cell recognition molecule F11: conserved exon-intron arrangement in genes of neural members of the immunoglobulin superfamily. Gene 1997; 192:215-25. [PMID: 9224893 DOI: 10.1016/s0378-1119(97)00066-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 02/04/2023]
Abstract
The chicken neural glycoprotein F11 is a cell recognition molecule implicated in neurohistogenesis, in particular in the context of neurite outgrowth and fasciculation. F11 is a glycosyl-phosphatidylinositol-linked member of the immunoglobulin superfamily that is also termed contactin or F3 in humans and rodents, respectively. In this study, we report the complete structure of the F11 gene. It is composed of 23 exons distributed over more than 100 kb of genomic DNA and each of the ten domains of the F11 protein is encoded by two exons. The sizes of the introns vary by two orders of magnitude ranging from 150 bp to more than 15 kb. All interdomain introns are in phase one, i.e. are inserted after the first nucleotide of a codon, being consistent with assembly of a F11 progenitor gene via exon shuffling. The intradomain introns are localized at variable sites within the domains and have different intron phases. This study reveals a remarkable similarity of the F11 gene with the gene of axonin-1, a related neural immunoglobulin superfamily member which is also implicated in neurite outgrowth and fasciculation. The intron positions with respect to the protein domain organization are found to be identical, strongly suggesting that both genes are derived from a common ancestor that already had this exon-intron structure.
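The phase terminology used in this abstract can be illustrated with a minimal Python sketch. It assumes that the input lists only the coding nucleotides of consecutive exons, starting at the first base of the start codon; the function names and the example lengths are hypothetical and are not taken from the F11 gene.

```python
from itertools import accumulate


def intron_phases(exon_cds_lengths):
    """Phase of each intron, given coding lengths (nt) of consecutive exons.

    Phase 0: the intron lies between two codons.
    Phase 1: the intron follows the first nucleotide of a codon.
    Phase 2: the intron follows the second nucleotide of a codon.
    """
    cumulative = list(accumulate(exon_cds_lengths))
    # The last exon is not followed by an intron, so drop the final total.
    return [total % 3 for total in cumulative[:-1]]


def exon_classes(exon_cds_lengths):
    """(5' intron phase, 3' intron phase) for each internal exon."""
    phases = intron_phases(exon_cds_lengths)
    return list(zip(phases, phases[1:]))


# Hypothetical coding lengths, chosen only for illustration (not F11 data).
lengths = [100, 300, 150, 200]
print(intron_phases(lengths))  # [1, 1, 1]
print(exon_classes(lengths))   # [(1, 1), (1, 1)]
```

An internal exon flanked by two introns of the same phase (class 1,1 in this example) has a coding length divisible by three, so inserting or duplicating it does not shift the downstream reading frame.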
Affiliation(s)
- A Plagge
- Max-Planck-Institut für Entwicklungsbiologie, Tübingen, Germany
44
Fuchs MA, Buta C. The role of peptide modules in protein evolution. Biophys Chem 1997; 66:203-10. [PMID: 17029875 DOI: 10.1016/s0301-4622(97)00067-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Received: 04/24/1997] [Accepted: 04/24/1997] [Indexed: 11/18/2022]
Abstract
Protein evolution shows interesting strategies to be used in protein design. During evolution the creation of new proteins has been accomplished by combining different peptide modules, i.e. evolutionary successful stable folding units. Thereby, the evolution of proteins has been greatly enhanced. Today this mechanism of recombining optimized building blocks to design new proteins has been introduced into applied molecular evolution.
Affiliation(s)
- M A Fuchs
- Max-Planck-Institute for Biophysical Chemistry, D-37077 Göttingen, Germany
45
Logsdon JM, Doolittle WF. Origin of antifreeze protein genes: a cool tale in molecular evolution. Proc Natl Acad Sci U S A 1997; 94:3485-7. [PMID: 9108001 PMCID: PMC34156 DOI: 10.1073/pnas.94.8.3485] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.3] [Indexed: 02/04/2023] Open
Affiliation(s)
- J M Logsdon
- Department of Biochemistry and Canadian Institute for Advanced Research, Dalhousie University, Halifax, NS
46
Kobe B, Deisenhofer J. Mechanism of ribonuclease inhibition by ribonuclease inhibitor protein based on the crystal structure of its complex with ribonuclease A. J Mol Biol 1996; 264:1028-43. [PMID: 9000628 DOI: 10.1006/jmbi.1996.0694] [Citation(s) in RCA: 168] [Impact Index Per Article: 6.0] [Indexed: 02/03/2023]
Abstract
We describe the mechanism of ribonuclease inhibition by ribonuclease inhibitor, a protein built of leucine-rich repeats, based on the crystal structure of the complex between the inhibitor and ribonuclease A. The structure was determined by molecular replacement and refined to an Rcryst of 19.4% at 2.5 Å resolution. Ribonuclease A binds to the concave region of the inhibitor protein comprising its parallel beta-sheet and loops. The inhibitor covers the ribonuclease active site and directly contacts several active-site residues. The inhibitor only partially mimics the RNase-nucleotide interaction and does not utilize the p1 phosphate-binding pocket of ribonuclease A, where a sulfate ion remains bound. The 2550 Å² of accessible surface area buried upon complex formation may be one of the major contributors to the extremely tight association (Ki = 5.9 × 10⁻¹⁴ M). The interaction is predominantly electrostatic; there is a high chemical complementarity with 18 putative hydrogen bonds and salt links, but the shape complementarity is lower than in most other protein-protein complexes. Ribonuclease inhibitor changes its conformation upon complex formation; the conformational change is unusual in that it is a plastic reorganization of the entire structure without any obvious hinge and reflects the conformational flexibility of the structure of the inhibitor. There is a good agreement between the crystal structure and other biochemical studies of the interaction. The structure suggests that the conformational flexibility of RI and an unusually large contact area that compensates for a lower degree of complementarity may be the principal reasons for the ability of RI to potently inhibit diverse ribonucleases. However, the inhibition is lost with amphibian ribonucleases that have substituted most residues corresponding to inhibitor-binding residues in RNase A, and with bovine seminal ribonuclease that prevents inhibitor binding by forming a dimer.
Affiliation(s)
- B Kobe
- St. Vincent's Institute of Medical Research, Victoria, Australia
47
|
Drescher B, Spiess E, Schachner M, Probstmeier R. Structural analysis of the murine cell adhesion molecule L1 by electron microscopy and computer-assisted modelling. Eur J Neurosci 1996; 8:2467-78. [PMID: 8996796 DOI: 10.1111/j.1460-9568.1996.tb01541.x] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.5] [Indexed: 02/03/2023]
Abstract
In the present study we have analysed the morphology of two fragments with apparent molecular weights of 180 and 140 kDa (L1-180 and L1-140) derived from the extracellular region of the murine neural cell adhesion molecule L1. The fragment L1-180 consists of almost the entire extracellular part of the molecule, and is built up of six immunoglobulin-like and five fibronectin type III-like domains. Fragment L1-140 lacks one-half of the third, the fourth and the fifth fibronectin type III-like domains. By electron microscopic analysis of rotary-shadowed molecules, L1-140 and L1-180 revealed fibrillar structures 31-43 nm long and 7-12 nm wide with one pronounced globular terminal domain. As determined by complex formation with an L1 antibody, this terminal part of the molecule is formed by the fibronectin type III-like domains. The individual structures showed variation and complexity, and four distinct aspects were identified. These different forms probably represent two-dimensional projections of the same three-dimensional helical structure. Computer-assisted modelling of the L1 molecule, i.e. the protein backbone, showed no strong intramolecular interaction between the different fibronectin type III- or Ig-like domains, suggesting that the formation of the globular part of the molecule is probably achieved by protein-carbohydrate and/or carbohydrate-carbohydrate rather than protein-protein interactions. In addition, our model proposes that interactions occur within the interfaces between the different domains. The highly conserved amino acid residues in these regions point to the necessity of maintaining the orientation between the different domains.
Affiliation(s)
- B Drescher
- Molekulare Biophysik 1, Deutsches Krebsforschungszentrum, Heidelberg, Germany
48
|
Abstract
Thanks to recent improvements in techniques used for the detection of homologies, it is now clear that module exchange played a major role in protein evolution. Analysis of the genes of various modular proteins has identified a large number of cases where gene assembly was facilitated by intronic recombination, i.e., the proteins were formed by exon shuffling. Studies of the principles and mechanistic details of exon shuffling, however, revealed that this powerful evolutionary mechanism could become significant only after the appearance of spliceosomal introns typical of higher eukaryotes. Although exon shuffling is the most efficient way of constructing modular proteins, recent studies on the evolution of multidomain proteins of prokaryotes emphasize that intronic recombination is not an absolute prerequisite for module exchange.
Affiliation(s)
- L Patthy
- Institute of Enzymology, Hungarian Academy of Sciences, Budapest, Hungary
49
|
Joba W, Hoffmann W. Alternative splicing of repetitive units is responsible for the polydispersities of integumentary mucin B.1 (FIM-B.1) from Xenopus laevis. Glycoconj J 1996; 13:735-40. [PMID: 8910000 DOI: 10.1007/bf00702337] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 02/03/2023]
Abstract
Frog integumentary mucin B.1 (FIM-B.1) represents a polymorphic extracellular mosaic protein which contains tandemly arranged serine/threonine-rich modules as well as cysteine-rich domains. The latter are probably important for oligomerization of FIM-B.1 and have also been found in many proteins of the complement cascade as well as regions homologous to von Willebrand factor. The repetitive modules are targets for extensive O-glycosylation. Previous cDNA cloning experiments clearly established polydispersities within the same individual, which originate from deletions/insertions in the repetitive domain. Here, we analyse part of the corresponding genomic region. Each repetitive unit as well as the cysteine-rich domain is encoded by an individual class 1-1 exon typical of shuffled modules. Alternative splicing of these multiple cassettes creates the polydisperse FIM-B.1 transcripts.
Affiliation(s)
- W Joba
- Max-Planck-Institut für Psychiatrie, Abteilung Neurochemie, Martinsried, Germany
50
|
Bassuk JA, Braun LP, Motamed K, Baneyx F, Sage EH. Renaturation of SPARC expressed in Escherichia coli requires isomerization of disulfide bonds for recovery of biological activity. Int J Biochem Cell Biol 1996; 28:1031-43. [PMID: 8930126 DOI: 10.1016/1357-2725(96)00036-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Indexed: 02/03/2023]
Abstract
SPARC (secreted protein acidic and rich in cysteine, also known as osteonectin and BM-40) belongs to a group of secreted macromolecules that modulate cellular interactions with the extracellular matrix. During vertebrate embryogenesis, as well as in tissues undergoing remodeling and repair, the expression pattern of SPARC is consistent with a fundamental role for this protein in tissue morphogenesis and cellular differentiation. Human SPARC was cloned by the polymerase chain reaction from an endothelial cell cDNA library and was expressed in Escherichia coli as a biologically active protein. Two forms of recombinant SPARC (rSPARC) were recovered from BL21(DE3) cells after transformation with the plasmid pSPARCwt: a soluble, monomeric form that is biologically active (Bassuk et al., 1996, Archiv. Biochem. Biophys. 325, 8-19), and an insoluble form sequestered in inclusion bodies. Aggregated rSPARC was unfolded by urea treatment, purified by nickel-chelate affinity chromatography, and renatured by gradual removal of the denaturant. Proper isomerization of the disulfide bonds was achieved in the presence of a glutathione redox couple. After final purification by high resolution gel filtration chromatography, a monomeric form of rSPARC displaying biological activity was obtained. The recombinant protein inhibited the spreading and synthesis of DNA by endothelial cells, two properties characteristic of the native protein. We conclude that the information for the correct folding of rSPARC resides in the primary structure of the protein, and suggest that post-translational modifications are required neither for folding nor for biological activity.
Affiliation(s)
- J A Bassuk
- Department of Biological Structure, University of Washington, Seattle 98195, USA