Reference Citation Analysis

For: Patthy L. Exons--original building blocks of proteins? Bioessays 1991;13:187-92. [PMID: 1859398 DOI: 10.1002/bies.950130408] [Citation(s) in RCA: 54] [Impact Index Per Article: 1.6] [Indexed: 12/29/2022]

Cited by Other Article(s)

Patthy L. Exon Shuffling Played a Decisive Role in the Evolution of the Genetic Toolkit for the Multicellular Body Plan of Metazoa. Genes (Basel) 2021;12:382. [PMID: 33800339 PMCID: PMC8001218 DOI: 10.3390/genes12030382] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/24/2021] [Revised: 03/01/2021] [Accepted: 03/04/2021] [Indexed: 11/30/2022] Open

Varga J, Dobson L, Tusnády GE. TOPDOM: database of conservatively located domains and motifs in proteins. Bioinformatics 2016;32:2725-6. [PMID: 27153630 PMCID: PMC5013901 DOI: 10.1093/bioinformatics/btw193] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/16/2016] [Accepted: 04/04/2016] [Indexed: 11/14/2022] Open

Basu MK, Poliakov E, Rogozin IB. Domain mobility in proteins: functional and evolutionary implications. Brief Bioinform 2009;10:205-16. [PMID: 19151098 DOI: 10.1093/bib/bbn057] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Indexed: 11/14/2022] Open

Baertsch R, Diekhans M, Kent WJ, Haussler D, Brosius J. Retrocopy contributions to the evolution of the human genome. BMC Genomics 2008;9:466. [PMID: 18842134 PMCID: PMC2584115 DOI: 10.1186/1471-2164-9-466] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Received: 03/17/2008] [Accepted: 10/08/2008] [Indexed: 02/06/2023] Open

Abstract

Background

Evolution via point mutations is a relatively slow process and is unlikely to completely explain the differences between primates and other mammals. By contrast, 45% of the human genome is composed of retroposed elements, many of which were inserted in the primate lineage. A subset of retroposed mRNAs (retrocopies) shows strong evidence of expression in primates, often yielding functional retrogenes.

Results

To identify and analyze the relatively recently evolved retrogenes, we carried out BLASTZ alignments of all human mRNAs against the human genome and scored a set of features indicative of retroposition. Of over 12,000 putative retrocopy-derived genes that arose mainly in the primate lineage, 726 with strong evidence of transcript expression were examined in detail. These mRNA retroposition events fall into three categories: I) 34 retrocopies and antisense retrocopies that added potential protein coding space and UTRs to existing genes; II) 682 complete retrocopy duplications inserted into new loci; and III) an unexpected set of 13 retrocopies that contributed out-of-frame or antisense sequences in combination with other types of transposed elements (SINEs, LINEs, LTRs), and even unannotated sequence, to form potentially novel genes with no homologs outside primates. In addition to their presence in human, several of the gene candidates also had potentially viable ORFs in chimpanzee, orangutan, and rhesus macaque, underscoring their potential for function.

Conclusion

mRNA-derived retrocopies provide raw material for the evolution of genes in a wide variety of ways, duplicating and amending the protein coding region of existing genes as well as generating the potential for new protein coding space, or non-protein coding RNAs, by unexpected contributions out of frame, in reverse orientation, or from previously non-protein coding sequence.

Herpin A, Lelong C, Becker T, Favrel P, Cunningham C. A tolloid homologue from the Pacific oyster Crassostrea gigas. Gene Expr Patterns 2007;7:700-8. [PMID: 17433792 DOI: 10.1016/j.modgep.2007.03.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/18/2007] [Revised: 02/23/2007] [Accepted: 03/01/2007] [Indexed: 10/23/2022]

de Roos ADG. Conserved intron positions in ancient protein modules. Biol Direct 2007;2:7. [PMID: 17288589 PMCID: PMC1800838 DOI: 10.1186/1745-6150-2-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 02/05/2007] [Accepted: 02/08/2007] [Indexed: 12/31/2022] Open

Abstract

Background

The timing of the origin of introns is of crucial importance for an understanding of early genome architecture. The Exon theory of genes proposed a role for introns in the formation of multi-exon proteins by exon shuffling and predicts the presence of conserved splice sites in ancient genes. In this study, large-scale analysis of potential conserved splice sites was performed using an intron-exon database (ExInt) derived from GenBank.

Results

A set of conserved intron positions was found by matching identical splice site sequences from distantly related eukaryotic kingdoms. Most amino acid sequences with conserved introns were homologous to consensus sequences of functional domains from conserved proteins including kinases, phosphatases, small GTPases, transporters and matrix proteins. These included ancient proteins that originated before the eukaryote-prokaryote split, for instance the catalytic domain of protein phosphatase 2A where a total of eleven conserved introns were found. Using an experimental setup in which the relation between a splice site and the ancientness of its surrounding sequence could be studied, it was found that the presence of an intron was positively correlated to the ancientness of its surrounding sequence. Intron phase conservation was linked to the conservation of the gene sequence and not to the splice site sequence itself. However, no apparent differences in phase distribution were found between introns in conserved versus non-conserved sequences.

Conclusion

The data confirm an origin of introns deep in the eukaryotic branch and are in concordance with the presence of introns in the first functional protein modules in an 'Exon theory of genes' scenario. A model is proposed in which shuffling of primordial short exonic sequences led to the formation of the first functional protein modules, in line with hypotheses that see the formation of introns as integral to the origins of genome evolution.

Reviewers

This article was reviewed by Scott Roy (nominated by Anthony Poole), Sandro de Souza (nominated by Manyuan Long), and Gáspár Jékely.

Kim H, Sung S, Klein R. Expansion of symmetric exon-bordering domains does not explain evolution of lineage specific genes in mammals. Genetica 2006;131:59-68. [PMID: 17082903 DOI: 10.1007/s10709-006-9113-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/20/2006] [Accepted: 09/26/2006] [Indexed: 10/24/2022]

Chabasse C, Bailly X, Sanchez S, Rousselot M, Zal F. Gene structure and molecular phylogeny of the linker chains from the giant annelid hexagonal bilayer hemoglobins. J Mol Evol 2006;63:365-74. [PMID: 16838215 DOI: 10.1007/s00239-005-0198-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/17/2005] [Accepted: 03/31/2006] [Indexed: 10/24/2022]

Mason TA, McIlroy PJ, Shain DH. Structural model of an antistasin/notch-like fusion protein from the cocoon wall of the aquatic leech, Theromyzon tessulatum. J Mol Model 2006;12:829-34. [PMID: 16523290 DOI: 10.1007/s00894-006-0107-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 11/29/2005] [Accepted: 01/11/2006] [Indexed: 11/26/2022]

Benito-Gutiérrez E, Garcia-Fernàndez J, Comella JX. Origin and evolution of the Trk family of neurotrophic receptors. Mol Cell Neurosci 2005;31:179-92. [PMID: 16253518 DOI: 10.1016/j.mcn.2005.09.007] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.2] [Received: 07/08/2005] [Revised: 08/11/2005] [Accepted: 09/08/2005] [Indexed: 01/19/2023] Open

Rádis-Baptista G, Kubo T, Oguiura N, Prieto da Silva ARB, Hayashi MAF, Oliveira EB, Yamane T. Identification of crotasin, a crotamine-related gene of Crotalus durissus terrificus. Toxicon 2004;43:751-9. [PMID: 15284009 DOI: 10.1016/j.toxicon.2004.02.023] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 11/25/2003] [Accepted: 02/25/2004] [Indexed: 11/16/2022]

Brosius J. The contribution of RNAs and retroposition to evolutionary novelties. Contemporary Issues in Genetics and Evolution 2003. [DOI: 10.1007/978-94-010-0229-5_1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 12/24/2022]

Overall CM. Molecular determinants of metalloproteinase substrate specificity: matrix metalloproteinase substrate binding domains, modules, and exosites. Mol Biotechnol 2002;22:51-86. [PMID: 12353914 DOI: 10.1385/mb:22:1:051] [Citation(s) in RCA: 357] [Impact Index Per Article: 16.2] [Indexed: 11/11/2022]

Li Y, Baldauf S, Lim EK, Bowles DJ. Phylogenetic analysis of the UDP-glycosyltransferase multigene family of Arabidopsis thaliana. J Biol Chem 2001;276:4338-43. [PMID: 11042215 DOI: 10.1074/jbc.m007447200] [Citation(s) in RCA: 291] [Impact Index Per Article: 12.7] [Indexed: 11/06/2022] Open

Binzak BA, Vockley JG, Jenkins RB, Vockley J. Structure and analysis of the human dimethylglycine dehydrogenase gene. Mol Genet Metab 2000;69:181-7. [PMID: 10767172 DOI: 10.1006/mgme.2000.2980] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]

Patthy L. Genome evolution and the evolution of exon-shuffling--a review. Gene 1999;238:103-14. [PMID: 10570989 DOI: 10.1016/s0378-1119(99)00228-0] [Citation(s) in RCA: 331] [Impact Index Per Article: 13.2] [Indexed: 10/18/2022]

Abstract

Recent studies on the genomes of protists, plants, fungi and animals confirm that the increase in genome size and gene number in different eukaryotic lineages is paralleled by a general decrease in genome compactness and an increase in the number and size of introns. It may thus be predicted that exon-shuffling has become increasingly significant with the evolution of larger, less compact genomes. To test the validity of this prediction, we have analyzed the evolutionary distribution of modular proteins that have clearly evolved by intronic recombination. The results of this analysis indicate that modular multidomain proteins produced by exon-shuffling are restricted in their evolutionary distribution. Although such proteins are present in all major groups of metazoa from sponges to chordates, there is practically no evidence for the presence of related modular proteins in other groups of eukaryotes. The biological significance of this difference in the composition of the proteomes of animals, fungi, plants and protists is best appreciated when these modular proteins are classified with respect to their biological function. The majority of these proteins can be assigned to functional categories that are inextricably linked to multicellularity of animals, and are of absolute importance in permitting animals to function in an integrated fashion: constituents of the extracellular matrix, proteases involved in tissue remodelling processes, various proteins of body fluids, membrane-associated proteins mediating cell-cell and cell-matrix interactions, membrane-associated receptor proteins regulating cell-cell communication, etc. Although some basic types of modular proteins seem to be shared by all major groups of metazoa, there are also groups of modular proteins that appear to be restricted to certain evolutionary lineages. In summary, the results suggest that exon-shuffling acquired major significance at the time of metazoan radiation.
It is interesting to note that the rise of exon-shuffling coincides with a spectacular burst of evolutionary creativity: the Big Bang of metazoan radiation. It seems probable that modular protein evolution by exon-shuffling has contributed significantly to this accelerated evolution of metazoa, since it facilitated the rapid construction of multidomain extracellular and cell surface proteins that are indispensable for multicellularity.

Talts JF, Wirl G, Dictor M, Muller WJ, Fässler R. Tenascin-C modulates tumor stroma and monocyte/macrophage recruitment but not tumor growth or metastasis in a mouse strain with spontaneous mammary cancer. J Cell Sci 1999;112 (Pt 12):1855-64. [PMID: 10341205 DOI: 10.1242/jcs.112.12.1855] [Citation(s) in RCA: 59] [Impact Index Per Article: 2.4] [Indexed: 11/20/2022] Open

Campbell ID. The modular architecture of leukocyte cell-surface receptors. Immunol Rev 1998;163:11-8. [PMID: 9700498 DOI: 10.1111/j.1600-065x.1998.tb01184.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]

Intron-exon structures. ACTA ACUST UNITED AC 1998. [DOI: 10.1016/s1067-5701(98)80020-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]

The Atypical Serine Proteases of the Complement System. Adv Immunol 1998. [DOI: 10.1016/s0065-2776(08)60609-4] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.5] [Received: 10/07/1997] [Indexed: 01/02/2023]

Wakasugi K, Ishimori K, Morishima I. 'Module'-substituted globins: artificial exon shuffling among myoglobin, hemoglobin alpha- and beta-subunits. Biophys Chem 1997;68:265-73. [PMID: 9468623 DOI: 10.1016/s0301-4622(97)80556-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]

Hegyi H, Bork P. On the classification and evolution of protein modules. J Protein Chem 1997;16:545-51. [PMID: 9246642 DOI: 10.1023/a:1026382032119] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 02/04/2023]

Rzhetsky A, Ayala FJ, Hsu LC, Chang C, Yoshida A. Exon/intron structure of aldehyde dehydrogenase genes supports the "introns-late" theory. Proc Natl Acad Sci U S A 1997;94:6820-5. [PMID: 9192649 PMCID: PMC21242 DOI: 10.1073/pnas.94.13.6820] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.0] [Indexed: 02/04/2023] Open

Tremousaygue D, Bardet C, Dabos P, Regad F, Pelese F, Nazer R, Gander E, Lescure B. Genome DNA sequencing around the EF-1 alpha multigene locus of Arabidopsis thaliana indicates a high gene density and a shuffling of noncoding regions. Genome Res 1997;7:198-209. [PMID: 9074924 DOI: 10.1101/gr.7.3.198] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]

Patthy L. Exon shuffling and other ways of module exchange. Matrix Biol 1996;15:301-10; discussion 311-2. [PMID: 8981326 DOI: 10.1016/s0945-053x(96)90131-6] [Citation(s) in RCA: 79] [Impact Index Per Article: 2.8] [Indexed: 02/03/2023]

Long M, de Souza SJ, Rosenberg C, Gilbert W. Exon shuffling and the origin of the mitochondrial targeting function in plant cytochrome c1 precursor. Proc Natl Acad Sci U S A 1996;93:7727-31. [PMID: 8755543 PMCID: PMC38815 DOI: 10.1073/pnas.93.15.7727] [Citation(s) in RCA: 63] [Impact Index Per Article: 2.3] [Indexed: 02/02/2023] Open

Bork P, Downing AK, Kieffer B, Campbell ID. Structure and distribution of modules in extracellular proteins. Q Rev Biophys 1996;29:119-67. [PMID: 8870072 DOI: 10.1017/s0033583500005783] [Citation(s) in RCA: 234] [Impact Index Per Article: 8.4] [Indexed: 02/02/2023]

Long M, Rosenberg C, Gilbert W. Intron phase correlations and the evolution of the intron/exon structure of genes. Proc Natl Acad Sci U S A 1995;92:12495-9. [PMID: 8618928 PMCID: PMC40384 DOI: 10.1073/pnas.92.26.12495] [Citation(s) in RCA: 186] [Impact Index Per Article: 6.4] [Indexed: 01/31/2023] Open

Kwiatowski J, Krawczyk M, Kornacki M, Bailey K, Ayala FJ. Evidence against the exon theory of genes derived from the triose-phosphate isomerase gene. Proc Natl Acad Sci U S A 1995;92:8503-6. [PMID: 7667319 PMCID: PMC41185 DOI: 10.1073/pnas.92.18.8503] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.1] [Indexed: 01/26/2023] Open

Teller JK, Baker PJ, Britton KL, Engel PC, Rice DW, Stillman TJ. Correlation of intron-exon organisation with the three-dimensional structure in glutamate dehydrogenase. Biochim Biophys Acta 1995;1247:231-8. [PMID: 7696313 DOI: 10.1016/0167-4838(94)00240-h] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]

Strelets VB, Lim HA. Ancient splice junction shadows with relation to blocks in protein structure. Biosystems 1995;36:37-41. [PMID: 8527694 DOI: 10.1016/0303-2647(95)01525-p] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]

Eisenhaber F, Persson B, Argos P. Protein structure prediction: recognition of primary, secondary, and tertiary structural features from amino acid sequence. Crit Rev Biochem Mol Biol 1995;30:1-94. [PMID: 7587278 DOI: 10.3109/10409239509085139] [Citation(s) in RCA: 97] [Impact Index Per Article: 3.3] [Indexed: 01/26/2023]

Strelets VB, Shindyalov IN, Lim HA. Analysis of peptides from known proteins: clusterization in sequence space. J Mol Evol 1994;39:625-30. [PMID: 7807551 DOI: 10.1007/bf00160408] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]

Woessner JP, Molendijk AJ, van Egmond P, Klis FM, Goodenough UW, Haring MA. Domain conservation in several volvocalean cell wall proteins. Plant Mol Biol 1994;26:947-60. [PMID: 8000007 DOI: 10.1007/bf00028861] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 05/22/2023]

Stoltzfus A, Spencer DF, Zuker M, Logsdon JM, Doolittle WF. Testing the exon theory of genes: the evidence from protein structure. Science 1994;265:202-7. [PMID: 8023140 DOI: 10.1126/science.8023140] [Citation(s) in RCA: 158] [Impact Index Per Article: 5.3] [Indexed: 01/28/2023]

Introns and exons. Curr Opin Struct Biol 1994. [DOI: 10.1016/s0959-440x(94)90108-2] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.4] [Indexed: 11/17/2022]

White SH. The evolution of proteins from random amino acid sequences: II. Evidence from the statistical distributions of the lengths of modern protein sequences. J Mol Evol 1994;38:383-94. [PMID: 8007006 DOI: 10.1007/bf00163155] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023]

Abstract

This paper continues an examination of the hypothesis that modern proteins evolved from random heteropeptide sequences. In support of the hypothesis, White and Jacobs (1993, J Mol Evol 36:79-95) have shown that any sequence chosen randomly from a large collection of nonhomologous proteins has a 90% or better chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. The goal of the present study was to investigate the possibility that the random-origin hypothesis could explain the lengths of modern protein sequences without invoking specific mechanisms such as gene duplication or exon splicing. The sets of sequences examined were taken from the 1989 PIR database and consisted of 1,792 "super-family" proteins selected to have little sequence identity, 623 E. coli sequences, and 398 human sequences. The length distributions of the proteins could be described with high significance by either of two closely related probability density functions: The gamma distribution with parameter 2 or the distribution for the sum of two exponential random independent variables. A simple theory for the distributions was developed which assumes that (1) protoprotein sequences had exponentially distributed random independent lengths, (2) the length dependence of protein stability determined which of these protoproteins could fold into compact primitive proteins and thereby attain the potential for biochemical activity, (3) the useful protein sequences were preserved by the primitive genome, and (4) the resulting distribution of sequence lengths is reflected by modern proteins. The theory successfully predicts the two observed distributions which can be distinguished by the functional form of the dependence of protein stability on length. The theory leads to three interesting conclusions. First, it predicts that a tetra-nucleotide was the signal for primitive translation termination. 
This prediction is entirely consistent with the observations of Brown et al. (1990a,b, Nucleic Acids Res 18:2079-2086 and 18:6339-6345) which show that tetra-nucleotides (stop codon plus following nucleotide) are the actual signals for termination of translation in both prokaryotes and eukaryotes. Second, the strong dependence of statistical length distributions on sequence-termination signaling codes implies that the evolution of stop codons and translation-termination processes was as important as gene splicing in early evolution. Third, because the theory is based upon a simple no-exon stochastic model, it provides a plausible alternative to a limited universe of exons from which all proteins evolved by gene duplication and exon splicing (Dorit et al. 1990, Science 250:1377-1382).
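
Note on the two length distributions named in this abstract: the relation between them can be written out in a few lines. The abstract does not give the parametrization used in the paper, so the rate symbols \lambda, \lambda_1 and \lambda_2 below are assumptions introduced only for illustration. For two independent exponential lengths with different rates, the density of their sum follows from a convolution, and it reduces to the gamma density with shape parameter 2 when the rates coincide.

```latex
% Assumed notation: X_1 ~ Exp(\lambda_1), X_2 ~ Exp(\lambda_2), independent, L = X_1 + X_2.
\begin{align}
f_L(x) &= \int_0^x \lambda_1 e^{-\lambda_1 t}\,\lambda_2 e^{-\lambda_2 (x-t)}\,dt
        = \frac{\lambda_1 \lambda_2}{\lambda_2 - \lambda_1}
          \left(e^{-\lambda_1 x} - e^{-\lambda_2 x}\right),
        \qquad \lambda_1 \neq \lambda_2,\; x \ge 0, \\
\lim_{\lambda_2 \to \lambda_1 = \lambda} f_L(x) &= \lambda^2 x\, e^{-\lambda x}
        \quad \text{(gamma density with shape parameter 2 and rate } \lambda\text{)}.
\end{align}
```

Under these assumptions the two candidate densities differ only in whether the two exponential rates are equal, which is one way to read the abstract's description of them as closely related.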

Dibb NJ. Why do genes have introns? FEBS Lett 1993;325:135-9. [PMID: 8513885 DOI: 10.1016/0014-5793(93)81429-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 0.9] [Indexed: 01/31/2023]

White SH, Jacobs RE. The evolution of proteins from random amino acid sequences. I. Evidence from the lengthwise distribution of amino acids in modern protein sequences. J Mol Evol 1993;36:79-95. [PMID: 8433379 DOI: 10.1007/bf02407307] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.0] [Indexed: 01/30/2023]

Abstract

We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (ro) of Mood (1940, Ann. Math. Stat. 11, 367-392). The probability density of ro for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong alpha-helix propensity show a strong tendency to cluster whereas those with beta-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic "patterns" that can be detected by sliding-window, autocorrelation, and Fourier analyses.
Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
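
Note on the run test described in this abstract: the binary encoding and a standardized run statistic can be sketched as follows. This is a minimal illustration using the textbook Wald-Wolfowitz normal approximation, which may differ in form and sign convention from the statistic of Mood (1940) used by the authors. The function name `runs_z` and the example sequences are hypothetical and do not come from the paper.

```python
import math


def runs_z(sequence, residues_of_interest):
    """Standardized runs statistic for one protein sequence.

    Residues in `residues_of_interest` are coded 1 and all others 0, as the
    abstract describes. The z-score uses the Wald-Wolfowitz approximation
    (an assumption, not necessarily the authors' exact formula).
    Negative values mean fewer runs than expected, i.e. clustering.
    """
    bits = [1 if aa in residues_of_interest else 0 for aa in sequence]
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("sequence must contain both residue classes")

    # A run is a maximal stretch of identical symbols.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)

    # Mean and variance of the run count under random ordering.
    mean = 2.0 * n1 * n0 / n + 1.0
    var = 2.0 * n1 * n0 * (2.0 * n1 * n0 - n) / (n * n * (n - 1.0))
    if var <= 0:
        raise ValueError("sequence too short for a normal approximation")
    return (runs - mean) / math.sqrt(var)


if __name__ == "__main__":
    # Toy sequences: fully clustered versus strictly alternating alanines.
    print(runs_z("AAAAALLLLL", {"A"}))
    print(runs_z("ALALALALAL", {"A"}))
```

With this convention, a strongly negative score indicates clustering of the chosen residue class, a strongly positive score indicates an overly even spread, and scores near zero are consistent with a random arrangement.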

Nolan KF, Kaluz S, Higgins JM, Goundis D, Reid KB. Characterization of the human properdin gene. Biochem J 1992;287 (Pt 1):291-7. [PMID: 1417780 PMCID: PMC1133157 DOI: 10.1042/bj2870291] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.2] [Indexed: 12/26/2022]

Gelfand MS. Statistical analysis and prediction of the exonic structure of human genes. J Mol Evol 1992;35:239-52. [PMID: 1518091 DOI: 10.1007/bf00178600] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Indexed: 12/27/2022]

Cavener DR. GMC oxidoreductases. A newly defined family of homologous proteins with diverse catalytic activities. J Mol Biol 1992;223:811-4. [PMID: 1542121 DOI: 10.1016/0022-2836(92)90992-s] [Citation(s) in RCA: 212] [Impact Index Per Article: 6.6] [Indexed: 12/27/2022]

Dorit RL, Gilbert W. The limited universe of exons. Curr Opin Genet Dev 1991;1:464-9. [PMID: 1822278 DOI: 10.1016/s0959-437x(05)80193-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]

The limited universe of exons. Curr Opin Struct Biol 1991. [DOI: 10.1016/0959-440x(91)90093-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]

Palmer JD, Logsdon JM. The recent origins of introns. Curr Opin Genet Dev 1991;1:470-7. [PMID: 1822279 DOI: 10.1016/s0959-437x(05)80194-7] [Citation(s) in RCA: 202] [Impact Index Per Article: 6.1] [Indexed: 12/28/2022]

Bräuer C, Scheit KH. Characterization of the gene for the bovine seminal vesicle secretory protein SVSP109. Biochim Biophys Acta 1991;1090:259-60. [PMID: 1932121 DOI: 10.1016/0167-4781(91)90113-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 12/29/2022]

Dorit RL, Schoenbach L, Gilbert W. Response. Science 1991;253:679-80. [PMID: 17772372 DOI: 10.1126/science.253.5020.679] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/02/2022]