1
Hudson RR. Testing the constant-rate neutral allele model with protein sequence data. Evolution 1983; 37:203-217. [PMID: 28568026 DOI: 10.1111/j.1558-5646.1983.tb05528.x] [Citation(s) in RCA: 152] [Impact Index Per Article: 21.7] [Received: 11/18/1981] [Revised: 03/31/1982] [Indexed: 11/29/2022]
Affiliation(s)
- Richard R. Hudson
- Department of Biology; University of Pennsylvania; Philadelphia Pennsylvania 19104
2
Zhu W, Cooper DN, Zhao Q, Wang Y, Liu R, Li Q, Férec C, Wang Y, Chen JM. Concurrent nucleotide substitution mutations in the human genome are characterized by a significantly decreased transition/transversion ratio. Hum Mutat 2015; 36:333-41. [PMID: 25546635 DOI: 10.1002/humu.22749] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 08/05/2014] [Accepted: 12/17/2014] [Indexed: 01/16/2023]
Abstract
There is accumulating evidence that the number of multiple-nucleotide substitutions (MNS) occurring in closely spaced sites in eukaryotic genomes is significantly higher than would be predicted from the random accumulation of independently generated single-nucleotide substitutions (SNS). Although this excess can in principle be accounted for by the concept of transient hypermutability, a general mutational signature of concurrent MNS mutations has not so far been evident. Employing a dataset (N = 449) of "concurrent" double MNS mutations causing human inherited disease, we have identified just such a mutational signature: concurrently generated double MNS mutations exhibit a more than twofold lower transition/transversion ratio (termed RTs/Tv) than independently generated de novo SNS mutations (<0.80 vs. 2.10; P = 2.69 × 10⁻¹⁴). We replicated this novel finding through a similar analysis employing two double MNS variant datasets with differing abundances of concurrent events (150,521 variants with both substitutions on the same haplotypic lineage vs. 94,875 variants whose component substitutions were on different haplotypic lineages) plus 5,430,874 SNS variants, all being derived from the whole-genome sequencing of seven Chinese individuals. Evaluation of the newly observed mutational signature in diverse contexts provides solid support for the postulated role of translesion synthesis DNA polymerases in transient hypermutability.
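As an illustration of the transition/transversion ratio discussed in this abstract, the following is a minimal sketch of how such a ratio could be computed from a list of single-base substitutions. The function names `is_transition` and `ts_tv_ratio` and the input format (pairs of reference and alternate bases) are assumptions made for this example; they are not taken from the paper.

```python
# Illustrative sketch only; names and input format are hypothetical.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """Return True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(substitutions):
    """Ratio of transition count to transversion count for (ref, alt) pairs."""
    substitutions = list(substitutions)
    transitions = sum(1 for ref, alt in substitutions if is_transition(ref, alt))
    transversions = len(substitutions) - transitions
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


# Two transitions (A>G, C>T) and two transversions (A>C, G>T).
example = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(example)
```

A ratio computed this way is a raw count ratio; comparing it between datasets, as the abstract does, would additionally require a statistical test that is not sketched here.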
Affiliation(s)
- Wenjuan Zhu
- Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China
3
Abstract
In 1994, Muse and Gaut (MG) and Goldman and Yang (GY) proposed evolutionary models that recognize the coding structure of the nucleotide sequences under study, by defining a Markovian substitution process with a state space consisting of the 61 sense codons (assuming the universal genetic code). Several variations and extensions to their models have since been proposed, but no general and flexible framework for contrasting the relative performance of alternative approaches has yet been applied. Here, we compute Bayes factors to evaluate the relative merit of several MG and GY styles of codon substitution models, including recent extensions acknowledging heterogeneous nonsynonymous rates across sites, as well as selective effects inducing uneven amino acid or codon preferences. Our results on three real data sets support a logical model construction following the MG formulation, allowing for a flexible account of global amino acid or codon preferences, while maintaining distinct parameters governing overall nucleotide propensities. Through posterior predictive checks, we highlight the importance of such a parameterization. Altogether, the framework presented here suggests a broad modeling project in the MG style, stressing the importance of combining and contrasting available model formulations and grounding developments in a sound probabilistic paradigm.
4
Kraus F, Brown WM. Phylogenetic relationships of colubroid snakes based on mitochondrial DNA sequences. Zool J Linn Soc 1998. [DOI: 10.1111/j.1096-3642.1998.tb02159.x] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Indexed: 11/27/2022]
5
Abstract
Across all kingdoms of biological life, protein-coding genes exhibit unequal usage of synonymous codons. Although alternative theories abound, translational selection has been accepted as an important mechanism that shapes the patterns of codon usage in prokaryotes and simple eukaryotes. Here we analyze patterns of codon usage across 74 diverse bacteriophages that infect E. coli, P. aeruginosa, and L. lactis as their primary host. We use the concept of a “genome landscape,” which helps reveal non-trivial, long-range patterns in codon usage across a genome. We develop a series of randomization tests that allow us to interrogate the significance of one aspect of codon usage, such as GC content, while controlling for another aspect, such as adaptation to host-preferred codons. We find that 33 phage genomes exhibit highly non-random patterns in their GC3-content, use of host-preferred codons, or both. We show that the head and tail proteins of these phages exhibit significant bias towards host-preferred codons, relative to the non-structural phage proteins. Our results support the hypothesis of translational selection on viral genes for host-preferred codons, over a broad range of bacteriophages. Any protein can be encoded by multiple, synonymous spellings. But organisms typically prefer one spelling over another—a phenomenon known as codon bias. Codon bias is generally understood to result from selection for synonymous spellings that increase the rate and accuracy of protein translation. In this work, we have examined the complete genomes of all sequenced viruses that infect the bacteria E. coli, P. aeruginosa, and L. lactis, and have found that many of these viral genomes also exhibit codon bias. 
Moreover, the degree of codon bias varies across the viral genome, as visualized using a technique called a “genome landscape.” By comparing the observed genomes to randomly drawn genomes, we demonstrate that the regions of high codon bias in these viral genomes often coincide with regions encoding structural proteins. Thus, the proteins that a virus needs to produce in high copy number utilize the same encoding as its host organism does for highly expressed proteins. Our results extend the translational theory of codon bias to the viral kingdom: parts of the viral genome are selected to obey the preferences of its host.
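The GC3 content mentioned in this abstract is the fraction of G or C bases at third codon positions. A minimal sketch of that calculation follows; the function name `gc3_content` is hypothetical, and the sketch assumes an in-frame coding sequence containing only A, C, G and T. It does not reproduce the genome-landscape or randomization procedures of the study.

```python
# Illustrative sketch only; the function name is hypothetical.
def gc3_content(cds):
    """Fraction of G/C at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    third_positions = cds[2::3]  # every third base, starting at codon position 3
    gc_count = sum(1 for base in third_positions if base in "GC")
    return gc_count / len(third_positions)


# Codons ATG, GCC, AAA have third bases G, C, A.
example_gc3 = gc3_content("ATGGCCAAA")
```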
6
Qin H, Wu WB, Comeron JM, Kreitman M, Li WH. Intragenic spatial patterns of codon usage bias in prokaryotic and eukaryotic genomes. Genetics 2005; 168:2245-60. [PMID: 15611189 PMCID: PMC1448744 DOI: 10.1534/genetics.104.030866] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022] Open
Abstract
To study the roles of translational accuracy, translational efficiency, and the Hill-Robertson effect in codon usage bias, we studied the intragenic spatial distribution of synonymous codon usage bias in four prokaryotic (Escherichia coli, Bacillus subtilis, Sulfolobus tokodaii, and Thermotoga maritima) and two eukaryotic (Saccharomyces cerevisiae and Drosophila melanogaster) genomes. We generated supersequences at each codon position across genes in a genome and computed the overall bias at each codon position. By quantitatively evaluating the trend of spatial patterns using isotonic regression, we show that in yeast and prokaryotic genomes, codon usage bias increases along translational direction, which is consistent with purifying selection against nonsense errors. Fruit fly genes show a nearly symmetric M-shaped spatial pattern of codon usage bias, with less bias in the middle and both ends. The low codon usage bias in the middle region is best explained by interference (the Hill-Robertson effect) between selections at different codon positions. In both yeast and fruit fly, spatial patterns of codon usage bias are characteristically different from patterns of GC-content variations. Effect of expression level on the strength of codon usage bias is more conspicuous than its effect on the shape of the spatial distribution.
Affiliation(s)
- Hong Qin
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA
7
Plotkin JB, Dushoff J. Codon bias and frequency-dependent selection on the hemagglutinin epitopes of influenza A virus. Proc Natl Acad Sci U S A 2003; 100:7152-7. [PMID: 12748378 PMCID: PMC165845 DOI: 10.1073/pnas.1132114100] [Citation(s) in RCA: 140] [Impact Index Per Article: 6.7] [Indexed: 12/29/2022] Open
Abstract
Although the surface proteins of human influenza A virus evolve rapidly and continually produce antigenic variants, the internal viral genes acquire mutations very gradually. In this paper, we analyze the sequence evolution of three influenza A genes over the past two decades. We study codon usage as a discriminating signature of gene- and even residue-specific diversifying and purifying selection. Nonrandom codon choice can increase or decrease the effective local substitution rate. We demonstrate that the codons of hemagglutinin, particularly those in the antibody-combining regions, are significantly biased toward substitutional point mutations relative to the codons of other influenza virus genes. We discuss the evolutionary interpretation and implications of these biases for hemagglutinin's antigenic evolution. We also introduce information-theoretic methods that use sequence data to detect regions of recent positive selection and potential protein conformational changes.
Affiliation(s)
- Joshua B Plotkin
- Institute for Advanced Study, Olden Lane, Princeton, NJ 08540, USA
8
Abstract
Changes in technology in the past decade have had such an impact on the way that molecular evolution research is done that it is difficult now to imagine working in a world without genomics or the Internet. In 1992, GenBank was less than a hundredth of its current size and was updated every three months on a huge spool of tape. Homology searches took 30 minutes and rarely found a hit. Now it is difficult to find sequences with only a few homologs to use as examples for teaching bioinformatics. For molecular evolution researchers, the genomics revolution has showered us with raw data and the information revolution has given us the wherewithal to analyze it. In broad terms, the most significant outcome from these changes has been our newfound ability to examine the evolution of genomes as a whole, enabling us to infer genome-wide evolutionary patterns and to identify subsets of genes whose evolution has been in some way atypical.
Affiliation(s)
- Kenneth H Wolfe
- Department of Genetics, Smurfit Institute, University of Dublin, Trinity College, Dublin 2, Ireland.
9
Alvarez-Valin F, Tort JF, Bernardi G. Nonrandom spatial distribution of synonymous substitutions in the GP63 gene from Leishmania. Genetics 2000; 155:1683-92. [PMID: 10924466 PMCID: PMC1461213 DOI: 10.1093/genetics/155.4.1683] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] Open
Abstract
In this work we analyze the variability in substitution rates in the GP63 gene from Leishmania. By using a sliding window to estimate substitution rates along the gene, we found that the rate of synonymous substitutions along the GP63 gene is highly correlated with both the rate of amino acid substitution and codon bias. Furthermore, we show that comparisons involving genes that represent independent phylogenetic lines yield very similar divergence/conservation patterns, thus suggesting that deterministic forces (i.e., nonstochastic forces such as selection) generated these patterns. We present evidence indicating that the variability in substitution rates is unambiguously related to functionally relevant features. In particular, there is a clear relationship between rates and the tertiary structure of the encoded protein since all divergent segments are located on the surface of the molecule and facing one side (almost parallel to the cell membrane) on the exposed surface of the organism. Remarkably, the protein segments encoded by these variable regions encircle the active site in a funnel-like distribution. These results strongly suggest that the pattern of nucleotide divergence and, notably, of synonymous divergence is affected by functional constraints.
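The sliding-window idea described in this abstract can be sketched as follows. This simplified example reports only the raw proportion of differing sites per window between two aligned sequences; it does not separate synonymous from nonsynonymous substitutions or correct for multiple hits, as the study itself would need to. The function name `sliding_window_divergence` and its parameters are assumptions made for illustration.

```python
# Illustrative sketch only; name and parameters are hypothetical.
def sliding_window_divergence(seq_a, seq_b, window, step=1):
    """Return (start, proportion_of_differences) for each window along an alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if window > len(seq_a):
        raise ValueError("window is longer than the alignment")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        chunk_a = seq_a[start:start + window]
        chunk_b = seq_b[start:start + window]
        differences = sum(1 for x, y in zip(chunk_a, chunk_b) if x != y)
        results.append((start, differences / window))
    return results


# First window is identical; second window differs at every site.
profile = sliding_window_divergence("AAAAAAAA", "AAAATTTT", window=4, step=4)
```

Using overlapping windows (a step smaller than the window) gives a smoother profile at the cost of correlated neighbouring estimates.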
Affiliation(s)
- F Alvarez-Valin
- Sección Biomatemática, Facultad de Ciencias, Montevideo 11400, Uruguay.
10
Leluk J. A new algorithm for analysis of the homology in protein primary structure. Comput Chem 1998; 22:123-31. [PMID: 9570113 DOI: 10.1016/s0097-8485(97)00035-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023]
Abstract
A new algorithm for analysis of the homology and genetic semihomology in protein sequences is described. It assumes a close relation between the compared amino acids and their codons in related proteins. The algorithm is based on the network of genetic relationships between amino acids and thus differs from the commonly used statistical matrices. The results obtained by using this method are more comprehensive than those of the methods used at present, and reflect the actual mechanism of protein differentiation and evolution. They concern: (1) location of homologous and semihomologous sites in compared proteins; (2) precise estimation of insertion/deletion gaps in non-homologous fragments; (3) analysis of internal homology and semihomology; (4) precise location of domains in multidomain proteins; (5) estimation of genetic code of non-homologous fragments; (6) construction of genetic probes; (7) studies on differentiation processes among related proteins; (8) estimation of the degree of relationship among related proteins; (9) studies on the evolution mechanism within homologous protein families and (10) confirmation of actual relationship of sequences showing low degree of homology.
Affiliation(s)
- J Leluk
- Institute of Biochemistry and Molecular Biology, University of Wrocław, Poland.
11
Human Evolution. Hum Genet 1997. [DOI: 10.1007/978-3-662-03356-2_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
12
Abstract
The relative contribution of mutation and purifying selection to transition bias has not been quantitatively assessed in mitochondrial protein genes. The observed transition/transversion (s/v) ratio is (μs Ps)/(μv Pv), where μs and μv denote the mutation rates of transitions and transversions, respectively, and Ps and Pv denote the fixation probabilities of transitions and transversions, respectively. Because selection against synonymous transitions can be assumed to be roughly equal to that against synonymous transversions, Ps/Pv ≈ 1 at fourfold degenerate sites, so that the s/v ratio at fourfold degenerate sites is approximately μs/μv, which is a measure of mutational contribution to transition bias. Similarly, the s/v ratio at nondegenerate sites is also an estimate of μs/μv if we assume that selection against nonsynonymous transitions is roughly equal to that against nonsynonymous transversions. In two mitochondrial genes, cytochrome oxidase subunit I (COI) and cytochrome b (cyt-b) in pocket gophers, the s/v ratio is about two at nondegenerate and fourfold degenerate sites for both the COI and the cyt-b genes. This implies that mutation contribution to transition bias is relatively small. In contrast, the s/v ratio is much greater at twofold degenerate sites, being 48 for COI and 40 for cyt-b. Given that the μs/μv ratio is about 2, the Ps/Pv ratio at twofold degenerate sites must be on the order of 20 or greater. This suggests a great effect of purifying selection on transition bias in mitochondrial protein genes because transitions are synonymous and transversions are nonsynonymous at twofold degenerate sites in mammalian mitochondrial genes. We also found that nonsynonymous mutations at twofold degenerate sites are more neutral than nonsynonymous mutations at nondegenerate sites, and that the COI gene is subject to stronger purifying selection than is the cyt-b gene. A model is presented to integrate the effect of purifying selection, codon bias, DNA repair and GC content on s/v ratio of protein-coding genes.
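The arithmetic behind the "on the order of 20 or greater" statement can be reproduced from the relation given in the abstract: the observed s/v ratio equals the mutation-rate ratio multiplied by the fixation-probability ratio, so the latter is the observed ratio divided by the former. The sketch below uses only the figures quoted in the abstract (48 and 40 at twofold degenerate sites, and about 2 for the mutation-rate ratio); the function name `fixation_ratio` is hypothetical.

```python
# Illustrative sketch only; the function name is hypothetical.
def fixation_ratio(observed_sv, mutation_sv):
    """Solve observed_sv = mutation_sv * (Ps/Pv) for the fixation ratio Ps/Pv."""
    if mutation_sv <= 0:
        raise ValueError("mutation-rate ratio must be positive")
    return observed_sv / mutation_sv


# Values quoted in the abstract for twofold degenerate sites.
coi_ratio = fixation_ratio(48, 2)    # COI
cytb_ratio = fixation_ratio(40, 2)   # cyt-b
```

With these inputs the implied fixation ratios are 24 for COI and 20 for cyt-b, which matches the abstract's stated order of magnitude.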
Affiliation(s)
- X Xia
- Museum of Natural Science, Louisiana State University, Baton Rouge, LA 70803, USA
13
Mouchiroud D, Gautier C, Bernardi G. Frequencies of synonymous substitutions in mammals are gene-specific and correlated with frequencies of nonsynonymous substitutions. J Mol Evol 1995; 40:107-13. [PMID: 7714909 DOI: 10.1007/bf00166602] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]
Abstract
The frequencies of synonymous substitutions of mammalian genes cover a much wider range than previously thought. We report here that the different frequencies found in homologous genes from a given mammalian pair are correlated with those in the same homologous genes from a different mammalian pair. This indicates that the frequencies of synonymous substitutions are gene-specific (as are the frequencies of nonsynonymous substitutions), or, in other words, that "fast" and "slow" genes in one mammal are fast and slow, respectively, in any other one. Moreover, the frequencies of synonymous substitutions are correlated with the frequencies of nonsynonymous substitution in the same genes.
Affiliation(s)
- D Mouchiroud
- Laboratoire de Biométrie, Génétique et Biologie des Populations, U.R.A. 243, Université Claude Bernard, Villeurbanne, France
14
Abstract
Chimpanzee, tamarin, and marmoset interleukin-3 (IL-3) genes were cloned, sequenced, and expressed. Western blot analysis demonstrated that functional genes were isolated. IL-3 sequences were compared with those of mouse, rat, rhesus monkey, gibbon, and man. Multiple alignment of the IL-3 coding regions showed that only a few regions had been conserved during mammalian evolution, which are likely associated with functional domains of the IL-3 protein. Substitution rates for the various lineages were calculated and the numbers of synonymous and nonsynonymous substitutions were estimated separately. Distance matrices of the IL-3 coding regions were used to construct phylogenetic trees which revealed large differences in IL-3 evolution rate as well as a more rapid substitution rate for rodents and a rate slowdown during hominoid evolution. Extremes were rhesus monkey IL-3, which accumulated few synonymous substitutions, and gibbon IL-3, which had almost exclusively synonymous substitutions. In rhesus monkey IL-3, nonsynonymous substitutions outnumbered synonymous substitutions, which could not be readily explained by a random process of substitutions. We assume that during evolution of IL-3, the majority of the amino acid replacements and the impaired interspecies functional cross-reactivity originate from selection mechanisms with the most likely selective force being the structure of the heterodimeric IL-3 cell-surface receptor. Insight into IL-3 architecture and structural analysis of the IL-3 receptor are needed to analyze the unusually fast evolution of IL-3 in more detail.
Affiliation(s)
- H Burger
- Department of Medical Oncology, Dr. Daniel den Hoed Cancer Center/Dijkzigt, University Hospital Rotterdam, The Netherlands
15
Abstract
A formal mathematical analysis of the substitution process in nucleotide sequence evolution was done in terms of the Markov process. By using matrix algebra theory, the theoretical foundation of Barry and Hartigan's (Stat. Sci. 2:191-210, 1987) and Lanave et al.'s (J. Mol. Evol. 20:86-93, 1984) methods was provided. Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. It was shown that the multiparameter methods of Lanave et al.'s (J. Mol. Evol. 20:86-93, 1984), Gojobori et al.'s (J. Mol. Evol. 18:414-422, 1982), and Barry and Hartigan's (Stat. Sci. 2:191-210, 1987) are preferable to others for the purpose of phylogenetic analysis when the sequences are long. However, when sequences are short and the evolutionary distance is large, Tajima and Nei's (Mol. Biol. Evol. 1:269-285, 1984) method is superior to others.
Affiliation(s)
- A Zharkikh
- Center for Demographic and Population Genetics, University of Texas, Houston 77225
16
Wolfe KH, Sharp PM. Mammalian gene evolution: nucleotide sequence divergence between mouse and rat. J Mol Evol 1993; 37:441-56. [PMID: 8308912 DOI: 10.1007/bf00178874] [Citation(s) in RCA: 147] [Impact Index Per Article: 4.7] [Indexed: 01/29/2023]
Abstract
As a paradigm of mammalian gene evolution, the nature and extent of DNA sequence divergence between homologous protein-coding genes from mouse and rat have been investigated. The data set examined includes 363 genes totalling 411 kilobases, making this by far the largest comparison conducted between a single pair of species. Mouse and rat genes are on average 93.4% identical in nucleotide sequence and 93.9% identical in amino acid sequence. Individual genes vary substantially in the extent of nonsynonymous nucleotide substitution, as expected from protein evolution studies; here the variation is characterized. The extent of synonymous (or silent) substitution also varies considerably among genes, though the coefficient of variation is about four times smaller than for nonsynonymous substitutions. A small number of genes mapped to the X-chromosome have a slower rate of molecular evolution than average, as predicted if molecular evolution is "male-driven." Base composition at silent sites varies from 33% to 95% G+C in different genes; mouse and rat homologues differ on average by only 1.7% in silent-site G+C, but it is shown that this is not necessarily due to any selective constraint on their base composition. Synonymous substitution rates and silent site base composition appear to be related (genes at intermediate G+C have on average higher rates), but the relationship is not as strong as in our earlier analyses. Rates of synonymous and nonsynonymous substitution are correlated, apparently because of an excess of substitutions involving adjacent pairs of nucleotides. Several factors suggest that synonymous codon usage in rodent genes is not subject to selection.
Affiliation(s)
- K H Wolfe
- Department of Genetics, University of Dublin, Trinity College, Ireland
17
Abstract
A new statistical test has been developed to detect selection on silent sites. This test compares the codon usage within a gene and thus does not require knowledge of which genes are under the greatest selection, that there exist common trends in codon usage across genes, or that genes have the same mutation pattern. It also controls for mutational biases that might be introduced by the adjacent bases. The test was applied to 62 mammalian sequences, and significant codon usage biases were detected in all three species examined (humans, rats, and mice). However, these biases appear not to be the consequence of selection, but of the first base pair in the codon influencing the mutation pattern at the third position.
Affiliation(s)
- A C Eyre-Walker
- Institute of Cell, Animal and Population Biology, University of Edinburgh, United Kingdom
18
Lee KY, Hopkins JD, Syvanen M. Evolved neomycin phosphotransferase from an isolate of Klebsiella pneumoniae. Mol Microbiol 1991; 5:2039-46. [PMID: 1662755 DOI: 10.1111/j.1365-2958.1991.tb00826.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 12/28/2022]
Abstract
A new aminoglycoside resistance gene (aphA1-IAB) confers high-level resistance to neomycin. The sequence of aphA1-IAB is closely related to aphA1 found in the transposons Tn4352, Tn903 and Tn602. For example, aphA1-IAB differs from aphA1-903 at five nucleotides that result in four amino acid replacements. The enzyme encoded by aphA1-IAB has a significantly higher turnover number with neomycin, kanamycin and G418 as substrates than does the aphA1-903 enzyme. A parsimonious phylogenetic tree suggests that aphA1-IAB evolved from an ancestral form that is closely related or identical to the aphA1 found in Tn903. The excess of replacement substitutions over silent substitutions in aphA1-IAB, as well as its convergence toward aphA3 from Staphylococcus aureus, is indicative of selective evolution. Our hypothesis to explain these results is that aphA1-IAB evolved under the selective pressure of neomycin use in relatively recent times.
Affiliation(s)
- K Y Lee
- Department of Medical Microbiology and Immunology, School of Medicine, University of California, Davis 95616
19
De Giorgi C, De Luca F, Saccone C. Mitochondrial DNA in the sea urchin Arbacia lixula: nucleotide sequence differences between two polymorphic molecules indicate asymmetry of mutations. Gene 1991; 103:249-52. [PMID: 1653758 DOI: 10.1016/0378-1119(91)90281-f] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 12/28/2022]
Abstract
Two polymorphic forms of mitochondrial DNA (mtDNA) extracted from Arbacia lixula eggs were cloned and the nucleotide sequences of specific regions determined. A comparison of the sequences of the sense strand of the two molecules demonstrates that all the differences are transitions and only of the A→G type. A change such as G→A (or A→G) on the sense mtDNA strand results from either a direct G→A (or A→G) mutation on that strand or a C→T (or T→C) on the complementary strand. None of the C→T (or T→C) changes were detected on the sense strand, which implies that the A→G mutation bias on the sense strand is not reversed for the other strand. Our observation indicates the existence of mechanisms acting asymmetrically on the two mtDNA strands, possibly during mtDNA replication.
Affiliation(s)
- C De Giorgi
- Dipartimento di Biochimica e Biologia Molecolare, University of Bari, Italy
20
21
Cooper DN, Krawczak M. The mutational spectrum of single base-pair substitutions causing human genetic disease: patterns and predictions. Hum Genet 1990; 85:55-74. [PMID: 2192981 DOI: 10.1007/bf00276326] [Citation(s) in RCA: 275] [Impact Index Per Article: 8.1] [Indexed: 12/30/2022]
Abstract
Reports of single base-pair substitutions that cause human genetic disease and that have been located and characterized in an unbiased fashion were collated; 32% of point mutations were CG→TG or CG→CA transitions consistent with a chemical model of mutation via methylation-mediated deamination. This represents a 12-fold higher frequency than that predicted from random expectation, confirming that CG dinucleotides are indeed hotspots of mutation causing human genetic disease. However, since CG also appears hypermutable irrespective of methylation-mediated deamination, a second mechanism may also be involved in generating CG mutations. The spectrum of point mutations occurring outwith CG dinucleotides is also non-random, at both the mono- and dinucleotide levels. An intrinsic bias in clinical detection was excluded since frequencies of specific amino acid substitutions did not correlate with the 'chemical difference' between the amino acids exchanged. Instead, a strong correlation was observed with the mutational spectrum predicted from the experimentally measured mispairing frequencies of vertebrate DNA polymerases alpha and beta in vitro. This correlation appears to be independent of any difference in the efficiency of enzymatic proofreading/mismatch-repair mechanisms but is consistent with a physical model of mutation through nucleotide misincorporation as a result of transient misalignment of bases at the replication fork. This model is further supported by an observed correlation between dinucleotide mutability and stability, possibly because transient misalignment must be stabilized long enough for misincorporation to occur. Since point mutations in human genes causing genetic disease neither arise by random error nor are independent of their local sequence environment, predictive models may be considered. We present a computer model (MUTPRED) based upon empirical data; it is designed to predict the location of point mutations within gene coding regions causing human genetic disease. The mutational spectrum predicted for the human factor IX gene was shown to resemble closely the observed spectrum of point mutations causing haemophilia B. Further, the model was able to predict successfully the rank order of disease prevalence and/or mutation rates associated with various human autosomal dominant and sex-linked recessive conditions. Although still imperfect, this model nevertheless represents an initial attempt to relate the variable prevalence of human genetic disease to the mutability inherent in the nucleotide sequences of the underlying genes.
Affiliation(s)
- D N Cooper
- Molecular Genetics Section, Thrombosis Research Institute, Chelsea, London, UK
22
Palumbi SR. Rates of molecular evolution and the fraction of nucleotide positions free to vary. J Mol Evol 1989; 29:180-7. [PMID: 2509718 DOI: 10.1007/bf02100116] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]
Abstract
Selective constraints on DNA sequence change were incorporated into a model of DNA divergence by restricting substitutions to a subset of nucleotide positions. A simple model showed that both mutation rate and the fraction of nucleotide positions free to vary are strong determinants of DNA divergence over time. When divergence between two species approaches the fraction of positions free to vary, standard methods that correct for multiple mutations yield severe underestimates of the number of substitutions per site. A modified method appropriate for use with DNA sequence, restriction site, or thermal renaturation data is derived taking this fraction into account. The model also showed that the ratio of divergence in two gene classes (e.g., nuclear and mitochondrial) may vary widely over time even if the ratio of mutation rates remains constant. DNA sequence divergence data are used increasingly to detect differences in rates of molecular evolution. Often, variation in divergence rate is assumed to represent variation in mutation rate. The present model suggests that differing divergence rates among comparisons (either among gene classes or taxa) should be interpreted cautiously. Differences in the fraction of nucleotide positions free to vary can serve as an important alternative hypothesis to explain differences in DNA divergence rates.
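To illustrate why standard corrections underestimate divergence when only a fraction of positions can vary, the sketch below compares the ordinary Jukes-Cantor correction with a variant that confines substitutions to a fraction f of sites. The variant is an assumption made for this example, obtained by applying Jukes-Cantor to the free sites only (observed difference p/f among them) and rescaling by f; it is not necessarily the formula derived in the paper. The function names are hypothetical.

```python
import math


# Illustrative sketch only; function names are hypothetical and the
# restricted-sites formula is an assumed form, not taken from the paper.
def jukes_cantor(p):
    """Standard Jukes-Cantor substitutions per site from observed proportion p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


def jukes_cantor_free_fraction(p, f):
    """Substitutions per site (over all sites) when only a fraction f can vary.

    Assumes every difference falls in the free fraction, so the divergence
    among free sites is p / f; the result is averaged over all sites.
    """
    if not 0 < f <= 1:
        raise ValueError("f must satisfy 0 < f <= 1")
    if not 0 <= p < 0.75 * f:
        raise ValueError("p must be below the saturation level 0.75 * f")
    return f * jukes_cantor(p / f)


# 30% observed divergence, with half of the positions free to vary.
standard = jukes_cantor(0.3)
restricted = jukes_cantor_free_fraction(0.3, 0.5)
```

Under these assumptions the restricted estimate exceeds the standard one, and the gap widens as the observed divergence approaches the saturation level set by f, which is the qualitative behaviour the abstract describes.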
Affiliation(s)
- S R Palumbi
- Department of Zoology, University of Hawaii, Honolulu 96822
|
23
|
Ramharack R, Deeley RG. Structure and evolution of primate cytochrome c oxidase subunit II gene. J Biol Chem 1987. [DOI: 10.1016/s0021-9258(18)47897-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022] Open
|
24
|
Mottez E, Rogan PK, Manuelidis L. Conservation in the 5' region of the long interspersed mouse L1 repeat: implications of comparative sequence analysis. Nucleic Acids Res 1986; 14:3119-36. [PMID: 3008107 PMCID: PMC339725 DOI: 10.1093/nar/14.7.3119] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
Abstract
A clone of 7.1kb corresponding to the mouse L1 interspersed repeat family was selected for homology to a human interspersed repeat. This clone fairly represents mouse genomic members. Mapping of the clone revealed one common element at both the 5' and 3' ends in a head to tail arrangement, suggesting that at least some long L1 family members are tandemly arranged; genomic studies confirmed the unexpected tandem arrangement of a minor proportion of L1 members. A short SmaI tandem repeat appears to define the 5' end of most L1 family members. SmaI repeats may maintain, via a recursive regulatory function, the transcriptional viability of L1 members after retroposition events. A 2.5kb portion of the mouse L1 repeat that has not been previously sequenced is presented. It is 55-70% homologous to a corresponding portion of the human KpnI repeat family. Comparative sequence analysis revealed that one common open reading frame may conserve potential coding function across species. A second open reading frame bears an asymmetric distribution of codon replacements unlike both genes and pseudogenes. This latter feature could be consistent with a proposed chromosome organization function that is unrelated to peptide expression.
|
25
|
Vogel F, Motulsky AG. Human Evolution. Hum Genet 1986. [DOI: 10.1007/978-3-662-02489-8_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
26
|
Stuart CA. Phylogenetic distance from man correlates with immunologic cross-reactivity among liver insulin receptors. Comp Biochem Physiol B 1986; 84:167-72. [PMID: 2426033 DOI: 10.1016/0305-0491(86)90200-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
The hepatic insulin receptors from evolutionarily diverse species were evaluated to determine their ability to bind human insulin and their immunologic cross-reactivity with the human insulin receptor. We found that the liver membranes of each species possessed insulin receptors with remarkably similar affinities for human insulin. An immunoassay specific for human insulin receptor showed that the amount of shared antigenic determinants differed widely among the species tested and in general decreased as the phylogenetic distance from man increased. These data suggest that the ability to bind insulin has been highly conserved during evolution despite considerable variation in the primary structure of the insulin receptor.
|
27
|
Golding GB, Glickman BW. Sequence-directed mutagenesis: evidence from a phylogenetic history of human alpha-interferon genes. Proc Natl Acad Sci U S A 1985; 82:8577-81. [PMID: 3866242 PMCID: PMC390960 DOI: 10.1073/pnas.82.24.8577] [Citation(s) in RCA: 36] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
We have studied the potential contribution of template-dependent events to genetic variation in mammals by examining the sequence alterations that have occurred in the recent evolution of human interferon genes. Fifteen members of the human alpha-interferon gene family were aligned, and a phylogenetic history was inferred. Many multiple events are inferred to have occurred in the evolution of the interferon genes and for the majority of these local DNA sequences were present that were capable of serving as templates for their occurrence. We conclude that the DNA sequence has the potential to explain many of the inferred spontaneous events and to explain complex alterations to sequences--i.e., the joint occurrence of base substitutions and insertions/deletions. Thus, such a mechanism would often cause multiple sequence changes as a result of a single mutational event and would provide additional genetic variation for evolution. Sequence-directed mutations would depend upon the local DNA sequences and, hence, would not be random at the DNA level.
|
28
|
Blaisdell BE. A method of estimating from two aligned present-day DNA sequences their ancestral composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site. J Mol Evol 1985; 22:69-81. [PMID: 3932665 DOI: 10.1007/bf02105807] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
The course of evolutionary change in DNA sequences has been modeled as a Markov process. The Markov process was represented by discrete time matrix methods. The parameters of the Markov transition matrices were estimated by least-squares direct-search optimization of the fit of the calculated divergence matrix to that observed for two aligned sequences. The Markov process corrected for multiple and parallel substitutions of bases at the same site. The method avoided the incorrect assumption of all previously described methods that the divergence between two present-day sequences is twice the divergence of either from the common and unknown ancestral sequence. The three previous methods were shown to be equivalent. The present method also avoided the undesirable assumptions that sequence composition has not changed with time and that the substitution rates in the two descendant lineages were the same. It permitted simultaneous estimation of ancestral sequence composition and, if applicable, of different substitution rates for the two descendant lineages, provided the total number of estimated parameters was less than 16. Properties of the Markov chain were discussed. It was proved for symmetric substitution matrices that all elements of the equilibrium divergence matrix equal 1/16, and that the total difference in the divergence matrix at epoch k equals the total change in the common substitution matrix at epoch 2k for all values of k. It was shown how to resolve an ambiguity in the assignment of two different substitution rates to the two descendant lineages when four or more similar sequences are available. The method was applied to the divergence matrix for codon site 3 for the mouse and rabbit beta-globins. This observed divergence matrix was significantly asymmetric and required at least two different substitution rates. This result could be achieved only by using different asymmetric substitution matrices for the two lineages.
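The equilibrium property stated in this abstract (all elements of the divergence matrix tend to 1/16 for symmetric substitution matrices) can be checked numerically. The sketch below is not the paper's estimation procedure; it only iterates a discrete-time Markov chain. The substitution matrix `P`, the ancestral composition, and the helper names are hypothetical values chosen for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(m, k):
    """Raise a square matrix to the non-negative integer power k."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, m)
    return result


def divergence_matrix(ancestral, p_left, p_right, k):
    """Joint frequency of base i in one descendant and base j in the other
    after k epochs: D[i][j] = sum_a ancestral[a] * L^k[a][i] * R^k[a][j]."""
    lk = matpow(p_left, k)
    rk = matpow(p_right, k)
    n = len(ancestral)
    return [[sum(ancestral[a] * lk[a][i] * rk[a][j] for a in range(n))
             for j in range(n)] for i in range(n)]


# Hypothetical symmetric per-epoch substitution matrix (rows sum to 1).
P = [[0.90, 0.06, 0.02, 0.02],
     [0.06, 0.90, 0.02, 0.02],
     [0.02, 0.02, 0.90, 0.06],
     [0.02, 0.02, 0.06, 0.90]]
ancestral = [0.4, 0.3, 0.2, 0.1]  # hypothetical, deliberately non-uniform

D = divergence_matrix(ancestral, P, P, 500)
max_dev = max(abs(x - 1.0 / 16.0) for row in D for x in row)
print(f"largest deviation from 1/16 after 500 epochs: {max_dev:.2e}")
```

Passing two different matrices as `p_left` and `p_right` gives the unequal-lineage case the abstract mentions, in which the divergence matrix need not be symmetric at intermediate epochs.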
|
29
|
Frömmel C, Holzhütter HG. An estimate on the effect of point mutation and natural selection on the rate of amino acid replacement in proteins. J Mol Evol 1985; 21:233-57. [PMID: 6443130 DOI: 10.1007/bf02102357] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
We outline a method for estimating quantitatively the influence of point mutations and selection on the frequencies of codons and amino acids. We show how the mutation rate, i.e., the rate of amino acid replacement due to point mutation, can be affected by the codon usage as well as by the rates of the involved base exchanges. A comparison of the mutation rates calculated from reliable values of codon usage and base exchange probabilities with those that would be expected on the basis of chance reveals a notable suppression of replacements leading to tryptophan, glutamate, lysine, and methionine, and particularly of those leading to the termination codons. If selection constraints are neglected and only mutations are taken into account, the best agreement between expected and observed frequencies of both codons and amino acids is obtained for alpha = 1.13-1.15, where (Formula: see text). The "selection values" of codons and amino acids derived by our method show a pattern that partially deviates from others in the literature. For example, the selection pressure on methionine and cysteine turns out to be much more pronounced than expected if only the discrepancies between their observed and expected occurrences in proteins are considered. To estimate to what extent randomly occurring amino acid replacements are accepted by selection, we constructed an "acceptability matrix" from the well-established matrix of accepted point mutations. On the basis of this matrix "acceptability values" of the amino acids can be defined that correlate with their selection values. We also examine the significance of mutations and selection of amino acids with respect to their physicochemical properties and functions in proteins. 
The conservatism of amino acid replacements with respect to certain properties such as polarity can be brought about by the mutational process alone, whereas the conservatism with respect to other relevant properties--among them all measures of bulkiness--obviously is the result of additional selectional constraints on the evolution of protein structures.
|
30
|
Fine LG, Badie-Dezfooly B, Lowe AG, Hamzeh A, Wells J, Salehmoghaddam S. Stimulation of Na+/H+ antiport is an early event in hypertrophy of renal proximal tubular cells. Proc Natl Acad Sci U S A 1985; 82:1736-40. [PMID: 3885217 PMCID: PMC397347 DOI: 10.1073/pnas.82.6.1736] [Citation(s) in RCA: 74] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
Renal hypertrophy in vivo is achieved by an increase in protein content per cell and an increase in cell size with minimal hyperplasia. Hypertrophied renal tubular cells remain quiescent and demonstrate an increase in transcellular transport rates. This situation was simulated in vitro by exposing a confluent, quiescent primary culture of rabbit renal proximal tubular cells to either insulin, prostaglandin E1, or hypertonic NaCl for 24 or 48 hr. Protein per cell increased by 20-30% with little or no increase in [3H]thymidine incorporation into DNA. Mean cell volume was also increased in insulin- and hypertonic NaCl-treated but not in prostaglandin E1-treated cells. The lag period required to initiate DNA synthesis by a combination of insulin and hydrocortisone was the same in control and hypertrophied cells, indicating a quiescent state of the latter. Two hours of exposure to the growth stimuli increased amiloride-sensitive Na+ uptake, Na-dependent H+ efflux, and ouabain-sensitive Rb+ uptake, indicating that stimulation of Na+/H+ antiport (exchange) occurs as an early event in their action. Hypertrophied cells continued to demonstrate enhanced Na+/H+ antiport after the growth stimuli were removed for 3 hr, by which time their acute effects are reversed.
|
31
|
Wu CI, Li WH. Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci U S A 1985; 82:1741-5. [PMID: 3856856 PMCID: PMC397348 DOI: 10.1073/pnas.82.6.1741] [Citation(s) in RCA: 621] [Impact Index Per Article: 15.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023] Open
Abstract
When the coding regions of 11 genes from rodents (mouse or rat) and man are compared with those from another mammalian species (usually bovine), it is found that rodents evolve significantly faster than man. The ratio of the number of nucleotide substitutions in the rodent lineage to that in the human lineage since their divergence is 2.0 for synonymous substitutions and 1.3 for nonsynonymous substitutions. Rodents also evolve faster in the 5' and 3' untranslated regions of five different mRNAs; the ratios are 2.6 and 3.1, respectively. The numbers of nucleotide substitutions between members of the beta-globin gene family that were duplicated before the man-mouse split are also higher in mouse than in man. The difference is, again, greater for synonymous substitutions than for nonsynonymous substitutions. This tendency is more consistent with the neutralist view of molecular evolution than with the selectionist view. A simple explanation for the higher rates in rodents is that rodents have shorter generation times and, thus, higher mutation rates. The implication of our findings for the study of molecular phylogeny is discussed.
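A lineage-specific comparison of this kind can be sketched with the usual three-point decomposition: pairwise distances among two ingroup species and an outgroup are split into the amounts of change on each ingroup branch. The abstract does not spell out its formulas, so this is an assumed, simplified version. The distances below are made-up illustrative numbers, not values from the paper.

```python
def lineage_lengths(k_ab, k_ac, k_bc):
    """Split pairwise distances into branch lengths for lineages A and B.

    A and B are the ingroup species, C is the outgroup. With additive
    distances, the branch leading to A since the A/B split is
    (K_AB + K_AC - K_BC) / 2, and symmetrically for B.
    """
    k_a = (k_ab + k_ac - k_bc) / 2.0
    k_b = (k_ab + k_bc - k_ac) / 2.0
    return k_a, k_b


def rate_ratio(k_ab, k_ac, k_bc):
    """Ratio of substitutions on lineage A to those on lineage B."""
    k_a, k_b = lineage_lengths(k_ab, k_ac, k_bc)
    if k_b <= 0:
        raise ValueError("lineage B has no estimated change; ratio undefined")
    return k_a / k_b


# Hypothetical distances (substitutions per site):
# A = rodent, B = human, C = outgroup.
k_rodent, k_human = lineage_lengths(k_ab=0.45, k_ac=0.55, k_bc=0.40)
ratio = rate_ratio(0.45, 0.55, 0.40)
print(f"rodent branch {k_rodent:.2f}, human branch {k_human:.2f}, ratio {ratio:.1f}")
```

Because both ingroup lineages have had the same amount of time since their divergence, a ratio different from 1 points to unequal rates without requiring an absolute divergence date.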
|
32
|
Abstract
The hypothesis that DNA strands complementary to the coding strand contain in phase coding sequences has been investigated. Statistical analysis of the 50 genes of bacteriophage T7 shows no significant correlation between patterns of codon usage on the coding and non-coding strands. In Bacillus and yeast genes the correlation observed is not different from that expected with random synonymous codon usage, while a high correlation seen in 52 E. coli genes can be explained in terms of an excess of RNY codons. A deficiency of UUA, CUA and UCA codons (complementary to termination) seems to be restricted to the E. coli genes, and may be due to low abundance of the relevant cognate tRNA species. Thus the analysis shows that the non-coding strand has the properties expected of a sequence complementary to a coding strand, with no indications that it encodes, or may have encoded, proteins.
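Two combinatorial facts behind this abstract can be verified directly: the in-phase complement of an RNY codon is again RNY, and UUA, CUA and UCA are the codons complementary to the three termination codons. The sketch below assumes the standard genetic code's stop codons and reads the opposite strand 5' to 3'; the helper names are illustrative.

```python
from itertools import product

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
PURINES = "AG"
PYRIMIDINES = "CU"
STOP_CODONS = ("UAA", "UAG", "UGA")  # standard genetic code


def reverse_complement(codon):
    """Codon read 5'->3' on the opposite strand, in the same phase."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def is_rny(codon):
    """True for purine / any base / pyrimidine codons."""
    return codon[0] in PURINES and codon[2] in PYRIMIDINES


all_codons = ["".join(c) for c in product("ACGU", repeat=3)]
rny_codons = [c for c in all_codons if is_rny(c)]

# Every RNY codon pairs with an RNY codon on the other strand.
rny_closed = all(is_rny(reverse_complement(c)) for c in rny_codons)

# Codons that sit opposite the termination codons.
stop_complements = sorted(reverse_complement(s) for s in STOP_CODONS)

print(len(rny_codons), rny_closed, stop_complements)
```

The first fact is why an excess of RNY codons on the coding strand produces an apparent codon-usage correlation on the non-coding strand even when that strand encodes nothing.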
|
33
|
Lipman DJ, Wilbur WJ. Interaction of silent and replacement changes in eukaryotic coding sequences. J Mol Evol 1985; 21:161-7. [PMID: 6442990 DOI: 10.1007/bf02100090] [Citation(s) in RCA: 30] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
We examined the codon usages in well-conserved and less-well-conserved regions of vertebrate protein genes and found them to be similar. Despite this similarity, there is a statistically significant decrease in codon bias in the less-well-conserved regions. Our analysis suggests that although those codon changes initially fixed under amino acid replacements tend to follow the overall codon usage pattern, they also reduce the bias in codon usage. This decrease in codon bias leads one to predict that the rate of change of synonymous codons should be greater in those regions that are less well conserved at the amino acid level than in the better-conserved regions. Our analysis supports this prediction. Furthermore, we demonstrate a significantly elevated rate of change of synonymous codons among the adjacent codons 5' to amino acid replacement positions. This provides further support for the idea that there are contextual constraints on the choice of synonymous codons in eukaryotes.
|
34
|
Sharp PM, Rogers MS, McConnell DJ. Selection pressures on codon usage in the complete genome of bacteriophage T7. J Mol Evol 1985; 21:150-60. [PMID: 6100189 DOI: 10.1007/bf02100089] [Citation(s) in RCA: 49] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
Abstract
We searched the complete 39,936 base DNA sequence of bacteriophage T7 for nonrandomness that might be attributed to natural selection. Codon usage in the 50 genes of T7 is nonrandom, both over the whole code and among groups of synonymous codons. There is a great excess of purine-any base-pyrimidine (RNY) codons. Codon usage varies between genes, but from the pooled data for the whole genome (12,145 codons) certain putative selective constraints can be identified. Codon usage appears to be influenced by host tRNA abundance (particularly in highly expressed genes), tRNA-mRNA interactions (one such interaction being perhaps responsible for maintaining the excess of RNY codons) and a lack of short palindromes. This last constraint is probably due to selection against host restriction enzyme recognition sites; this is the first report of an effect of this kind on codon usage. Selection against susceptibility to mutational damage does not appear to have been involved.
|
35
|
Beverley SM, Wilson AC. Molecular evolution in Drosophila and the higher Diptera II. A time scale for fly evolution. J Mol Evol 1984; 21:1-13. [PMID: 6442354 DOI: 10.1007/bf02100622] [Citation(s) in RCA: 299] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/20/2023]
Abstract
In this paper, we examine first the steadiness of the rate of evolutionary change in a larval hemolymph protein, LHP, in numerous Drosophila species. We estimated amino acid sequence divergence from immunological distances measured with the quantitative microcomplement fixation technique. Using tests not depending on knowledge of absolute times of divergence, we estimated the variance of the rate of evolutionary change to be at least 4 times as large as that for a process resembling radioactive decay. Thus, the rate of evolution of this protein is as uniform as that of vertebrate proteins. Our analysis indicates no acceleration of protein evolution in the lineages leading to Hawaiian drosophilines. Second, we give an explicit description of a procedure for calculating the absolute value of the mean rate of evolutionary change in this protein. This procedure is suggested for general use in calculating absolute rates of molecular evolution. The mean rate of evolution of LHP is about 1.2 immunological distance units per million years, which probably corresponds to a unit evolutionary period of 4 million years; LHP thus evolves at a rate comparable to that of mammalian hemoglobins. Finally, we utilize the calibrated rate of LHP evolution to derive a time scale of evolution in the Drosophilidae and higher Diptera.
|
36
|
Abstract
We have analyzed the sequences of soybean leghemoglobin genes as an initial step toward understanding their mode of evolution. Alignment of the sequences of plant globin genes with those of animals reveals two things. First, based on the proportion of nucleotide substitutions that have occurred at the first, second, and third codon positions, the time of divergence of plant and animal globin gene families appears to be extremely remote (between 900 million and 1.4 billion years ago, if one assumes constancy of evolutionary rate in both the plant and animal lineages). Second, in addition to the normal regulatory sequences on the 5' end, an approximately 30-base-pair sequence, specific to globin genes, that surrounds the cap site is conserved between the plant and animal globin genes. Comparison of the leghemoglobin sequences with one another shows that the relative amount of sequence divergence in various coding and noncoding regions is roughly similar to that found for animal globin genes and that, as in animal globin genes, the positions of insertions and deletions in the intervening sequences often coincide with the locations of direct repeats. Thus, the mode of evolution of the plant globin genes appears to resemble, in many ways, that of their animal counterparts. We contrast the overall intergenic organization of the plant globin genes with that of animal genes, and discuss the possibility of the concerted evolution of the leghemoglobin genes.
|
37
|
Santangelo F, Montecucchi PC, Gozzini L, Henschen A. Solid-phase synthesis of sauvagine-(17-40). Int J Pept Protein Res 1983; 22:348-54. [PMID: 6605317 DOI: 10.1111/j.1399-3011.1983.tb02101.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]
Abstract
The solid-phase synthesis of the tetracosapeptide corresponding to the C-terminal amino acid sequence of sauvagine is described. After purification by gel filtration, the polypeptide appeared to possess an acceptable degree of homogeneity, as judged by different kinds of electrophoresis and chromatography, and by automated Edman degradation analysis. Preliminary pharmacological results indicate that the fragment-(17-40) is practically devoid of any sauvagine activity on the circulatory system and endocrine glands; a weak effect on gastric emptying delay has been demonstrated (1% of the natural product).
|
38
|
Evolutionary relationships of vertebrate lactate dehydrogenase isozymes A4 (muscle), B4 (heart), and C4 (testis). J Biol Chem 1983. [DOI: 10.1016/s0021-9258(18)32327-5] [Citation(s) in RCA: 97] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
|
39
|
Smith TF, Waterman MS, Sadler JR. Statistical characterization of nucleic acid sequence functional domains. Nucleic Acids Res 1983; 11:2205-20. [PMID: 6835847 PMCID: PMC325873 DOI: 10.1093/nar/11.7.2205] [Citation(s) in RCA: 76] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023] Open
Abstract
It has long been recognized that various genome classes were distinguishable on the basis of base composition and nearest neighbor frequencies. In addition Grantham et al. (8) have recently presented evidence that these distinctions are preserved at the level of codon usage. As discussed in this report it is now clear that these and related statistics can uniquely characterize the various functional domains of the genome. In particular peptide coding, intervening segments, structural RNA coding and mitochondrial domains of the vertebrate genome are uniquely characterizable. The statistical measures not only reflect understood functional differences among these domains but suggest others. The ability of these simple statistics of nucleic acid sequences to reflect so much of the encoded complex pattern information and/or effects of selective constraints is somewhat surprising. Here, we investigated the statistical measures most distinctive of the various domains and then linked them to our current understandings in so far as possible.
|
40
|
Hanukoglu I, Tanese N, Fuchs E. Complementary DNA sequence of a human cytoplasmic actin. Interspecies divergence of 3' non-coding regions. J Mol Biol 1983; 163:673-8. [PMID: 6842590 DOI: 10.1016/0022-2836(83)90117-1] [Citation(s) in RCA: 99] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Abstract
We have isolated and sequenced a cloned complementary DNA insert complementary to the messenger RNA of a cytoplasmic actin expressed in human epidermal cells. This provides the first cytoplasmic actin complementary DNA sequence for a vertebrate organism. The actin amino acid sequence predicted from this complementary DNA is identical to that of a bovine cytoplasmic actin and shows 98 and 85% homology with a Dictyostelium and a yeast actin, respectively. The complementary DNA sequence indicates that the 3' end of the mRNA contains an unusually long (greater than 400 nucleotides) 3' non-translated region. A comparison of this 3' non-coding region with those of recently determined actin complementary DNA sequences from other species reveals little or no homology among these sequences. Thus, these results indicate that although the actin amino acid sequences are extremely conserved, the non-coding regions of the mRNAs diverge rapidly.
|
41
|
Jones CW, Kafatos FC. Accepted mutations in a gene family: evolutionary diversification of duplicated DNA. J Mol Evol 1982; 19:87-103. [PMID: 7161811 DOI: 10.1007/bf02100227] [Citation(s) in RCA: 61] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
We report and compare the DNA sequences of 14 silkmoth (Antheraea polyphemus) chorion genes, derived from either cDNA or chromosomal DNA clones. Seven of these genes are members of the A multigene family, and seven are members of the B family. Where available, the previously reported (Jones and Kafatos 1980) intronic and extragenic flanking DNA sequences are also considered. Closely related sequences are compared, revealing the types of spontaneous mutations that were fixed during paralogous evolution. Segmental mutations (i.e. mutations other than substitutions) are nearly always interpretable as small duplications or deletions, related to small direct repeats. Segmental mutations are strongly constrained in the coding regions, although they do occur. Nucleotide substitutions also appear to be under selective constraints: relatively few substitutions leading to amino acid replacements are accepted, silent substitutions leading to some codons (especially purine-terminated ones) are disfavored, and different compositional biases are maintained in different parts of the sequences. Other sequence differences can be interpreted as indicative of neutral drift, including most differences in non-coding regions and most T/C transitions in third-base positions. In the non-coding regions, which are thought to be only loosely constrained by selection, transitions are observed more frequently than might be expected: they account for 52% of all substitutions, and they appear to be favored two to threefold over transversions when allowance is made for the skewed base composition of these regions.
|
42
|
Golding GB, Strobeck C. Expected frequencies of codon use as a function of mutation rates and codon fitnesses. J Mol Evol 1982; 18:379-86. [PMID: 7175955 DOI: 10.1007/bf01840886] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
A method is shown to determine the expected pattern of codon use for any given set of mutation rates between nucleotides and any set of fitnesses for the codons. If it is assumed that mutations to stop codons are lethal, then those codons which can mutate in one step to a stop codon tend to be used less frequently. This tendency is, however, a very small one and is not likely to be observable within a single gene. Nor is it necessarily a general tendency. For example, the leucine pretermination codons may be used preferentially when mutations to proline are deleterious. It is shown that different mutation rates (e.g., transitions occurring more frequently than transversions) may have as large an effect on codon usage as would strong selection for particular codons. For the model presented, an increase in the rate of transitions strongly decreases the expected frequency of UGG and CRR codons. Other codons are moderately affected by such a change in the mutation rates. Many other models can be examined using this method.
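The set of codons that can mutate to a stop codon in one step can be enumerated directly. The sketch below is not the paper's model of expected codon frequencies; it only lists the pretermination codons, assuming the standard genetic code's three stop codons. The function names are illustrative.

```python
from itertools import product

BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code


def one_step_neighbors(codon):
    """Yield every codon reachable by a single base substitution."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def is_pretermination(codon):
    """True for sense codons with at least one stop codon as neighbor."""
    return codon not in STOP_CODONS and any(
        neighbor in STOP_CODONS for neighbor in one_step_neighbors(codon)
    )


all_codons = ["".join(c) for c in product(BASES, repeat=3)]
pretermination = sorted(c for c in all_codons if is_pretermination(c))

print(len(pretermination), pretermination)
```

The list includes the two leucine codons UUA and UUG referred to in the abstract, as well as UGG.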
|
43
|
Staden R. Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing. Nucleic Acids Res 1982; 10:4731-51. [PMID: 7133997 PMCID: PMC321125 DOI: 10.1093/nar/10.15.4731] [Citation(s) in RCA: 590] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023] Open
Abstract
This paper describes a computer method for handling gel reading data produced by the shotgun method of DNA sequencing. The method greatly reduces the time the sequencer needs to spend checking and editing his data and yet it produces a consensus sequence for which the accuracy of determination of every base can be clearly shown. The program can take a batch of new gel readings, screen them against vector sequences removing any that match, and then compare and align all the sequences to produce a final consensus. No information is lost in this process as alignments are achieved by making only insertions and because all the individual gel readings are added to a database from which they can be retrieved and displayed lined up one above the other. This allows the user to check on the alignments achieved by the program and if necessary change them. As each gel reading is added to the database the consensus is automatically updated accordingly and used for the next comparisons. This is a much faster process than comparing each new gel against every individual gel in the database.
|
44
|
Brown WM, Prager EM, Wang A, Wilson AC. Mitochondrial DNA sequences of primates: tempo and mode of evolution. J Mol Evol 1982; 18:225-39. [PMID: 6284948 DOI: 10.1007/bf01734101] [Citation(s) in RCA: 812] [Impact Index Per Article: 19.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]
Abstract
We cloned and sequenced a segment of mitochondrial DNA from human, chimpanzee, gorilla, orangutan, and gibbon. This segment is 896 bp in length, contains the genes for three transfer RNAs and parts of two proteins, and is homologous in all 5 primates. The 5 sequences differ from one another by base substitutions at 283 positions and by a deletion of one base pair. The sequence differences range from 9 to 19% among species, in agreement with estimates from cleavage map comparisons, thus confirming that the rate of mtDNA evolution in primates is 5 to 10 times higher than in nuclear DNA. The most striking new finding to emerge from these comparisons is that transitions greatly outnumber transversions. Ninety-two percent of the differences among the most closely related species (human, chimpanzee, and gorilla) are transitions. For pairs of species with longer divergence times, the observed percentage of transitions falls until, in the case of comparisons between primates and non-primates, it reaches a value of 45. The time dependence is probably due to obliteration of the record of transitions by multiple substitutions at the same nucleotide site. This finding illustrates the importance of choosing closely related species for analysis of evolutionary process. The remarkable bias toward transitions in mtDNA evolution necessitates the revision of equations that correct for multiple substitutions at the same site. With revised equations, we calculated the incidence of silent and replacement substitutions in the two protein-coding genes. The silent substitution rate is 4 to 6 times higher than the replacement rate, indicating strong functional constraints at replacement sites. Moreover, the silent rate for these two genes is about 10% per million years, a value 10 times higher than the silent rate for the nuclear genes studied so far. In addition, the mean substitution rate in the three mitochondrial tRNA genes is at least 100 times higher than in nuclear tRNA genes. 
Finally, genealogical analysis of the sequence differences supports the view that the human lineage branched off only slightly before the gorilla and chimpanzee lineages diverged and strengthens the hypothesis that humans are more related to gorillas and chimpanzees than is the orangutan.
|
45
|
Kaplan N, Risko K. A method for estimating rates of nucleotide substitution using DNA sequence data. Theor Popul Biol 1982; 21:318-28. [PMID: 7123502 DOI: 10.1016/0040-5809(82)90021-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.2] [Indexed: 01/23/2023]
|
46
|
Brown GG, Simpson MV. Novel features of animal mtDNA evolution as shown by sequences of two rat cytochrome oxidase subunit II genes. Proc Natl Acad Sci U S A 1982; 79:3246-50. [PMID: 6285344 PMCID: PMC346392 DOI: 10.1073/pnas.79.10.3246] [Citation(s) in RCA: 102] [Impact Index Per Article: 2.4] [Indexed: 01/19/2023]
Abstract
The sequence of the region of the mitochondrial genome that encodes cytochrome oxidase subunit II (COII) has been determined for each of two closely related rat species, Rattus norvegicus and R. rattus. Comparison of the two sequences shows that 94.4% of the nucleotide substitutions are silent. The occurrence of this high proportion of silent substitutions leads us to propose that the rapid evolution of mtDNA relative to nuclear DNA is due only to silent changes and that amino acid-altering substitutions accumulate in nuclear and mtDNA at comparable rates. Other novel features of the nucleotide substitution pattern in the rat COII gene are a high transition/transversion ratio (8.0:1) and a strong bias toward C ↔ T transitions in the light strand. Comparison of the R. norvegicus COII sequence with the bovine and human sequences shows that there may be selective constraints on some silent positions within the gene and that its rate of evolution may be different in different mammalian lineages.
|
47
|
Smithies O, Engels WR, Devereux JR, Slightom JL, Shen S. Base substitutions, length differences and DNA strand asymmetries in the human G gamma and A gamma fetal globin gene region. Cell 1981; 26:345-53. [PMID: 6173131 DOI: 10.1016/0092-8674(81)90203-8] [Citation(s) in RCA: 60] [Impact Index Per Article: 1.4] [Indexed: 01/18/2023]
Abstract
We have studied differences arising subsequent to the 5 kilobase pair (kb) duplication that led to the human G gamma and A gamma fetal globin genes. The local occurrence of base substitutions in the duplicated 5 kb region correlates positively with the local AT base pair content. This correlation also occurs in two mouse beta-globin genes and in two mouse immunoglobulin genes. The relationship is valid for transcribed or nontranscribed DNA and for DNA that contains only coding sequences. Length differences in the fetal globin duplicated regions correlate positively with the occurrence of short direct repeats of 5 or more base pairs. Path analysis of the interrelationships of base composition, base substitutions, repeats and length differences provides an integrated view of the relative effects on chromosomal changes of these variables and of selection. The distributions along the chromosome of simple sequences and of base compositions show highly significant local asymmetries between the transcribed and nontranscribed strands of the DNA, which permit us to divide the fetal globin gene region into chromosomal domains. Comparable domains are present in DNA from other sources, including the mammalian viruses SV40 and polyoma virus strain A-2 in which some of the domains appear related to discrete functions.
|
48
|
Nei M, Tateno Y. Statistical properties of the Jukes-Holmquist method of estimating the number of nucleotide substitutions: reply to Holmquist and Conroy's criticism. J Mol Evol 1981; 17:182-7. [PMID: 6167733 DOI: 10.1007/bf01733912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Conducting computer simulations, Nei and Tateno (1978) have shown that Jukes and Holmquist's (1972) method of estimating the number of nucleotide substitutions tends to give an overestimate and the estimate obtained has a large variance. Holmquist and Conroy (1980) repeated some parts of our simulation and claim that the overestimation of nucleotide substitutions in our paper occurred mainly because we used selected data. Examination of Holmquist and Conroy's simulation indicates that their results are essentially the same as ours when the Jukes-Holmquist method is used, but since they used a different method of computation their estimates of nucleotide substitutions differed substantially from ours. Another problem in Holmquist and Conroy's Letter is that they confused the expected number of nucleotide substitutions with the number in a sample. This confusion has resulted in a number of unnecessary arguments. They also criticized our X2 measure, but this criticism is apparently due to a misunderstanding of the assumptions of our method and a failure to use our method in the way we described. We believe that our earlier conclusions remain unchanged.
|
49
|
Kimura M. Was globin evolution very rapid in its early stages?: a dubious case against the rate-constancy hypothesis. J Mol Evol 1981; 17:110-3. [PMID: 7253035 DOI: 10.1007/bf01732682] [Citation(s) in RCA: 34] [Impact Index Per Article: 0.8] [Indexed: 01/24/2023]
|
50
|
Abstract
The recent evaluation by Fitch (1980) of REH theory for macromolecular divergence is a severely erroneous and distorted analysis of our work over the past decade. We reply to those distortions here. At present, there is no factual basis for believing Fitch's assessment that corrections which move evolutionary estimates of total mutations fixed closer to the true distance must do so at the expense of an increased variance sufficient to compromise the value of the improvement. By direct calculation the variance in the estimates of total mutations fixed given by REH theory is comparable to that of other models now in the literature for the case in which genetic events are equiprobable. A general argument is given that suggests that, as we consider more and more carefully the selective, functional, and structural constraints on the evolution of genes and proteins, this variance may be expected to decrease toward a lower bound.
|