151
Gojobori T, Nei M. Relative contributions of germline gene variation and somatic mutation to immunoglobulin diversity in the mouse. Mol Biol Evol 1986; 3:156-67. [PMID: 3444398 DOI: 10.1093/oxfordjournals.molbev.a040387] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.1] [Indexed: 01/05/2023]
Abstract
The relative contributions of germline gene variation and somatic mutation to immunoglobulin diversity were studied by comparing germline gene sequences with their rearranged counterparts for the mouse VH, V kappa, and V lambda genes. The mutation rate at the amino acid level was estimated to be 7.0% in the first and second complementarity-determining regions (CDRs) and 2.0% in the framework regions (FRs). The difference in the mutation rate at the nucleotide level between the CDRs and FRs was of the same order of magnitude as that for the amino acid level. Analysis of amino acid diversity or nucleotide diversity indicated that the contribution of somatic mutation to immunoglobulin diversity is approximately 5%. However, the contribution of somatic mutation to the number of different amino acid sequences of immunoglobulins is much larger than that estimated by the analysis of amino acid diversity, and more than 90% of the different immunoglobulins seem to be generated by somatic mutation. Examination of the pattern of nucleotide substitution has suggested that clonal selection after somatic mutation may not be as strong as generally believed.
152
Abstract
The nucleotide sequences of four genes of the influenza A virus (nonstructural protein, matrix protein, and a few subtypes of hemagglutinin and neuraminidase) are compiled for a large number of strains isolated from various locations and years, and the evolutionary relationship of the sequences is investigated. It is shown that all of these genes or subtypes are highly polymorphic and that the polymorphic sequences (alleles) are subject to rapid turnover in the population, their average age being much less than that of higher organisms. Phylogenetic analysis suggests that most polymorphic sequences within a subtype or a gene appeared during the last 80 years and that the divergence among the subtypes of hemagglutinin genes might have occurred during the last 300 years. The high degree of polymorphism in this RNA virus is caused by an extremely high rate of mutation, estimated to be 0.01/nucleotide site/year. Despite the high rate of mutation, most influenza virus genes are apparently subject to purifying selection, and the rate of nucleotide substitution is substantially lower than the mutation rate. There is considerable variation in the substitution rate among different genes, and the rate seems to be lower in nonhuman viral strains than in human strains. The difference might be responsible for the so-called freezing effect in some viral strains.
153
Stephens JC, Nei M. Phylogenetic analysis of polymorphic DNA sequences at the Adh locus in Drosophila melanogaster and its sibling species. J Mol Evol 1985; 22:289-300. [PMID: 3003368 DOI: 10.1007/bf02115684] [Citation(s) in RCA: 37] [Impact Index Per Article: 0.9] [Indexed: 01/03/2023]
Abstract
Recent sequencing of over 2300 nucleotides containing the alcohol dehydrogenase (Adh) locus in each of 11 Drosophila melanogaster lines makes it possible to estimate the approximate age of the electrophoretic "fast-slow" polymorphism. Our estimates, based on various possible patterns of evolution, range from 610,000 to 3,500,000 years, with 1,000,000 years as a reasonable point estimate. Furthermore, comparison of these sequences with those of the homologous region of D. simulans and D. mauritiana allows us to infer the pattern of evolutionary change of the D. melanogaster sequences. The integrity of the Adh-f electrophoretic alleles as a single lineage is supported by both unweighted pair-group method (UPGMA) and parsimony analyses. However, considerable divergence among the Adh-s lines seems to have preceded the origin of the Adh-f allele. Comparisons of the sequences of D. melanogaster genes with those of D. simulans and D. mauritiana genes suggest that the split between the latter two species occurred more recently than the divergence of some of the present-day Adh-s genes in D. melanogaster. The phylogenetic analyses of the D. melanogaster sequences show that the fast-slow distinction is not perfect, and suggest that intragenic recombination or gene conversion occurred in the evolution of this locus. We extended conventional phylogenetic analyses by using a statistical technique for detecting and characterizing recombination events. We show that the pattern of differentiation of DNA sequences in D. melanogaster is roughly compatible with the neutral theory of molecular evolution.
154
Abstract
A mathematical theory is developed for computing the probability that m genes sampled from one population (species) and n genes sampled from another are derived from l genes that existed at the time of population splitting. The expected time of divergence between the two most closely related genes sampled from two different populations and the time of divergence (coalescence) of all genes sampled are studied by using this theory. It is shown that the time of divergence between the two most closely related genes can be used as an approximate estimate of the time of population splitting (T) only when T ≡ t/(2N) is small, where t and N are the number of generations and the effective population size, respectively. The variance of Nei and Li's estimate (d) of the number of net nucleotide differences between two populations is also studied. It is shown that the standard error (Sd) of d is larger than the mean when T is small (T ≪ 1). In this case, Sd is reduced considerably by increasing sample size. When T is large (T > 1), however, a large proportion of the variance of d is caused by stochastic factors, and increase in the sample size does not help to reduce Sd. To reduce the stochastic variance of d, one must use data from many independent unlinked gene loci.
155
Nei M, Tajima F. Evolutionary change of restriction cleavage sites and phylogenetic inference for man and apes. Mol Biol Evol 1985; 2:189-205. [PMID: 2835574 DOI: 10.1093/oxfordjournals.molbev.a040345] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.2] [Indexed: 01/02/2023]
Abstract
A mathematical theory for the evolutionary change of restriction endonuclease cleavage sites is developed, and the probabilities of various types of restriction-site changes are evaluated. A computer simulation is also conducted to study properties of the evolutionary change of restriction sites. These studies indicate that parsimony methods of constructing phylogenetic trees often make erroneous inferences about evolutionary changes of restriction sites unless the number of nucleotide substitutions per site is less than 0.01 for all branches of the tree. This introduces a systematic error in estimating the number of mutational changes for each branch and, consequently, in constructing phylogenetic trees. Therefore, parsimony methods should be used only in cases where nucleotide sequences are closely related. Reexamination of Ferris et al.'s data on restriction-site differences of mitochondrial DNAs does not support Templeton's conclusions regarding the phylogenetic tree for man and apes and the molecular clock hypothesis. Templeton's claim that Nei and Li's method of estimating the number of nucleotide substitutions per site is seriously affected by parallel losses and loss-gains of restriction sites is also unsupported.
156
Nei M, Stephens JC, Saitou N. Methods for computing the standard errors of branching points in an evolutionary tree and their application to molecular data from humans and apes. Mol Biol Evol 1985; 2:66-85. [PMID: 2897060 DOI: 10.1093/oxfordjournals.molbev.a040333] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.7] [Indexed: 01/03/2023]
Abstract
Statistical methods for computing the standard errors of the branching points of an evolutionary tree are developed. These methods are for the unweighted pair-group method-determined (UPGMA) trees reconstructed from molecular data such as amino acid sequences, nucleotide sequences, restriction-sites data, and electrophoretic distances. They were applied to data for the human, chimpanzee, gorilla, orangutan, and gibbon species. Among the four different sets of data used, DNA sequences for an 895-nucleotide segment of mitochondrial DNA (Brown et al. 1982) gave the most reliable tree, whereas electrophoretic data (Bruce and Ayala 1979) gave the least reliable one. The DNA sequence data suggested that the chimpanzee is the closest and that the gorilla is the next closest to the human species. The orangutan and gibbon are more distantly related to man than is the gorilla. This topology of the tree is in agreement with that for the tree obtained from chromosomal studies and DNA-hybridization experiments. However, the difference between the branching point for the human and the chimpanzee species and that for the gorilla species and the human-chimpanzee group is not statistically significant. In addition to this analysis, various factors that affect the accuracy of an estimated tree are discussed.
157
Roychoudhury AK, Nei M. Genetic relationships between Indians and their neighboring populations. Hum Hered 1985; 35:201-6. [PMID: 4029959 DOI: 10.1159/000153545] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3] [Indexed: 01/08/2023]
Abstract
Using gene frequency data for 18 protein and blood group loci, we studied the genetic relationships of four Indian subcontinent populations (peoples from Punjab, Gujarati, Andhra Pradesh, and Bangladesh) with their neighboring populations (Iranians, Afghans, Sinhalese in Sri Lanka, Nepalese, Bhutanese, Malays, Bataks in northern Sumatra, and Chinese). The results obtained indicate that the four Indian subcontinent populations and the Sinhalese are genetically closer to Iranians and Afghans (Caucasoid) than to the other neighboring Mongoloid populations. Genetic distance analysis shows a clear-cut dichotomy between the Caucasoid and Mongoloid populations.
158
Abstract
A mathematical formula for estimating the average number of nucleotide substitutions per site (δ) between two homologous DNA sequences is developed by taking into account unequal rates of substitution among different nucleotide pairs. Although this formula is obtained for the equal-input model of nucleotide substitution, computer simulations have shown that it gives a reasonably good estimate for a wide range of nucleotide substitution patterns as long as δ is equal to or smaller than 1. Furthermore, the frequency of cases to which the formula is inapplicable is much lower than that for other similar methods recently proposed. This point is illustrated using insulin genes. A statistical method for estimating the number of nucleotide changes due to deletion and insertion is also developed. Application of this method to globin gene data indicates that the number of nucleotide changes per site increases with evolutionary time but the pattern of the increase is quite irregular.
159
Tajima F, Nei M. Note on genetic drift and estimation of effective population size. Genetics 1984; 106:569-74. [PMID: 6706114 PMCID: PMC1224257 DOI: 10.1093/genetics/106.3.569] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 01/21/2023]
160
Abstract
With the aim of understanding the concerted evolution of the immunoglobulin VH multigene family, a phylogenetic tree for the DNA sequences of 16 mouse and five human germ line genes was constructed. This tree indicates that all genes in this family have undergone substantial evolutionary divergence. The most closely related genes so far identified in the mouse genome seem to have diverged about 6 million years (MY) ago, whereas the most distantly related genes diverged about 300 MY ago. This suggests that gene duplication caused by unequal crossing-over or gene conversion occurs very slowly in this gene family. The rate of occurrence of gene duplication in the VH gene family has been estimated to be 5 × 10⁻⁷ per gene per year, which seems to be at least about 100 times lower than that for the rRNA gene family. This low rate of concerted evolution in the VH gene family helps retain intergenic genetic variability that in turn contributes to antibody diversity. Because of accumulation of destructive mutations, however, about one-third of the mouse and human VH genes seem to have become nonfunctional. Many of these pseudogenes have apparently originated recently, but some of them seem to have existed in the genome for more than 10 MY. The rate of nucleotide substitution for the complementarity-determining regions (CDRs) is as high as that of pseudogenes. This suggests that there is virtually no purifying selection operating in the CDRs and that germ line mutations are effectively used for generating antibody diversity.
161
Abstract
Valentin's criticism of Majumder and Nei's [1983] paper is apparently based on his misunderstanding of the latter authors' attitude and approach to paternity test problems.
162
Nei M, Tajima F. Maximum likelihood estimation of the number of nucleotide substitutions from restriction sites data. Genetics 1983; 105:207-17. [PMID: 6311668 PMCID: PMC1202146 DOI: 10.1093/genetics/105.1.207] [Citation(s) in RCA: 346] [Impact Index Per Article: 8.4] [Indexed: 01/19/2023]
Abstract
A simple method of the maximum likelihood estimation of the number of nucleotide substitutions is presented for the case where restriction sites data from many different restriction enzymes are available. An iteration method, based on nucleotide counting, is also developed. This method is simpler than the maximum likelihood method but gives the same estimate. A formula for computing the variance of a maximum likelihood estimate is also presented.
163
Abstract
Considering the multinomial sampling of genotypes, unbiased estimators of various gene diversity measures in subdivided populations are presented. Using these quantities, formulae for estimating Wright's fixation indices (FIS, FIT, and FST) from a finite sample are developed.
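As a rough illustration of the quantities named in this abstract, the sketch below computes Wright's fixation indices from three heterozygosity values: observed heterozygosity, mean expected heterozygosity within subpopulations, and expected heterozygosity in the total population. The function name, parameter names, and example numbers are hypothetical. The finite-sample bias corrections that the abstract describes are not reproduced here; only the textbook definitions are shown.

```python
def fixation_indices(h_obs, h_sub, h_tot):
    """Return (F_IS, F_IT, F_ST) from heterozygosities.

    h_obs: observed heterozygosity averaged over subpopulations
    h_sub: mean expected heterozygosity within subpopulations
    h_tot: expected heterozygosity in the total population
    No small-sample correction is applied (assumption of this sketch).
    """
    if h_sub <= 0 or h_tot <= 0:
        raise ValueError("h_sub and h_tot must be positive")
    f_is = 1.0 - h_obs / h_sub   # deficit of heterozygotes within subpopulations
    f_it = 1.0 - h_obs / h_tot   # deficit relative to the total population
    f_st = 1.0 - h_sub / h_tot   # differentiation among subpopulations
    return f_is, f_it, f_st


# Hypothetical input values, chosen only to make the arithmetic easy to follow.
f_is, f_it, f_st = fixation_indices(h_obs=0.30, h_sub=0.40, h_tot=0.50)
```

With these definitions the three indices satisfy (1 - F_IT) = (1 - F_IS)(1 - F_ST), which gives a quick consistency check on any set of estimates.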
164
Nei M, Tajima F, Tateno Y. Accuracy of estimated phylogenetic trees from molecular data. II. Gene frequency data. J Mol Evol 1983; 19:153-70. [PMID: 6571220 DOI: 10.1007/bf02300753] [Citation(s) in RCA: 1284] [Impact Index Per Article: 31.3] [Indexed: 01/20/2023]
Abstract
The accuracies and efficiencies of three different methods of making phylogenetic trees from gene frequency data were examined by using computer simulation. The methods examined are UPGMA, Farris' (1972) method, and Tateno et al.'s (1982) modified Farris method. In the computer simulation eight species (or populations) were assumed to evolve according to a given model tree, and the evolutionary changes of allele frequencies were followed by using the infinite-allele model. At the end of the simulated evolution five genetic distance measures (Nei's standard and minimum distances, Rogers' distance, Cavalli-Sforza's f_θ, and the modified Cavalli-Sforza distance) were computed for all pairs of species, and the distance matrix obtained for each distance measure was used for reconstructing a phylogenetic tree. The phylogenetic tree obtained was then compared with the model tree. The results obtained indicate that in all tree-making methods examined the accuracies of both the topology and branch lengths of a reconstructed tree (rooted tree) are very low when the number of loci used is less than 20 but gradually increase with increasing number of loci. When the expected number of gene substitutions (M) for the shortest branch is 0.1 or more per locus and 30 or more loci are used, the topological error as measured by the distortion index (dT) is not great, but the probability of obtaining the correct topology (P) is less than 0.5 even with 60 loci. When M is as small as 0.004, P is substantially lower. In obtaining a good topology (small dT and high P) UPGMA and the modified Farris method generally show a better performance than the Farris method. The poor performance of the Farris method is observed even when Rogers' distance which obeys the triangle inequality is used. The main reason for this seems to be that the Farris method often gives overestimates of branch lengths. For estimating the expected branch lengths of the true tree UPGMA shows the best performance.
For this purpose Nei's standard distance gives a better result than the others because of its linear relationship with the number of gene substitutions. Rogers' or Cavalli-Sforza's distance gives a phylogenetic tree in which the parts near the root are condensed and the other parts are elongated. It is recommended that more than 30 loci, including both polymorphic and monomorphic loci, be used for making phylogenetic trees. The conclusions from this study seem to apply also to data on nucleotide differences obtained by the restriction enzyme techniques.
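To make the distance measure mentioned above concrete, here is a minimal sketch of Nei's standard genetic distance, D = -ln(I), where I is the normalized identity of genes between two populations summed over loci. The function name and the toy allele frequencies are hypothetical, and no sample-size correction is applied.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Both lists must describe the same loci in the same order.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("populations must share a non-empty list of loci")
    j_x = j_y = j_xy = 0.0
    for locus_x, locus_y in zip(pop_x, pop_y):
        j_x += sum(p * p for p in locus_x.values())
        j_y += sum(q * q for q in locus_y.values())
        j_xy += sum(p * locus_y.get(allele, 0.0) for allele, p in locus_x.items())
    # Averaging over loci would divide all three sums by the same number of
    # loci, so that factor cancels in the identity ratio below.
    if j_xy == 0.0:
        return math.inf  # no shared alleles at any locus
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)


# Toy data (hypothetical): two loci, the populations differ only at the first.
pop_a = [{"A": 1.0}, {"B": 1.0}]
pop_b = [{"a": 1.0}, {"B": 1.0}]
d_ab = nei_standard_distance(pop_a, pop_b)
d_aa = nei_standard_distance(pop_a, pop_a)
```

In the toy data the two populations are fixed for different alleles at one of two loci, so the identity is 1/2 and the distance is ln 2; a population compared with itself gives zero.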
165
Abstract
Mathematical models are presented for the evolution of postmating and premating reproductive isolation. In the case of postmating isolation it is assumed that hybrid sterility or inviability is caused by incompatibility of alleles at one or two loci, and evolution of reproductive isolation occurs by random fixation of different incompatibility alleles in different populations. Mutations are assumed to occur following either the stepwise mutation model or the infinite-allele model. Computer simulations by using Itô's stochastic differential equations have shown that in the model used the reproductive isolation mechanism evolves faster in small populations than in large populations when the mutation rate remains the same. In populations of a given size it evolves faster when the number of loci involved is large than when this is small. In general, however, evolution of isolation mechanisms is a very slow process, and it would take thousands to millions of generations if the mutation rate is of the order of 10⁻⁵ per generation. Since gene substitution occurs as a stochastic process, the time required for the establishment of reproductive isolation has a large variance. Although the average time of evolution of isolation mechanisms is very long, substitution of incompatibility genes in a population occurs rather quickly once it starts. The intrapopulational fertility or viability is always very high. In the model of premating isolation it is assumed that mating preference or compatibility is determined by male- and female-limited characters, each of which is controlled by a single locus with multiple alleles, and mating occurs only when the male and female characters are compatible with each other. Computer simulations have shown that the dynamics of evolution of premating isolation mechanism is very similar to that of postmating isolation mechanism, and the mean and variance of the time required for establishment of premating isolation are very large.
Theoretical predictions obtained from the present study about the speed of evolution of reproductive isolation are consistent with empirical data available from vertebrate organisms.
166
Ryman N, Chakraborty R, Nei M. Differences in the relative distribution of human gene diversity between electrophoretic and red and white cell antigen loci. Hum Hered 1983; 33:93-102. [PMID: 6862458 DOI: 10.1159/000153357] [Citation(s) in RCA: 36] [Impact Index Per Article: 0.9] [Indexed: 01/22/2023]
Abstract
Gene frequency data for 25 loci (2 HLA loci, 9 blood group loci, and 14 electrophoretically detectable loci) were collected from the literature of 18 human populations from all over the world. The data were subjected to a hierarchical gene diversity analysis to provide an estimate of the relative distribution of genetic variation between and within populations and population groups for different types of loci. Two different ways of grouping the populations, i.e., according to anthropological criteria and to a cluster analysis based on gene frequency data, gave essentially the same results. For all loci combined approximately 86% of total gene diversity was found within populations, 3% was associated with differences between populations within groups, and 11% related to group differences. These results are very similar to those obtained in previous studies based on fewer loci and different sets of populations. The distribution of genetic variation is different for different types of loci. The HLA loci give a picture very similar to that of the electrophoretic loci while the blood group loci have a substantially larger fraction of the total gene diversity distributed between populations or population groups.
167
Majumder PP, Nei M. A note on positive identification of paternity by using genetic markers. Hum Hered 1983; 33:29-35. [PMID: 6573297 DOI: 10.1159/000153343] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 01/20/2023]
Abstract
To see the efficiency of the statistical methods for positive identification of fathers using genetic markers, the statistical properties of the paternity index are studied algebraically and numerically. It is found that the currently used statistical methods are not powerful enough to discriminate between true fathers and non-excluded non-fathers, and, more often than not, may lead to false attributions of paternity. It is, therefore, suggested that exclusion of paternity is the only conclusive evidence that can be accepted by courts of law until better methods are devised.
168
Takei K, Hagiwara H, Nakamura S, Nei M, Sugihara K. Diffusion of 36Cl atom in 36Cl labeled silver chloride crystal from inner lattice site to surface. RADIOISOTOPES 1982; 31:654-6. [PMID: 7170354 DOI: 10.3769/radioisotopes.31.12_654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2023]
169
Tateno Y, Nei M, Tajima F. Accuracy of estimated phylogenetic trees from molecular data. I. Distantly related species. J Mol Evol 1982; 18:387-404. [PMID: 7175956 DOI: 10.1007/bf01840887] [Citation(s) in RCA: 141] [Impact Index Per Article: 3.4] [Indexed: 01/23/2023]
Abstract
The accuracies and efficiencies of four different methods for constructing phylogenetic trees from molecular data were examined by using computer simulation. The methods examined are UPGMA, Fitch and Margoliash's (1967) (F/M) method, Farris' (1972) method, and the modified Farris method (Tateno, Nei, and Tajima, this paper). In the computer simulation, eight OTUs (32 OTUs in one case) were assumed to evolve according to a given model tree, and the evolutionary change of a sequence of 300 nucleotides was followed. The nucleotide substitution in this sequence was assumed to occur following the Poisson distribution, negative binomial distribution or a model of temporally varying rate. Estimates of nucleotide substitutions (genetic distances) were then computed for all pairs of the nucleotide sequences that were generated at the end of the evolution considered, and from these estimates a phylogenetic tree was reconstructed and compared with the true model tree. The results of this comparison indicate that when the coefficient of variation of branch length is large the Farris and modified Farris methods tend to be better than UPGMA and the F/M method for obtaining a good topology. For estimating the number of nucleotide substitutions for each branch of the tree, however, the modified Farris method shows a better performance than the Farris method. When the coefficient of variation of branch length is small, however, UPGMA shows the best performance among the four methods examined. Nevertheless, any tree-making method is likely to make errors in obtaining the correct topology with a high probability, unless all branch lengths of the true tree are sufficiently long. It is also shown that the agreement between patristic and observed genetic distances is not a good indicator of the goodness of the tree obtained.
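Of the tree-making methods compared in this abstract, UPGMA is the simplest to write down. The sketch below is a minimal, unoptimized version under stated assumptions: the input is a symmetric distance matrix, ties are broken by cluster index, and the tree is returned as nested tuples of the form (left, right, height). The function name, the OTU labels, and the example distances are hypothetical and do not come from the simulations described above.

```python
def upgma(names, matrix):
    """Cluster OTUs by UPGMA.

    names  : list of OTU labels
    matrix : symmetric list of lists of pairwise distances
    Returns nested tuples (left, right, height), where height is half the
    average distance between the two clusters joined at that node.
    """
    n = len(names)
    if n == 0:
        raise ValueError("at least one OTU is required")
    # cluster id -> (subtree, number of OTUs in the cluster)
    clusters = {i: (names[i], 1) for i in range(n)}
    # distances between active clusters, keyed by (smaller id, larger id)
    dist = {(i, j): float(matrix[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # closest pair of clusters; ties go to the smallest key
        (i, j), d_ij = min(dist.items(), key=lambda kv: (kv[1], kv[0]))
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)
        # The size-weighted mean of the two old distances equals the plain
        # average over all OTU pairs between the merged cluster and cluster k.
        new_dist = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            new_dist[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        # discard distances that involve either of the merged clusters
        dist = {key: v for key, v in dist.items() if i not in key and j not in key}
        dist.update(new_dist)
        clusters[next_id] = ((tree_i, tree_j, d_ij / 2.0), size_i + size_j)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree


# Hypothetical distances for four OTUs labelled A to D.
example_names = ["A", "B", "C", "D"]
example_matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
example_tree = upgma(example_names, example_matrix)
```

Because UPGMA places every tip at the same distance from the root, it implicitly assumes a constant rate of change along all branches, which is consistent with the abstract's finding that it does best when the coefficient of variation of branch length is small.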
170
Gojobori T, Ishii K, Nei M. Estimation of average number of nucleotide substitutions when the rate of substitution varies with nucleotide. J Mol Evol 1982; 18:414-23. [PMID: 7175958 DOI: 10.1007/bf01840889] [Citation(s) in RCA: 193] [Impact Index Per Article: 4.6] [Indexed: 02/07/2023]
Abstract
A formal mathematical analysis of Kimura's (1981) six-parameter model of nucleotide substitution for the case of unequal substitution rates among different pairs of nucleotides is conducted, and new formulae for estimating the number of nucleotide substitutions and its standard error are obtained. By using computer simulation, the validities and utilities of Jukes and Cantor's (1969) one-parameter formula, Takahata and Kimura's (1981) four-parameter formula, and our six-parameter formula for estimating the number of nucleotide substitutions are examined under three different schemes of nucleotide substitution. It is shown that the one-parameter and four-parameter formulae often give underestimates when the number of nucleotide substitutions is large, whereas the six-parameter formula generally gives a good estimate for all the three substitution schemes examined. However, when the number of nucleotide substitutions is large, the six-parameter and four-parameter formulae are often inapplicable unless the number of nucleotides compared is extremely large. It is also shown that as long as the mean number of nucleotide substitutions is smaller than one per nucleotide site the three formulae give more or less the same estimate regardless of the substitution scheme used.
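The simplest of the three estimators compared here, the Jukes and Cantor one-parameter formula, can be sketched in a few lines: d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The function name and the example sequences are hypothetical, and the four- and six-parameter formulae are not reproduced. The sketch also shows one way a formula of this kind becomes inapplicable: the argument of the logarithm is no longer positive once p reaches 3/4.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """Estimated substitutions per site under the one-parameter model.

    seq_a, seq_b: aligned nucleotide strings of equal length, without gaps.
    Returns None when the formula is inapplicable (p >= 3/4).
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    p = differences / len(seq_a)      # observed proportion of different sites
    if p >= 0.75:
        return None                   # logarithm argument would be <= 0
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned sequences differing at 1 of 10 sites (p = 0.1).
d_example = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
```

The estimate always exceeds the raw proportion p when p > 0, because the formula corrects for multiple substitutions at the same site.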
171
Chakravarti A, Nei M. Utility and efficiency of linked marker genes for genetic counseling. II. Identification of linkage phase by offspring phenotypes. Am J Hum Genet 1982; 34:531-51. [PMID: 6954847 PMCID: PMC1685357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2023]
Abstract
For a linked marker locus to be useful for genetic counseling, the counselee must be heterozygous for both disease and marker loci and his or her linkage phase must be known. It is shown that when the phenotypes of the counselee's previous children for the disease and marker loci are known, the linkage phase can often be inferred with a high probability, and thus it is possible to conduct genetic counseling. To evaluate the utility of linked marker genes for genetic counseling, the accuracy of prediction of the risk for a prospective child with a given marker gene to develop the genetic disease and the proportion of families in which a particular marker locus can be used for genetic counseling are studied for X-linked recessive, autosomal dominant, and autosomal recessive diseases. In the case of X-linked genetic diseases, information from children is very useful for determining the linkage phase of the counselee and predicting the genetic disease. In the case of autosomal dominant diseases, not all children are informative, but if the number of children is large, the phenotypes of children are often more informative than the information from grandparents. In the case of autosomal recessive diseases, information from grandparents is usually useless, since they show a normal phenotype for the disease locus. If we use information on the phenotypes of children, however, the linkage phase of the counselee and the risk of a prospective child can be inferred with a high probability. The proportion of informative families depends on the dominance relationship and frequencies of marker alleles, and the number of children. In general, codominant markers are more useful than are dominant markers, and a locus with high heterozygosity is more useful than is a locus with low heterozygosity.
172
Tajima F, Nei M. Biases of the estimates of DNA divergence obtained by the restriction enzyme technique. J Mol Evol 1982; 18:115-20. [PMID: 6284946 DOI: 10.1007/bf01810830] [Citation(s) in RCA: 70] [Impact Index Per Article: 1.7] [Indexed: 01/19/2023]
Abstract
A mathematical formula for the relationship between the average number of nucleotide substitutions per site and the proportion of shared restriction sites between two homologous nucleons is developed by taking into account the unequal rates of substitution among different pairs of nucleotides. Using this formula, the possible amount of bias of the estimate of the number of nucleotide substitutions obtained by the Upholt-Nei-Li formula for restriction site data is investigated. The results obtained indicate that the bias depends upon the nucleotides in the recognition sequence of the restriction enzyme used, the unequal rates of substitution among different nucleotides, and the unequal nucleotide frequencies, but the primary factor is the unequal rates of nucleotide substitution. The amount of bias is generally larger for four-base enzymes than for six-base enzymes. However, when many restriction enzymes are used for the study of DNA divergence, the bias is unlikely to be very large unless the rate of substitution greatly varies from nucleotide to nucleotide.
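For orientation, the kind of estimate whose bias is examined here can be sketched as follows. The code assumes the commonly quoted simple form of the shared-site estimator: S = 2·n_xy / (n_x + n_y) for the proportion of shared restriction sites, and d = -ln(S) / r for the number of substitutions per site, with r the length of the enzyme's recognition sequence. Whether this matches the exact expression analyzed in the paper is an assumption, and the function name and site positions are hypothetical.

```python
import math


def shared_site_distance(sites_x, sites_y, recognition_length):
    """Substitutions per site estimated from shared restriction sites.

    sites_x, sites_y   : iterables of restriction-site positions in each sequence
    recognition_length : number of bases in the recognition sequence (e.g. 4 or 6)
    Uses d = -ln(S) / r, which assumes equal substitution rates among
    nucleotides; the abstract above discusses the bias when that fails.
    """
    set_x, set_y = set(sites_x), set(sites_y)
    if not set_x or not set_y:
        raise ValueError("each sequence needs at least one restriction site")
    shared = len(set_x & set_y)
    s = 2.0 * shared / (len(set_x) + len(set_y))   # proportion of shared sites
    if s == 0.0:
        return math.inf                            # no shared sites at all
    return -math.log(s) / recognition_length


# Hypothetical map positions for a six-base enzyme: 3 of 4 sites are shared.
d_sites = shared_site_distance({100, 450, 900, 1300}, {100, 450, 1300, 2000}, 6)
```

Dividing by the recognition length reflects that a site is lost if any one of its r bases changes, so a longer recognition sequence makes each site a more sensitive detector of substitutions.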
173
Nei M, Li WH, Tajima F, Narain P. Polymorphism and evolution of the Rh blood groups. JINRUI IDENGAKU ZASSHI. THE JAPANESE JOURNAL OF HUMAN GENETICS 1981; 26:263-78. [PMID: 6808203 DOI: 10.1007/bf01876357] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 01/22/2023]
174
Abstract
On the neutral mutation hypothesis, the rate of nucleotide substitution is expected to be higher for functionally less important genes or parts of genes than for functionally more important genes, as the latter would be subject to stronger purifying (negative) selection. On the other hand, selectionists believe that most nucleotide substitutions are caused by positive darwinian selection, in which case the rate of nucleotide substitution in functionally unimportant genes or parts of genes is expected to be relatively lower because the mutations in these regions of DNA would not produce any significant selective advantages. Kimura and Jukes have argued that the higher substitution rate observed at the third positions of codons than at the first two positions supports the neutral mutation hypothesis, as most third-position substitutions are synonymous and do not change the amino acids encoded, although others have discussed the possibility that third-position substitutions are subject to positive darwinian selection. Recently, Kimura noted that the mouse globin pseudogene, psi alpha 3, evolved faster than the normal mouse alpha 1 gene, although he did not compute the substitution rate. Here, we present a method of computing the rate of nucleotide substitution for pseudogenes, and report that the three recently discovered pseudogenes show an extremely high rate of nucleotide substitution. As these pseudogenes apparently have no function, this finding strongly supports the neutral mutation hypothesis.
175
Abstract
The nucleotide sequence of a segment of U1 and U3b small RNAs (sRNAs) is shown to have a high complementarity with the nucleotide sequence of a part of the leader region of almost all eukaryotic genes studied so far. The complementary region of U3b is located in the unpaired segment of the secondary structure of U3b constructed by Reddy et al. (1979). A similar complementarity is also observed between these RNAs and the leader regions of eukaryotic viruses, but the complementary region is not always identical with that for eukaryotic genes. Complementarity is also observed between the 3' end of 18S rRNA and a segment of U1 or U3b which is almost contiguous to the region complementary with mRNA. These observations suggest that U1 and U3b may be involved in mRNA processing and transport in the nucleus or in translation in the cytoplasm. In addition to U1 and U3b, another sRNA, i.e., 4.5S RNAI, is shown to have segments which are homologous to the Hogness box of the flanking region of gene and the Proudfoot-Brownlee (PB) box of mRNA near the poly(A) attachment site. The two segments which are complementary with these boxes are located almost contiguously on a co-joined loop of the secondary structure of 4.5S RNAI constructed by Ro-Choi et al. (1972). Since the Hogness box and PB box are both considered as a recognition site by the RNA polymerase, it is possible that 4.5S RNAI is involved in mediating gene transcription.