1
Duchêne DA, Duchêne S, Stiller J, Heller R, Ho SYW. ClockstaRX: Testing Molecular Clock Hypotheses With Genomic Data. Genome Biol Evol 2024; 16:evae064. [PMID: 38526019 PMCID: PMC10999959 DOI: 10.1093/gbe/evae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 01/11/2024] [Accepted: 03/21/2024] [Indexed: 03/26/2024] Open
Abstract
Phylogenomic data provide valuable opportunities for studying evolutionary rates and timescales. These analyses require theoretical and statistical tools based on molecular clocks. We present ClockstaRX, a flexible platform for exploring and testing evolutionary rate signals in phylogenomic data. Here, information about evolutionary rates in branches across gene trees is placed in Euclidean space, allowing data transformation, visualization, and hypothesis testing. ClockstaRX implements formal tests for identifying groups of loci and branches that make a large contribution to patterns of rate variation. This information can then be used to test for drivers of genomic evolutionary rates or to inform models for molecular dating. Drawing on the results of a simulation study, we recommend forms of data exploration and filtering that might be useful prior to molecular-clock analyses.
Affiliation(s)
- David A Duchêne
- Center for Evolutionary Hologenomics, University of Copenhagen, Copenhagen 1352, Denmark
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen 1352, Denmark
- Sebastián Duchêne
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3010, Australia
- Josefin Stiller
- Villum Centre for Biodiversity Genomics, University of Copenhagen, 2100 Copenhagen, Denmark
- Rasmus Heller
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- Simon Y W Ho
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
2
Castellano D, James J, Eyre-Walker A. Nearly Neutral Evolution across the Drosophila melanogaster Genome. Mol Biol Evol 2019; 35:2685-2694. [PMID: 30418639 DOI: 10.1093/molbev/msy164] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022] Open
Abstract
Under the nearly neutral theory of molecular evolution, the proportion of effectively neutral mutations is expected to depend upon the effective population size (Ne). Here, we investigate whether this is the case across the genome of Drosophila melanogaster using polymorphism data from North American and African lines. We show that the ratio of the number of nonsynonymous and synonymous polymorphisms is negatively correlated to the number of synonymous polymorphisms, even when the nonindependence is accounted for. The relationship is such that the proportion of effectively neutral nonsynonymous mutations increases by ∼45% as Ne is halved. However, we also show that this relationship is steeper than expected from an independent estimate of the distribution of fitness effects from the site frequency spectrum. We investigate a number of potential explanations for this and show, using simulation, that this is consistent with a model of genetic hitchhiking: Genetic hitchhiking depresses diversity at neutral and weakly selected sites, but has little effect on the diversity of strongly selected sites.
Affiliation(s)
- David Castellano
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Jennifer James
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
- Adam Eyre-Walker
- School of Life Sciences, University of Sussex, Brighton, United Kingdom
3
Bininda-Emonds ORP. Fast Genes and Slow Clades: Comparative Rates of Molecular Evolution in Mammals. Evol Bioinform Online 2017. [DOI: 10.1177/117693430700300008] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/17/2022] Open
Abstract
Although interest in the rate of molecular evolution and the molecular clock remains high, our knowledge for most groups in these areas is derived largely from a patchwork of studies limited in both their taxon coverage and the number of genes examined. Using a comprehensive molecular data set of 44 genes (18 nDNA, 11 tRNA and 15 additional mtDNA genes) together with a virtually complete and dated phylogeny of extant mammals, I 1) describe differences in the rate of molecular evolution (i.e. substitution rate) within this group in an explicit phylogenetic and quantitative framework and 2) present the first attempt to localize the phylogenetic positions of any rate shifts. Significant rate differences were few and confirmed several long-held trends, including a progressive rate slowdown within hominids and a reduced substitution rate within Cetacea. However, many new patterns were also uncovered, including the mammalian orders being characterized generally by basal rate slowdowns. A link between substitution rate and the size of a clade (which derives from its net speciation rate) is also suggested, with the species-poor major clades (“orders”) showing more decreased rates that often extend throughout the entire clade. Significant rate increases were rare, with the rates within (murid) rodents being fast, but not significantly so with respect to other mammals as a whole. Despite clear lineage-specific differences, rates generally change gradually along these lineages, supporting the potential existence of a local molecular clock in mammals. Together, these results will lay the foundation for a broad-scale analysis to establish the correlates and causes of the rate of molecular evolution in mammals.
Affiliation(s)
- Olaf R. P. Bininda-Emonds
- Lehrstuhl für Tierzucht, Technical University of Munich, Hochfeldweg 1, 85354 Freising–Weihenstephan, Germany
4
Braverman JM, Hamilton MB, Johnson BA. Patterns of Substitution Rate Variation at Many Nuclear Loci in Two Species Trios in the Brassicaceae Partitioned with ANOVA. J Mol Evol 2016; 83:97-109. [PMID: 27592229 DOI: 10.1007/s00239-016-9752-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/15/2015] [Accepted: 07/14/2016] [Indexed: 01/09/2023]
Abstract
There are marked variations among loci and among lineages in rates of nucleotide substitution. The generation time hypothesis (GTH) is a neutral explanation for substitution rate heterogeneity that has genomewide application, predicting that species with shorter generation times accumulate DNA sequence substitutions faster than species with longer generation times do since faster genome replication provides more opportunities for mutations to occur and reach fixation by genetic drift. Relatively few studies have rigorously evaluated the GTH in plants, and there are numerous alternative hypotheses for plant substitution rate variation. One major challenge has been finding pairs of closely related plant species with contrasting generation times and appropriate outgroup taxa that all also have DNA sequence data for numerous loci. To test for causes of rate variation, we obtained sequence data for 256 genes for Arabidopsis thaliana, normally reproducing every year, and the biennial Arabidopsis lyrata with three closely related outgroup taxa (Brassica rapa, Capsella grandiflora, and Neslia paniculata) as well as the biennial Brassica oleracea and the annual B. rapa lineage with the outgroup N. paniculata. A sign test indicated that more loci than expected by chance have faster rates of substitution on the branch leading to the annual than to the perennial for one three-species trio but not another. Tajima's 1D and 2D tests, and a likelihood ratio test that incorporated saturation correction, rejected rate homogeneity for up to 26 genes (up to 14 genes when correcting for multiple tests), consistently showing faster rates for the annual lineage in the Arabidopsis species trio. ANOVA showed significant rate heterogeneity between the Arabidopsis and Brassica species trios (about 6 % of rate variation) and among loci (about 26-32 % of rate variation). 
The lineage-by-locus interaction which would be caused by locus- and lineage-specific natural selection explained about 13 % of substitution rate variation in one ANOVA model using substitution rates from genes partitioned into odd and even codons but was not a significant effect without partitioned genes. Annual/perennial lineage and species trio by annual/perennial lineage each explained about 1 % of substitution rate variation.
Affiliation(s)
- John M Braverman
- Department of Biology, Saint Joseph's University, Philadelphia, PA, USA.
- Brent A Johnson
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, USA
5
James JE, Piganeau G, Eyre‐Walker A. The rate of adaptive evolution in animal mitochondria. Mol Ecol 2016; 25:67-78. [PMID: 26578312 PMCID: PMC4737298 DOI: 10.1111/mec.13475] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Received: 05/01/2015] [Accepted: 11/10/2015] [Indexed: 11/28/2022]
Abstract
We have investigated whether there is adaptive evolution in mitochondrial DNA, using an extensive data set containing over 500 animal species from a wide range of taxonomic groups. We apply a variety of McDonald-Kreitman style methods to the data. We find that the evolution of mitochondrial DNA is dominated by slightly deleterious mutations, a finding which is supported by a number of previous studies. However, when we control for the presence of deleterious mutations using a new method, we find that mitochondria undergo a significant amount of adaptive evolution, with an estimated 26% (95% confidence intervals: 5.7-45%) of nonsynonymous substitutions fixed by adaptive evolution. We further find some weak evidence that the rate of adaptive evolution is correlated to synonymous diversity. We interpret this as evidence that at least some adaptive evolution is limited by the supply of mutations.
Affiliation(s)
- Gwenael Piganeau
- UPMC Univ Paris 06, UMR 7232, Observatoire Oceanologique, Avenue de Fontaulé, BP 44, 66651 Banyuls‐sur‐Mer, France
- CNRS, UMR 7232, Observatoire Oceanologique, Avenue de Fontaulé, BP 44, 66651 Banyuls‐sur‐Mer, France
6
Murray GGR, Weinert LA, Rhule EL, Welch JJ. The Phylogeny of Rickettsia Using Different Evolutionary Signatures: How Tree-Like is Bacterial Evolution? Syst Biol 2015; 65:265-79. [PMID: 26559010 PMCID: PMC4748751 DOI: 10.1093/sysbio/syv084] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 05/11/2015] [Accepted: 11/04/2015] [Indexed: 11/14/2022] Open
Abstract
Rickettsia is a genus of intracellular bacteria whose hosts and transmission strategies are both impressively diverse, and this is reflected in a highly dynamic genome. Some previous studies have described the evolutionary history of Rickettsia as non-tree-like, due to incongruity between phylogenetic reconstructions using different portions of the genome. Here, we reconstruct the Rickettsia phylogeny using whole-genome data, including two new genomes from previously unsampled host groups. We find that a single topology, which is supported by multiple sources of phylogenetic signal, well describes the evolutionary history of the core genome. We do observe extensive incongruence between individual gene trees, but analyses of simulations over a single topology and interspersed partitions of sites show that this is more plausibly attributed to systematic error than to horizontal gene transfer. Some conflicting placements also result from phylogenetic analyses of accessory genome content (i.e., gene presence/absence), but we argue that these are also due to systematic error, stemming from convergent genome reduction, which cannot be accommodated by existing phylogenetic methods. Our results show that, even within a single genus, tests for gene exchange based on phylogenetic incongruence may be susceptible to false positives.
Affiliation(s)
- Gemma G R Murray
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
- Lucy A Weinert
- Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Emma L Rhule
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
- John J Welch
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
7
Duchêne S, Ho SYW. Mammalian genome evolution is governed by multiple pacemakers. Bioinformatics 2015; 31:2061-5. [PMID: 25725495 DOI: 10.1093/bioinformatics/btv121] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 10/14/2014] [Accepted: 02/20/2015] [Indexed: 11/14/2022]
Abstract
Genomic evolution is shaped by a dynamic combination of mutation, selection and genetic drift. These processes lead to evolutionary rate variation across loci and among lineages. In turn, interactions between these two forms of rate variation can produce residual effects, whereby the pattern of among-lineage rate heterogeneity varies across loci. The nature of rate variation is encapsulated in the pacemaker models of genome evolution, which differ in the degree of importance assigned to residual effects: none (Universal Pacemaker), some (Multiple Pacemaker) or total (Degenerate Multiple Pacemaker). Here we use a phylogenetic method to partition the rate variation across loci, allowing comparison of these pacemaker models. Our analysis of 431 genes from 29 mammalian taxa reveals that rate variation across these genes can be explained by 13 pacemakers, consistent with the Multiple Pacemaker model. We find no evidence that these pacemakers correspond to gene function. Our results have important consequences for understanding the factors driving genomic evolution and for molecular-clock analyses. AVAILABILITY AND IMPLEMENTATION ClockstaR-G is freely available for download from GitHub (https://github.com/sebastianduchene/clockstarg).
Affiliation(s)
- Sebastián Duchêne
- School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia
- Simon Y W Ho
- School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia
8
Ho SYW, Duchêne S. Molecular-clock methods for estimating evolutionary rates and timescales. Mol Ecol 2014; 23:5947-65. [DOI: 10.1111/mec.12953] [Citation(s) in RCA: 225] [Impact Index Per Article: 22.5] [Received: 06/23/2014] [Revised: 09/29/2014] [Accepted: 09/30/2014] [Indexed: 11/29/2022]
Affiliation(s)
- Simon Y. W. Ho
- School of Biological Sciences; University of Sydney; Sydney NSW 2006 Australia
- Sebastián Duchêne
- School of Biological Sciences; University of Sydney; Sydney NSW 2006 Australia
9
Ho SYW. The changing face of the molecular evolutionary clock. Trends Ecol Evol 2014; 29:496-503. [PMID: 25086668 DOI: 10.1016/j.tree.2014.07.004] [Citation(s) in RCA: 100] [Impact Index Per Article: 10.0] [Received: 04/04/2014] [Revised: 07/03/2014] [Accepted: 07/08/2014] [Indexed: 11/30/2022]
Abstract
The molecular clock has played an important role in biological research, both as a description of the evolutionary process and as a tool for inferring evolutionary timescales. Genomic data have provided valuable insights into the molecular clock, allowing the patterns and causes of evolutionary rate variation to be characterized in increasing detail. I explain how genome sequences offer exciting opportunities for estimating the timescale of the Tree of Life. I describe the different approaches that have been used to deal with the computational and statistical challenges encountered in molecular clock analyses of genomic data. Finally, I offer a perspective on the future of molecular clocks, highlighting some of the key limitations and the most promising research directions.
Affiliation(s)
- Simon Y W Ho
- School of Biological Sciences, University of Sydney, Sydney, NSW, Australia.
10
Statistical and theoretical considerations for the platform re-location water maze. J Neurosci Methods 2011; 198:44-52. [DOI: 10.1016/j.jneumeth.2011.03.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 12/01/2010] [Revised: 03/04/2011] [Accepted: 03/04/2011] [Indexed: 11/21/2022]
11
Stoletzki N, Eyre-Walker A. The positive correlation between dN/dS and dS in mammals is due to runs of adjacent substitutions. Mol Biol Evol 2010; 28:1371-80. [PMID: 21115654 DOI: 10.1093/molbev/msq320] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Indexed: 01/09/2023] Open
Abstract
A positive correlation between ω, the ratio of the nonsynonymous and synonymous substitution rates, and dS, the synonymous substitution rate has recently been reported. This correlation is unexpected under simple evolutionary models. Here, we investigate two explanations for this correlation: first, whether it is a consequence of a statistical bias in the estimation of ω and second, whether it is due to substitutions at adjacent sites. Using simulations, we show that estimates of ω are biased when levels of divergence are low. This is true using the methods of Yang and Nielsen, Nei and Gojobori, and Muse and Gaut. Although the bias could generate a positive correlation between ω and dS, we show that it is unlikely to be the main determinant. Instead we show that the correlation is reduced when genes that are high quality in sequence, annotation, and alignment are used. The remaining--likely genuine--positive correlation appears to be due to adjacent tandem substitutions; single substitutions, though far more numerous, do not contribute to the correlation. Genuine adjacent substitutions may be due to mutation or selection.
Affiliation(s)
- Nina Stoletzki
- Centre for Study of Evolution, School of Life Sciences, University of Sussex, Brighton, United Kingdom.
12
Egan AN, Doyle J. A comparison of global, gene-specific, and relaxed clock methods in a comparative genomics framework: dating the polyploid history of soybean (Glycine max). Syst Biol 2010; 59:534-47. [PMID: 20705909 DOI: 10.1093/sysbio/syq041] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022] Open
Abstract
It is widely recognized that many genes and lineages do not adhere to a molecular clock, yet molecular clocks are commonly used to date divergences in comparative genomic studies. We test the application of a molecular clock across genes and lineages in a phylogenetic framework utilizing 12 genes linked in a 1-Mb region on chromosome 13 of soybean (Glycine max); homoeologous copies of these genes formed by polyploidy in Glycine; and orthologous copies in G. tomentella, Phaseolus vulgaris, and Medicago truncatula. We compare divergence dates estimated by two methods each in three frameworks: a global molecular clock with a single rate across genes and lineages using full and approximate likelihood methods based on synonymous substitutions, a gene-specific clock assuming rate constancy over lineages but allowing a different rate for each gene, and a relaxed molecular clock where rates may vary across genes and lineages estimated under penalized likelihood and Bayesian inference. We use the cumulative variance across genes as a means of quantifying precision. Our results suggest that divergence dating methods produce results that are correlated, but that older nodes are more variable and more difficult to estimate with precision and accuracy. We also find that models incorporating less rate heterogeneity estimate older dates of divergence than more complex models, as node age increases. A mixed model nested analysis of variance testing the effects of framework, method, and gene found that framework had a significant effect on the divergence date estimates but that most variation among dates is due to variation among genes, suggesting a need to further characterize and understand the evolutionary phenomena underlying rate variation within genomes, among genes, and across lineages.
Affiliation(s)
- Ashley N Egan
- Department of Plant Biology, L.H. Bailey Hortorium, Cornell University, 412 Mann Library Building, Ithaca, NY 14853, USA.
13
Thomas JA, Welch JJ, Lanfear R, Bromham L. A Generation Time Effect on the Rate of Molecular Evolution in Invertebrates. Mol Biol Evol 2010; 27:1173-80. [DOI: 10.1093/molbev/msq009] [Citation(s) in RCA: 172] [Impact Index Per Article: 12.3] [Indexed: 11/14/2022] Open
14
Abstract
Although protein evolution can be approximated as a "molecular evolutionary clock," it is well known that sequence change departs from a clock-like Poisson expectation. Through studying the deviations from a molecular clock, insight can be gained into the forces shaping evolution at the level of proteins. Generally, substitution patterns that show greater variance than the Poisson expectation are said to be "overdispersed." Overdispersion of sequence change may result from temporal variation in the rate at which amino acid substitutions occur on a phylogeny. By comparing the genomes of four species of yeast, five species of Drosophila, and five species of mammals, we show that the extent of overdispersion shows a strong negative correlation with the effective population size of these organisms. Yeast proteins show very little overdispersion, while mammalian proteins show substantial overdispersion. Additionally, X-linked genes, which have reduced effective population size, have gene products that show increased overdispersion in both Drosophila and mammals. Our research suggests that mutational robustness is more pervasive in organisms with large population sizes and that robustness acts to stabilize the molecular evolutionary clock of sequence change.
15
Bedford T, Hartl DL. Overdispersion of the molecular clock: temporal variation of gene-specific substitution rates in Drosophila. Mol Biol Evol 2008; 25:1631-8. [PMID: 18480070 DOI: 10.1093/molbev/msn112] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 01/09/2023] Open
Abstract
Simple models of molecular evolution assume that sequences evolve by a Poisson process in which nucleotide or amino acid substitutions occur as rare independent events. In these models, the expected ratio of the variance to the mean of substitution counts equals 1, and substitution processes with a ratio greater than 1 are called overdispersed. Comparing the genomes of 10 closely related species of Drosophila, we extend earlier evidence for overdispersion in amino acid replacements as well as in four-fold synonymous substitutions. The observed deviation from the Poisson expectation can be described as a linear function of the rate at which substitutions occur on a phylogeny, which implies that deviations from the Poisson expectation arise from gene-specific temporal variation in substitution rates. Amino acid sequences show greater temporal variation in substitution rates than do four-fold synonymous sequences. Our findings provide a general phenomenological framework for understanding overdispersion in the molecular clock. Also, the presence of substantial variation in gene-specific substitution rates has broad implications for work in phylogeny reconstruction and evolutionary rate estimation.
Affiliation(s)
- Trevor Bedford
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
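The variance-to-mean ratio described in the Bedford and Hartl abstract above can be illustrated with a minimal sketch. The function name `index_of_dispersion` and the example counts are hypothetical and are not taken from the paper; the sketch assumes per-lineage substitution counts for a single gene and uses the sample variance.

```python
from statistics import mean, variance


def index_of_dispersion(counts):
    """Return the ratio of sample variance to mean for substitution counts.

    Under a simple Poisson model the expected ratio is 1; values above 1
    are described as overdispersed. `counts` is a hypothetical list of
    substitution counts, one per lineage, for a single gene.
    """
    if len(counts) < 2:
        raise ValueError("need at least two counts to estimate a variance")
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    return variance(counts) / m


# Made-up counts for illustration only (not data from any study).
example_counts = [2, 4, 4, 6, 9]
ratio = index_of_dispersion(example_counts)
print(f"index of dispersion: {ratio:.2f}")
```

A ratio computed this way from a handful of lineages is noisy, so it is best read as a descriptive summary rather than a formal test of the Poisson expectation.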
16
Welch JJ, Bininda-Emonds ORP, Bromham L. Correlates of substitution rate variation in mammalian protein-coding sequences. BMC Evol Biol 2008; 8:53. [PMID: 18284663 PMCID: PMC2289806 DOI: 10.1186/1471-2148-8-53] [Citation(s) in RCA: 132] [Impact Index Per Article: 8.3] [Received: 08/09/2007] [Accepted: 02/19/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Rates of molecular evolution in different lineages can vary widely, and some of this variation might be predictable from aspects of species' biology. Investigating such predictable rate variation can help us to understand the causes of molecular evolution, and could also help to improve molecular dating methods. Here we present a comprehensive study of the life history correlates of substitution rate variation across the mammals, comparing results for mitochondrial and nuclear loci, and for synonymous and non-synonymous sites. We use phylogenetic comparative methods, refined to take into account the special nature of substitution rate data. Particular attention is paid to the widespread correlations between the components of mammalian life history, which can complicate the interpretation of results. RESULTS We find that mitochondrial synonymous substitution rates, estimated from the 9 longest mitochondrial genes, show strong negative correlations with body mass and with maximum recorded lifespan. But lifespan is the sole variable to remain after multiple regression and model simplification. Nuclear synonymous substitution rates, estimated from 6 genes, show strong negative correlations with body mass and generation time, and a strong positive correlation with fecundity. In contrast to the mitochondrial results, the same trends are evident in rates of nonsynonymous substitution. CONCLUSION A substantial proportion of variation in mammalian substitution rates can be explained by aspects of their life history, implying that molecular and life history evolution are closely interlinked in this group. The strength and consistency of the nuclear body mass effect suggests that molecular dating studies may have been systematically misled, but also that methods could be improved by incorporating the finding as a priori information. 
Mitochondrial synonymous rates also show the body mass effect, but for apparently quite different reasons, and the strength of the relationship with maximum lifespan provides support for the hypothesis that mtDNA damage is causally linked to aging.
Affiliation(s)
- John J Welch
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Rd., Edinburgh EH9 3JT, UK.
17
McPartland JM, Norris RW, Kilpatrick CW. Tempo and mode in the endocannabinoid system. J Mol Evol 2007; 65:267-76. [PMID: 17676365 DOI: 10.1007/s00239-007-9004-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 11/02/2006] [Accepted: 04/26/2007] [Indexed: 10/23/2022]
Abstract
The best-known endocannabinoid ligands, anandamide and 2-AG, signal at least seven receptors and involve ten metabolic enzymes. Genes for the receptors and enzymes were examined for heterogeneities in tempo (relative rate of evolution, RRE) and mode (selection pressure, Ka/Ks) in six organisms with sequenced genomes. BLAST identified orthologs as reciprocal best hits, and nucleotide alignments were performed with ClustalX and MacClade. Two bioinformatics platforms, LiKaKs (a distance-based LWL85 model) and SNAP (a parsimony-based NG86 model) made pairwise comparisons of orthologs in murids (rat and mouse) and primates (human and macaque). Mean RRE of the 18 endocannabinoid genes was significantly greater in murids than primates, whereas mean Ka/Ks did not differ significantly. Next we used FUGE (tree-based maximum-likelihood model) to compute human lineage-specific Ka/Ks calculations for 18 genes, which ranged from 1.11 to 0.00, in rank order from highest to lowest: PTPN22, NAAA, TRPV1, TRPA1, NAPE-PLD, MAGL, PPARgamma, FAAH1, COX2, FAAH2, ABDH4, CB2, GPR55, DAGLbeta, PPARalpha, TRPV4, CB1, DAGLalpha; differences were significant (p < 0.0001). Rat and mouse presented different rank orders (e.g., GPR55 generated the greatest Ka/Ks ratio). The 18 genes were then tested for recent positive selection (within 10,000 yr) using an extended haplotype homozygosity analysis of SNP data from the HapMap database. Significant evidence (p < 0.05) of a positive "selective sweep" was exhibited by PTPN22, TRPV1, NAPE-PLD, and DAGLalpha. In conclusion, the endocannabinoid system is collectively under strong purifying selection, although some genes show evidence of adaptive evolution.
18
Fontanillas E, Welch JJ, Thomas JA, Bromham L. The influence of body size and net diversification rate on molecular evolution during the radiation of animal phyla. BMC Evol Biol 2007; 7:95. [PMID: 17592650 PMCID: PMC1929056 DOI: 10.1186/1471-2148-7-95] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Received: 01/10/2007] [Accepted: 06/26/2007] [Indexed: 11/23/2022] Open
Abstract
Background Molecular clock dates, which place the origin of animal phyla deep in the Precambrian, have been used to reject the hypothesis of a rapid evolutionary radiation of animal phyla supported by the fossil record. One possible explanation of the discrepancy is the potential for fast substitution rates early in the metazoan radiation. However, concerted rate variation, occurring simultaneously in multiple lineages, cannot be detected by "clock tests", and so another way to explore such variation is to look for correlated changes between rates and other biological factors. Here we investigate two possible causes of fast early rates: change in average body size or diversification rate of deep metazoan lineages. Results For nine genes for phylogenetically independent comparisons between 50 metazoan phyla, orders, and classes, we find a significant correlation between average body size and rate of molecular evolution of mitochondrial genes. The data also indicate that diversification rate may have a positive effect on rates of mitochondrial molecular evolution. Conclusion If average body sizes were significantly smaller in the early history of the Metazoa, and if rates of diversification were much higher, then it is possible that mitochondrial genes have undergone a slow-down in evolutionary rate, which could affect date estimates made from these genes.
Affiliation(s)
- Eric Fontanillas
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, UK
- Centre for Macroevolution and Macroecology, School of Botany and Zoology, Australian National University, Canberra, A.C.T. 0200, Australia
- John J Welch
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, UK
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, West Mains Rd., Edinburgh, EH9 3JT, UK
- Jessica A Thomas
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, UK
- Centre for Macroevolution and Macroecology, School of Botany and Zoology, Australian National University, Canberra, A.C.T. 0200, Australia
- Lindell Bromham
- Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Falmer, Brighton, BN1 9QG, UK
- Centre for Macroevolution and Macroecology, School of Botany and Zoology, Australian National University, Canberra, A.C.T. 0200, Australia
|
19
|
Li T, Chamberlin SG, Caraco MD, Liberles DA, Gaucher EA, Benner SA. Analysis of transitions at two-fold redundant sites in mammalian genomes. Transition redundant approach-to-equilibrium (TREx) distance metrics. BMC Evol Biol 2006; 6:25. [PMID: 16545144 PMCID: PMC1435776 DOI: 10.1186/1471-2148-6-25] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/03/2005] [Accepted: 03/20/2006] [Indexed: 11/10/2022] Open
Abstract
Background The exchange of nucleotides at synonymous sites in a gene encoding a protein is believed to have little impact on the fitness of a host organism. This should be especially true for synonymous transitions, where a pyrimidine nucleotide is replaced by another pyrimidine, or a purine is replaced by another purine. This suggests that transition redundant exchange (TREx) processes at the third position of conserved two-fold codon systems might offer the best approximation for a neutral molecular clock, serving to examine, within coding regions, theories that require neutrality, determine whether transition rate constants differ within genes in a single lineage, and correlate dates of events recorded in genomes with dates in the geological and paleontological records. To date, TREx analysis of the yeast genome has recognized correlated duplications that established new metabolic strategies in fungi, and supported analyses of functional change in aromatases in pigs. TREx dating has limitations, however. Multiple transitions at synonymous sites may cause equilibration and loss of information. Further, to be useful to correlate events in the genomic record, different genes within a genome must suffer transitions at similar rates. Results A formalism to analyze divergence at two-fold redundant codon systems is presented. This formalism exploits two-state approach-to-equilibrium kinetics from chemistry. This formalism captures, in a single equation, the possibility of multiple substitutions at individual sites, avoiding any need to "correct" for these. The formalism also connects specific rate constants for transitions to specific approximations in an underlying evolutionary model, including assumptions that transition rate constants are invariant at different sites, in different genes, in different lineages, and at different times. Therefore, the formalism supports analyses that evaluate these approximations.
Transitions at synonymous sites within two-fold redundant coding systems were examined in the mouse, rat, and human genomes. The key metric (f2), the fraction of those sites that holds the same nucleotide, was measured for putative ortholog pairs. A transition redundant exchange (TREx) distance was calculated from f2 for these pairs. Pyrimidine-pyrimidine transitions at these sites occur approximately 14% faster than purine-purine transitions in various lineages. Transition rate constants were similar in different genes within the same lineages; within a set of orthologs, the f2 distribution is only modestly overdispersed. No correlation between disparity and overdispersion is observed. In rodents, however, evidence was found for greater conservation of TREx sites in genes on the X chromosome, accounting for a small part of the overdispersion. Conclusion The TREx metric is useful to analyze the history of transition rate constants within these mammals over the past 100 million years. The TREx metric estimates the extent to which silent nucleotide substitutions accumulate in different genes, on different chromosomes, with different compositions, in different lineages, and at different times.
Affiliation(s)
- Tang Li
- Foundation for Applied Molecular Evolution, Gainesville FL 32604, USA
- M Daniel Caraco
- Foundation for Applied Molecular Evolution, Gainesville FL 32604, USA
- David A Liberles
- Department of Molecular Biology, University of Wyoming, Laramie, WY 82071, USA
- Eric A Gaucher
- Foundation for Applied Molecular Evolution, Gainesville FL 32604, USA
- Steven A Benner
- Foundation for Applied Molecular Evolution, Gainesville FL 32604, USA
|
20
|
Berlin S, Brandström M, Backström N, Axelsson E, Smith NGC, Ellegren H. Substitution Rate Heterogeneity and the Male Mutation Bias. J Mol Evol 2006; 62:226-33. [PMID: 16474985 DOI: 10.1007/s00239-005-0103-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 04/28/2005] [Accepted: 10/18/2005] [Indexed: 10/25/2022]
Abstract
Germline mutation rates have been found to be higher in males than in females in many organisms, a likely consequence of cell division being more frequent in spermatogenesis than in oogenesis. If the majority of mutations are due to DNA replication error, the male-to-female mutation rate ratio (alpha(m)) is expected to be similar to the ratio of the number of germ line cell divisions in males and females (c), an assumption that can be tested with proper estimates of alpha(m) and c. Alpha(m) is usually estimated by comparing substitution rates in putatively neutral sequences on the sex chromosomes. However, substantial regional variation in substitution rates across chromosomes may bias estimates of alpha(m) based on the substitution rates of short sequences. To investigate regional substitution rate variation, we estimated sequence divergence in 16 gametologous introns located on the Z and W chromosomes of five bird species of the order Galliformes. Intron ends and potentially conserved blocks were excluded to reduce the effect of using sequences subject to negative selection. We found significant substitution rate variation within Z chromosome (G15 = 37.6, p = 0.0010) as well as within W chromosome introns (G15 = 44.0, p = 0.0001). This heterogeneity also affected the estimates of alpha(m), which varied significantly, from 1.53 to 3.51, among the introns (ANOVA: F(13,14) = 2.68, p = 0.04). Our results suggest the importance of using extensive data sets from several genomic regions to avoid the effects of regional mutation rate variation and to ensure accurate estimates of alpha(m).
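As a rough illustration of how alpha(m) might be derived from Z and W substitution rates, the sketch below uses the standard evolutionary-time argument for ZW systems: a Z chromosome is assumed to spend two thirds of its time in males and one third in females, while the W is female-limited, so Z/W = (2·alpha(m) + 1)/3. The abstract does not state this formula, so it is an assumption here, and the function name `alpha_m_from_zw` is hypothetical.

```python
def alpha_m_from_zw(z_rate: float, w_rate: float) -> float:
    """Estimate the male-to-female mutation rate ratio from Z and W rates.

    Assumption (not stated in the abstract): neutral substitution rates
    scale as Z = (2/3)*mu_male + (1/3)*mu_female and W = mu_female,
    which gives Z/W = (2*alpha + 1)/3 and hence alpha = (3*Z/W - 1)/2.
    """
    if w_rate <= 0:
        raise ValueError("W-linked substitution rate must be positive")
    ratio = z_rate / w_rate
    return (3.0 * ratio - 1.0) / 2.0


# Equal Z and W rates imply no male bias (alpha = 1).
print(alpha_m_from_zw(1.0, 1.0))
# A Z rate twice the W rate implies alpha = 2.5 under this model.
print(alpha_m_from_zw(2.0, 1.0))
```

Because the estimate depends only on the Z/W ratio for a given intron pair, regional rate variation of the kind the abstract reports would feed directly into alpha(m).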
Affiliation(s)
- Sofia Berlin
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18 D, Uppsala, SE-752 36, Sweden
|
21
|
Schmitz J, Piskurek O, Zischler H. Forty million years of independent evolution: a mitochondrial gene and its corresponding nuclear pseudogene in primates. J Mol Evol 2005; 61:1-11. [PMID: 16007490 DOI: 10.1007/s00239-004-0293-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Received: 09/30/2004] [Accepted: 02/25/2005] [Indexed: 10/25/2022]
Abstract
Sequences from nuclear mitochondrial pseudogenes (numts) that originated by transfer of genetic information from mitochondria to the nucleus offer a unique opportunity to compare different regimes of molecular evolution. Analyzing a 1621-nt-long numt of the rRNA-specifying mitochondrial DNA residing on human chromosome 3 and its corresponding mitochondrial gene in 18 anthropoid primates, we were able to retrace about 40 MY of primate rDNA evolutionary history. The results illustrate strengths and weaknesses of mtDNA data sets in reconstructing and dating the phylogenetic history of primates. We were able to show the following. (1) In contrast to numt-DNA, the nucleotide composition of mtDNA changed dramatically in the different primate lineages. This is assumed to lead to significant misinterpretations of the mitochondrial evolutionary history. (2) Due to the nucleotide compositional plasticity of primate mtDNA, the phylogenetic reconstruction combining mitochondrial and nuclear sequences is unlikely to yield reliable information for either tree topologies or branch lengths. This is because a major part of the underlying sequence evolution model, the nucleotide composition, is undergoing dramatic change in different mitochondrial lineages. We propose that this problem is also expressed in the occasional unexpected long branches leading to the "common ancestor" of orthologous numt sequences of different primate taxa. (3) The heterogeneous and lineage-specific evolution of mitochondrial sequences in primates renders molecular dating based on primate mtDNA problematic, whereas the numt sequences provide a much more reliable base for dating.
Affiliation(s)
- Jürgen Schmitz
- Institute of Experimental Pathology, ZMBE, University of Münster, Von-Esmarch-Str. 56, D-48149 Münster, Germany.
|
22
|
Abstract
BACKGROUND A frequent observation in molecular evolution is that amino-acid substitution rates show an index of dispersion (that is, ratio of variance to mean) substantially larger than one. This observation has been termed the overdispersed molecular clock. On the basis of in silico protein-evolution experiments, Bastolla and coworkers recently proposed an explanation for this observation: Proteins drift in neutral space, and can temporarily get trapped in regions of substantially reduced neutrality. In these regions, substitution rates are suppressed, which results in an overall substitution process that is not Poissonian. However, the simulation method of Bastolla et al. is representative only for cases in which the product of mutation rate μ and population size Ne is small. How the substitution process behaves when μNe is large is not known. RESULTS Here, I study the behavior of the molecular clock in in silico protein evolution as a function of mutation rate and population size. I find that the index of dispersion decays with increasing μNe, and approaches 1 for large μNe. This observation can be explained with the selective pressure for mutational robustness, which is effective when μNe is large. This pressure keeps the population out of low-neutrality traps, and thus steadies the ticking of the molecular clock. CONCLUSIONS The molecular clock in neutral protein evolution can fall into two distinct regimes, a strongly overdispersed one for small μNe, and a mostly Poissonian one for large μNe. The former is relevant for the majority of organisms in the plant and animal kingdom, and the latter may be relevant for RNA viruses.
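The index of dispersion defined in the abstract (ratio of variance to mean) can be sketched in a few lines. This is a minimal illustration, not the author's simulation code: the function name `index_of_dispersion` and the toy substitution counts are hypothetical, and the use of the sample variance rather than the population variance is an assumption.

```python
from statistics import mean, variance


def index_of_dispersion(substitution_counts):
    """Return variance / mean for a list of substitution counts.

    Values near 1 are what a Poisson process would give; values well
    above 1 correspond to an overdispersed clock. Uses the sample
    variance (n - 1 denominator), which is an assumption here.
    """
    m = mean(substitution_counts)
    if m == 0:
        raise ValueError("mean substitution count must be non-zero")
    return variance(substitution_counts) / m


# Hypothetical counts of substitutions along four lineages.
counts = [1, 3, 5, 7]
print(index_of_dispersion(counts))
```

For the toy counts above, the mean is 4 and the sample variance is 20/3, so the ratio is about 1.67, which would read as mildly overdispersed.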
Affiliation(s)
- Claus O Wilke
- Keck Graduate Institute of Applied Life Sciences, 535 Watson Drive, Claremont, California 91711, USA.
|