1
Gupta R, Verma SD. Two-Dimensional Fluctuation Correlation Spectroscopy (2D-FlucCS): A Method to Determine the Origin of Relaxation Rate Dispersion. ACS Meas Sci Au 2024; 4:153-162. [PMID: 38645580 PMCID: PMC11027202 DOI: 10.1021/acsmeasuresciau.3c00048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 01/11/2024] [Accepted: 01/11/2024] [Indexed: 04/23/2024]
Abstract
Relaxation rate dispersion, i.e., nonexponential or multicomponent kinetics, is observed in complex systems when measuring relaxation kinetics. Often, the origin of rate dispersion is associated with the heterogeneity in the system. However, both homogeneous (where all molecules experience the same rate but inherently nonexponential) and heterogeneous (where all molecules experience different rates) systems can exhibit rate dispersion. A multidimensional correlation analysis method has been demonstrated to detect and quantify rate dispersion observed in molecular rotation, diffusion, solvation, and reaction kinetics. One-dimensional (1D) autocorrelation function detects rate dispersion and measures its extent. Two-dimensional (2D) autocorrelation function measures the origin of rate dispersion and distinguishes homogeneous from heterogeneous. In a heterogeneous system, implicitly there exist subensembles of molecules experiencing different rates. A three-dimensional (3D) autocorrelation function measures subensemble exchange if present and reveals if the system possesses static or dynamic heterogeneity. This perspective discusses the principles, applications, and potential and also presents a future outlook of two-dimensional fluctuation correlation spectroscopy (2D-FlucCS). The method is applicable to any experiment or simulation where a time series of fluctuation in an observable (emission, scattering, current, etc.) around a mean value can be obtained in steady state (equilibrium or nonequilibrium), provided the system is ergodic.
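As a rough illustration of the 1D step described in this abstract, the sketch below computes a normalized autocorrelation of the fluctuations of an observable around its mean. The function name `fluctuation_autocorrelation` and the estimator used (a plain time average over one trajectory) are assumptions made here for illustration. This is not the authors' implementation, and it does not cover the 2D or 3D correlation functions.

```python
def fluctuation_autocorrelation(x, max_lag):
    """Normalized autocorrelation C(tau) of fluctuations around the mean.

    C(tau) = <dx(t) * dx(t + tau)> / <dx(t)^2>, with dx = x - mean(x).
    Returns a list [C(0), C(1), ..., C(max_lag)].
    """
    n = len(x)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must be in [0, len(x) - 1]")
    mean = sum(x) / n
    dx = [value - mean for value in x]           # fluctuations around the mean
    variance = sum(d * d for d in dx) / n
    if variance == 0.0:
        raise ValueError("series has no fluctuations")
    result = []
    for lag in range(max_lag + 1):
        pairs = n - lag                          # number of (t, t + lag) pairs
        covariance = sum(dx[t] * dx[t + lag] for t in range(pairs)) / pairs
        result.append(covariance / variance)
    return result


# Hypothetical input: a strictly alternating signal, which is perfectly
# anticorrelated at odd lags and perfectly correlated at even lags.
alternating = [1.0, -1.0] * 50
acf = fluctuation_autocorrelation(alternating, max_lag=2)
```

In practice the decay of C(τ) would then be fitted: a single-exponential decay suggests no rate dispersion, while a nonexponential or multicomponent decay indicates dispersion. That fitting step is outside this sketch.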
Affiliation(s)
- Ruchir Gupta
- Spectroscopy and Dynamics Visualization Laboratory, Department of Chemistry, Indian Institute of Science Education and Research Bhopal, Bhauri, Bhopal 462066, Madhya Pradesh, India
- Sachin Dev Verma
- Spectroscopy and Dynamics Visualization Laboratory, Department of Chemistry, Indian Institute of Science Education and Research Bhopal, Bhauri, Bhopal 462066, Madhya Pradesh, India
2
Duchêne DA, Duchêne S, Stiller J, Heller R, Ho SYW. ClockstaRX: Testing Molecular Clock Hypotheses With Genomic Data. Genome Biol Evol 2024; 16:evae064. [PMID: 38526019 PMCID: PMC10999959 DOI: 10.1093/gbe/evae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 01/11/2024] [Accepted: 03/21/2024] [Indexed: 03/26/2024] Open
Abstract
Phylogenomic data provide valuable opportunities for studying evolutionary rates and timescales. These analyses require theoretical and statistical tools based on molecular clocks. We present ClockstaRX, a flexible platform for exploring and testing evolutionary rate signals in phylogenomic data. Here, information about evolutionary rates in branches across gene trees is placed in Euclidean space, allowing data transformation, visualization, and hypothesis testing. ClockstaRX implements formal tests for identifying groups of loci and branches that make a large contribution to patterns of rate variation. This information can then be used to test for drivers of genomic evolutionary rates or to inform models for molecular dating. Drawing on the results of a simulation study, we recommend forms of data exploration and filtering that might be useful prior to molecular-clock analyses.
Affiliation(s)
- David A Duchêne
- Center for Evolutionary Hologenomics, University of Copenhagen, Copenhagen 1352, Denmark
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen 1352, Denmark
- Sebastián Duchêne
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3010, Australia
- Josefin Stiller
- Villum Centre for Biodiversity Genomics, University of Copenhagen, 2100 Copenhagen, Denmark
- Rasmus Heller
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen 2100, Denmark
- Simon Y W Ho
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW 2006, Australia
3
Barido-Sottani J, Morlon H. The ClaDS rate-heterogeneous birth-death prior for full phylogenetic inference in BEAST2. Syst Biol 2023; 72:1180-1187. [PMID: 37161619 PMCID: PMC10627560 DOI: 10.1093/sysbio/syad027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 01/16/2023] [Accepted: 04/24/2023] [Indexed: 05/11/2023] Open
Abstract
Bayesian phylogenetic inference requires a tree prior, which models the underlying diversification process that gives rise to the phylogeny. Existing birth-death diversification models include a wide range of features, for instance, lineage-specific variations in speciation and extinction (SSE) rates. While across-lineage variation in SSE rates is widespread in empirical datasets, few heterogeneous rate models have been implemented as tree priors for Bayesian phylogenetic inference. As a consequence, rate heterogeneity is typically ignored when reconstructing phylogenies, and rate heterogeneity is usually investigated on fixed trees. In this paper, we present a new BEAST2 package implementing the cladogenetic diversification rate shift (ClaDS) model as a tree prior. ClaDS is a birth-death diversification model designed to capture small progressive variations in birth and death rates along a phylogeny. Unlike previous implementations of ClaDS, which were designed to be used with fixed, user-chosen phylogenies, our package is implemented in the BEAST2 framework and thus allows full phylogenetic inference, where the phylogeny and model parameters are co-estimated from a molecular alignment. Our package provides all necessary components of the inference, including a new tree object and operators to propose moves to the Monte-Carlo Markov chain. It also includes a graphical interface through BEAUti. We validate our implementation of the package by comparing the produced distributions to simulated data and show an empirical example of the full inference, using a dataset of cetaceans.
Affiliation(s)
- Joëlle Barido-Sottani
- Institut de Biologie de l’ENS (IBENS), École normale supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
- Hélène Morlon
- Institut de Biologie de l’ENS (IBENS), École normale supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
4
Shafir A, Halabi K, Escudero M, Mayrose I. A non-homogeneous model of chromosome-number evolution to reveal shifts in the transition patterns across the phylogeny. New Phytol 2023; 238:1733-1744. [PMID: 36759331 DOI: 10.1111/nph.18805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 02/06/2023] [Indexed: 06/18/2023]
Abstract
Changes in chromosome numbers, including polyploidy and dysploidy events, play a key role in eukaryote evolution as they could expedite reproductive isolation and have the potential to foster phenotypic diversification. Deciphering the pattern of chromosome-number change within a phylogeny currently relies on probabilistic evolutionary models. All currently available models assume time homogeneity, such that the transition rates are identical throughout the phylogeny. Here, we develop heterogeneous models of chromosome-number evolution that allow multiple transition regimes to operate in distinct parts of the phylogeny. The partition of the phylogeny into distinct transition regimes may be specified by the researcher or, alternatively, identified using a sequential testing approach. Once the number and locations of shifts in the transition pattern are determined, a second search phase identifies regimes with similar transition dynamics, which could indicate convergent evolution. Using simulations, we study the performance of the developed model to detect shifts in patterns of chromosome-number evolution and demonstrate its applicability by analyzing the evolution of chromosome numbers within the Cyperaceae plant family. The developed model extends the capabilities of probabilistic models of chromosome-number evolution and should be particularly helpful for the analyses of large phylogenies that include multiple distinct subclades.
Affiliation(s)
- Anat Shafir
- School of Plant Sciences and Food Security, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
- Keren Halabi
- School of Plant Sciences and Food Security, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
- Marcial Escudero
- Department of Plant Biology and Ecology, University of Seville, Reina Mercedes, ES-41012, Seville, Spain
- Itay Mayrose
- School of Plant Sciences and Food Security, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, 69978, Israel
5
Kriebel R, Rose JP, Drew BT, González-Gallegos JG, Celep F, Heeg L, Mahdjoub MM, Sytsma KJ. Model selection, hummingbird natural history, and biological hypotheses: a response to Sazatornil et al. Evolution 2023; 77:646-653. [PMID: 36626811 DOI: 10.1093/evolut/qpac023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Revised: 10/19/2022] [Accepted: 11/11/2022] [Indexed: 01/12/2023]
Abstract
We have previously suggested that a shift from bee to hummingbird pollination, in concert with floral architecture modifications, occurred at the crown of Salvia subgenus Calosphace in North America ca. 20 mya (Kriebel et al. 2020 and references therein). Sazatornil et al. (2022), using a hidden states model, challenged these assertions, arguing that bees were the ancestral pollinator of subg. Calosphace and claiming that hummingbirds could not have been the ancestral pollinator of subg. Calosphace because hummingbirds were not contemporaneous with crown subg. Calosphace in North America. Here, using a variety of models, we demonstrate that most analyses support hummingbirds as ancestral pollinators of subg. Calosphace and show that Sazatornil et al. (2022) erroneously concluded that hummingbirds were absent from North America ca. 20 mya. We contend that "biological realism" - based on timing and placement of hummingbirds in Mexico ca. 20 mya and the correlative evolution of hummingbird associated floral traits - must be considered when comparing models based on fit and complexity, including hidden states models.
Affiliation(s)
- Ricardo Kriebel
- Department of Botany, California Academy of Sciences, San Francisco, CA, United States; Department of Botany, University of Wisconsin-Madison, Madison, WI, United States
- Jeffrey P Rose
- Department of Biology, University of Nebraska at Kearney, Kearney, NE, United States
- Bryan T Drew
- Department of Biology, University of Nebraska at Kearney, Kearney, NE, United States
- Ferhat Celep
- Kırıkkale University, Faculty of Arts and Sciences, Department of Biology, Yahşiyan, Turkey
- Luciann Heeg
- Department of Botany, California Academy of Sciences, San Francisco, CA, United States
- Mohamed M Mahdjoub
- Department of Biology, Faculty of Natural and Life Sciences and Earth Sciences, University of Bouira, Bouira, Algeria
- Kenneth J Sytsma
- Department of Botany, California Academy of Sciences, San Francisco, CA, United States
6
Boyko JD, Beaulieu JM. Reducing the biases in false correlations between discrete characters. Syst Biol 2022:6730956. [PMID: 36173613 DOI: 10.1093/sysbio/syac066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Indexed: 11/12/2022] Open
Abstract
The correlation between two characters is often interpreted as evidence that there exists a significant and biologically important relationship between them. However, Maddison and FitzJohn (2015) recently pointed out that evidence of correlated evolution between two categorical characters is often spurious, particularly, when the dependent relationship stems from a single replicate deep in time. Here we will show that there may, in fact, be a statistical solution to the problem posed by Maddison and FitzJohn (2015) naturally embedded within the expanded model space afforded by the hidden Markov model (HMM) framework. We demonstrate that the problem of single unreplicated evolutionary events manifests itself as rate heterogeneity within our models and that this is the source of the false correlation. Therefore, we argue that this problem is better understood as model misspecification rather than a failure of comparative methods to account for phylogenetic pseudoreplication. We utilize HMMs to develop a multi-rate independent model which, when implemented, drastically reduces support for correlation. The problem itself extends beyond categorical character evolution, but we believe that the practical solution presented here may lend itself to future extensions in other areas of comparative biology.
Affiliation(s)
- James D Boyko
- Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, 72701 USA
- Jeremy M Beaulieu
- Department of Biological Sciences, University of Arkansas, Fayetteville, Arkansas, 72701 USA
7
Ping J, Hao J, Li J, Yang Y, Su Y, Wang T. Loss of the IR region in conifer plastomes: Changes in the selection pressure and substitution rate of protein-coding genes. Ecol Evol 2022; 12:e8499. [PMID: 35136556 PMCID: PMC8809450 DOI: 10.1002/ece3.8499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2021] [Revised: 12/08/2021] [Accepted: 12/14/2021] [Indexed: 11/10/2022] Open
Abstract
Plastid genomes (plastomes) have a quadripartite structure, but some species have drastically reduced or lost inverted repeat (IR) regions. IR regions are important for genome stability and the evolution rate. In the evolutionary process of gymnosperms, the typical IRs of conifers were lost, possibly affecting the evolutionary rate and selection pressure of genomic protein-coding genes. In this study, we selected 78 gymnosperm species (51 genera, 13 families) for evolutionary analysis. The selection pressure analysis results showed that negative selection effects were detected in all 50 common genes. Among them, six genes in conifers had higher ω values than non-conifers, and 12 genes had lower ω values. The evolutionary rate analysis results showed that 9 of 50 common genes differed between conifers and non-conifers. It is more obvious that in non-conifers, the rates of psbA (trst, trsv, ratio, dN, dS, and ω) were 2.6- to 3.1-fold of conifers. In conifers, trsv, ratio, dN, dS, and ω of ycf2 were 1.2- to 3.6-fold of non-conifers. In addition, the evolution rate of ycf2 in the IR was significantly reduced. psbA is undergoing dynamic change, with an abnormally high evolution rate as a small portion of it enters the IR region. Although conifers have lost the typical IR regions, we detected no change in the substitution rate or selection pressure of most protein-coding genes due to gene function, plant habitat, or newly acquired IRs.
Affiliation(s)
- Jingyao Ping
- College of Life Sciences, South China Agricultural University, Guangzhou, China
- Jing Hao
- College of Life Sciences, South China Agricultural University, Guangzhou, China
- Jinye Li
- College of Life Sciences, South China Agricultural University, Guangzhou, China
- Yiqing Yang
- College of Life Science and Technology, Central South University of Forestry and Technology, Changsha, China
- Yingjuan Su
- School of Life Sciences, Sun Yat‐sen University, Guangzhou, China
- Research Institute of Sun Yat‐sen University, Shenzhen, China
- Ting Wang
- College of Life Sciences, South China Agricultural University, Guangzhou, China
8
Abstract
I analyzed various site pattern combinations in a 4-OTU case to identify sources of starless bias and parameter-estimation bias in likelihood-based phylogenetic methods, and reported three significant contributions. First, the likelihood method is counterintuitive in that it may not generate a star tree with sequences that are equidistant from each other. This behaviour, dubbed starless bias, happens in a 4-OTU tree when there is an excess (i.e., more than expected from a star tree and a substitution model) of conflicting phylogenetic signals supporting the three resolved topologies equally. Special site pattern combinations leading to rejection of a star tree, when sequences are equidistant from each other, were identified. Second, fitting gamma distribution to model rate heterogeneity over sites is strongly confounded with tree topology, especially in conjunction with the starless bias. I present examples to show dramatic differences in the estimated shape parameter α between a star tree and a resolved tree. There may be no rate heterogeneity over sites (with the estimated α > 10000) when a star tree is imposed, but α < 1 (suggesting strong rate heterogeneity over sites) when an (incorrect) resolved tree is imposed. Thus, the dependence of “rate heterogeneity” on tree topology implies that “rate heterogeneity” is not a sequence-specific feature, cautioning against interpreting a small α to mean that some sites are under strong purifying selection and others not. Thirdly, because there is no existing (and working) likelihood method for evaluating a star tree with continuous gamma-distributed rate, I have implemented the method for JC69 in a self-contained R script for a four-OTU tree (star or resolved), in addition to another R script assuming a constant rate over sites. These R scripts should be useful for teaching and exploring likelihood methods in phylogenetics.
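The link between the gamma shape parameter α and apparent rate heterogeneity under JC69 can be sketched with the standard closed-form site probabilities. The functions below are a minimal illustration written for this listing, with hypothetical names (`jc69_p_same`, `jc69_gamma_p_same`). They are not the author's R scripts and they do not evaluate a tree likelihood. The sketch assumes rates are gamma-distributed with mean 1 and that `d` is the expected number of substitutions per site along a branch.

```python
import math


def jc69_p_same(d):
    """Probability that a site shows the same nucleotide at both ends of a
    branch of length d (substitutions/site) under JC69 with a constant rate."""
    return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)


def jc69_gamma_p_same(d, alpha):
    """Same probability averaged over gamma-distributed site rates with
    shape alpha and mean 1 (continuous gamma, integrated analytically)."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return 0.25 + 0.75 * (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha)


# Illustrative values only: a very large alpha behaves like a constant rate,
# while alpha < 1 leaves many more sites unchanged at the same distance.
constant_rate = jc69_p_same(0.5)
nearly_constant = jc69_gamma_p_same(0.5, alpha=10000.0)
strong_heterogeneity = jc69_gamma_p_same(0.5, alpha=0.5)
```

With a very large α the gamma-averaged probability converges to the constant-rate value, which matches the abstract's reading of an estimated α > 10000 as no rate heterogeneity over sites.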
Affiliation(s)
- Xuhua Xia
- Department of Biology, University of Ottawa, Ottawa, Canada, K1N 6N5; Ottawa Institute of Systems Biology, Ottawa, Canada, K1H 8M5
9
Chen S, Saito N, Encabo JR, Yamada K, Choi IR, Kishima Y. Ancient Endogenous Pararetroviruses in Oryza Genomes Provide Insights into the Heterogeneity of Viral Gene Macroevolution. Genome Biol Evol 2018; 10:2686-2696. [PMID: 30239708 PMCID: PMC6179347 DOI: 10.1093/gbe/evy207] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 09/16/2018] [Indexed: 12/13/2022] Open
Abstract
Endogenous viral sequences in eukaryotic genomes, such as those derived from plant pararetroviruses (PRVs), can serve as genomic fossils to study viral macroevolution. Many aspects of viral evolutionary rates are heterogeneous, including substitution rate differences between genes. However, the evolutionary dynamics of this viral gene rate heterogeneity (GRH) have rarely been examined. Characterizing such GRH may help to elucidate viral adaptive evolution. In this study, based on robust phylogenetic analysis, we identified an ancient endogenous PRV group in Oryza genomes estimated to be 2.41-15.00 Myr old. We subsequently used this ancient endogenous PRV group and three younger groups to estimate the GRH of PRVs. Long-term substitution rates for the most conserved gene and a divergent gene were 2.69 × 10⁻⁸ to 8.07 × 10⁻⁸ and 4.72 × 10⁻⁸ to 1.42 × 10⁻⁷ substitutions/site/year, respectively. On the basis of a direct comparison, a long-term GRH of 1.83-fold was identified between these two genes, which is unexpectedly low and lower than the short-term GRH (>3.40-fold) of PRVs calculated using published data. The lower long-term GRH of PRVs was due to the slightly faster rate decay of divergent genes than of conserved genes during evolution. To the best of our knowledge, we quantified for the first time the long-term GRH of viral genes using paleovirological analyses, and proposed that the GRH of PRVs might be heterogeneous on time scales (time-dependent GRH). Our findings provide special insights into viral gene macroevolution and should encourage a more detailed examination of the viral GRH.
Affiliation(s)
- Sunlu Chen
- Laboratory of Plant Breeding, Research Faculty of Agriculture, Hokkaido University, Sapporo, Japan
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, College of Agriculture, Nanjing Agricultural University, Nanjing, China
- Nozomi Saito
- Laboratory of Plant Breeding, Research Faculty of Agriculture, Hokkaido University, Sapporo, Japan
- Jaymee R Encabo
- Laboratory of Plant Breeding, Research Faculty of Agriculture, Hokkaido University, Sapporo, Japan
- Rice Breeding Platform, International Rice Research Institute, Los Baños, Laguna, Philippines
- Microbiology Division, Institute of Biological Sciences, University of the Philippines Los Baños, Los Baños, Laguna, Philippines
- Kanae Yamada
- Laboratory of Plant Breeding, Research Faculty of Agriculture, Hokkaido University, Sapporo, Japan
- Il-Ryong Choi
- Rice Breeding Platform, International Rice Research Institute, Los Baños, Laguna, Philippines
- Yuji Kishima
- Laboratory of Plant Breeding, Research Faculty of Agriculture, Hokkaido University, Sapporo, Japan
10
Chira AM, Cooney CR, Bright JA, Capp EJR, Hughes EC, Moody CJA, Nouri LO, Varley ZK, Thomas GH. Correlates of rate heterogeneity in avian ecomorphological traits. Ecol Lett 2018; 21:1505-1514. [PMID: 30133084 PMCID: PMC6175488 DOI: 10.1111/ele.13131] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 04/03/2018] [Revised: 05/22/2018] [Accepted: 07/05/2018] [Indexed: 12/13/2022]
Abstract
Heterogeneity in rates of trait evolution is widespread, but it remains unclear which processes drive fast and slow character divergence across global radiations. Here, we test multiple hypotheses for explaining rate variation in an ecomorphological trait (beak shape) across a globally distributed group (birds). We find low support that variation in evolutionary rates of species is correlated with life history, environmental mutagenic factors, range size, number of competitors, or living on islands. Indeed, after controlling for the negative effect of species' age, 80% of variation in species‐specific evolutionary rates remains unexplained. At the clade level, high evolutionary rates are associated with unusual phenotypes or high species richness. Taken together, these results imply that macroevolutionary rates of ecomorphological traits are governed by both ecological opportunity in distinct adaptive zones and niche differentiation among closely related species.
Affiliation(s)
- A M Chira
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- C R Cooney
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- J A Bright
- School of Geosciences, University of South Florida, Tampa, FL, USA
- E J R Capp
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- E C Hughes
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- C J A Moody
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- L O Nouri
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- Z K Varley
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK
- G H Thomas
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield, S10 2TN, UK; Bird Group, Department of Life Sciences, The Natural History Museum, Tring, Hertfordshire, UK
11
Foster CSP, Ho SYW. Strategies for Partitioning Clock Models in Phylogenomic Dating: Application to the Angiosperm Evolutionary Timescale. Genome Biol Evol 2018; 9:2752-2763. [PMID: 29036288 PMCID: PMC5647803 DOI: 10.1093/gbe/evx198] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Accepted: 09/25/2017] [Indexed: 12/14/2022] Open
Abstract
Evolutionary timescales can be inferred from molecular sequence data using a Bayesian phylogenetic approach. In these methods, the molecular clock is often calibrated using fossil data. The uncertainty in these fossil calibrations is important because it determines the limiting posterior distribution for divergence-time estimates as the sequence length tends to infinity. Here, we investigate how the accuracy and precision of Bayesian divergence-time estimates improve with the increased clock-partitioning of genome-scale data into clock-subsets. We focus on a data set comprising plastome-scale sequences of 52 angiosperm taxa. There was little difference among the Bayesian date estimates whether we chose clock-subsets based on patterns of among-lineage rate heterogeneity or relative rates across genes, or by random assignment. Increasing the degree of clock-partitioning usually led to an improvement in the precision of divergence-time estimates, but this increase was asymptotic to a limit presumably imposed by fossil calibrations. Our clock-partitioning approaches yielded highly precise age estimates for several key nodes in the angiosperm phylogeny. For example, when partitioning the data into 20 clock-subsets based on patterns of among-lineage rate heterogeneity, we inferred crown angiosperms to have arisen 198–178 Ma. This demonstrates that judicious clock-partitioning can improve the precision of molecular dating based on phylogenomic data, but the meaning of this increased precision should be considered critically.
Affiliation(s)
- Charles S P Foster
- School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
- Simon Y W Ho
- School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales 2006, Australia
12
Smith SA, Pease JB. Heterogeneous molecular processes among the causes of how sequence similarity scores can fail to recapitulate phylogeny. Brief Bioinform 2017; 18:451-457. [PMID: 27103098 PMCID: PMC5429007 DOI: 10.1093/bib/bbw034] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/19/2016] [Indexed: 11/24/2022] Open
Abstract
Sequence similarity tools like Basic Local Alignment Search Tool (BLAST) are essential components of many functional genetic, genomic, phylogenetic and bioinformatic studies. Many modern analysis pipelines use significant sequence similarity scores (p- or E-values) and the ranked order of BLAST matches to test a wide range of hypotheses concerning homology, orthology, the timing of de novo gene birth/death and gene family expansion/contraction. Despite significant contrary findings, many of these tests still implicitly assume that stronger or higher-ranked E-value scores imply closer phylogenetic relationships between sequences. Here, we demonstrate that even though a general relationship does exist between the phylogenetic distance of two sequences and their E-value, significant and misleading errors occur in both the completeness and the order of results under realistic evolutionary scenarios. These results provide additional details to past evidence showing that studies should avoid drawing direct inferences of evolutionary relatedness from measures of sequence similarity alone, and should instead, where possible, use more rigorous phylogeny-based methods.
Affiliation(s)
- Stephen A Smith
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, USA
- Corresponding author: Stephen A. Smith, Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan 48109, USA. E-mail:
- James B Pease
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, USA
13
Salomo K, Smith JF, Feild TS, Samain MS, Bond L, Davidson C, Zimmers J, Neinhuis C, Wanke S. The Emergence of Earliest Angiosperms may be Earlier than Fossil Evidence Indicates. Syst Bot 2017; 42:607-619. [PMID: 29398773 PMCID: PMC5792071 DOI: 10.1600/036364417x696438] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 05/02/2023]
Abstract
Gaps between molecular ages and fossils undermine the validity of time-calibrated molecular phylogenies. An example of the time gap surrounds the age of angiosperm's origin. We calculate molecular ages of the earliest flowering plant lineages using 22 fossil calibrations (101 genera, 40 families). Our results reveal the origin of angiosperms at the late Permian, ~275 million years ago. Different prior probability curves of molecular age calculations on dense calibration point distributions had little effect on overall age estimates compared to the effects of altered calibration points. The same is true for reasonable root age constraints. We conclude that our age estimates based on multiple datasets, priors, and calibration points are robust and the true ages are likely between our extremes. Our results, when integrated with the ecophysiological evolution of early angiosperms, imply that the ecology of the earliest angiosperms is critical to understand the pre-Cretaceous evolution of flowering plants.
Collapse
Affiliation(s)
- Karsten Salomo
- Technische Universität Dresden, Zellescher Weg 20b, 01062 Dresden, Germany
- James F. Smith
- Department of Biological Sciences, Boise State University, 1910 University Drive, Boise, Idaho, 83725, U.S.A.
- Taylor S. Feild
- School of Marine and Tropical Biology, James Cook University, Townsville, Queensland, Australia
- Marie-Stéphanie Samain
- Instituto de Ecología A.C., Centro Regional del Bajío, Avenida Lázaro Cárdenas 253, 61600 Pátzcuaro, Mexico
- Ghent University, Department of Biology, Research Group Spermatophytes, K. L. Ledeganckstraat 35, B-9000 Gent, Belgium
- Laura Bond
- Department of Biological Sciences, Boise State University, 1910 University Drive, Boise, Idaho, 83725, U.S.A.
- Jay Zimmers
- Department of Biological Sciences, Boise State University, 1910 University Drive, Boise, Idaho, 83725, U.S.A.
- Christoph Neinhuis
- Technische Universität Dresden, Zellescher Weg 20b, 01062 Dresden, Germany
- Stefan Wanke
- Technische Universität Dresden, Zellescher Weg 20b, 01062 Dresden, Germany

14
Li FW, Kuo LY, Pryer KM, Rothfels CJ. Genes Translocated into the Plastid Inverted Repeat Show Decelerated Substitution Rates and Elevated GC Content. Genome Biol Evol 2016; 8:2452-8. [PMID: 27401175 PMCID: PMC5010901 DOI: 10.1093/gbe/evw167] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 12/17/2022] Open
Abstract
Plant chloroplast genomes (plastomes) are characterized by an inverted repeat (IR) region and two larger single copy (SC) regions. Patterns of molecular evolution in the IR and SC regions differ, most notably by a reduced rate of nucleotide substitution in the IR compared to the SC region. In addition, the organization and structure of plastomes is fluid, and rearrangements through time have repeatedly shuffled genes into and out of the IR, providing recurrent natural experiments on how chloroplast genome structure can impact rates and patterns of molecular evolution. Here we examine four loci (psbA, ycf2, rps7, and rps12 exon 2-3) that were translocated from the SC into the IR during fern evolution. We use a model-based method, within a phylogenetic context, to test for substitution rate shifts. All four loci show a significant, 2- to 3-fold deceleration in their substitution rate following translocation into the IR, a phenomenon not observed in any other, nontranslocated plastid genes. Also, we show that after translocation, the GC content of the third codon position and of the noncoding regions is significantly increased, implying that gene conversion within the IR is GC-biased. Taken together, our results suggest that the IR region not only reduces substitution rates, but also impacts nucleotide composition. This finding highlights a potential vulnerability of correlating substitution rate heterogeneity with organismal life history traits without knowledge of the underlying genome structure.
Affiliation(s)
- Fay-Wei Li
- University Herbarium and Department of Integrative Biology, University of California, Berkeley; Department of Biology, Duke University, Durham
- Li-Yaung Kuo
- Institute of Ecology and Evolutionary Biology, National Taiwan University, Taipei
- Carl J Rothfels
- University Herbarium and Department of Integrative Biology, University of California, Berkeley

15
Abstract
Phylogenetic inference artifacts can occur when sequence evolution deviates from assumptions made by the models used to analyze them. The combination of strong model assumption violations and highly heterogeneous lineage evolutionary rates can become problematic in phylogenetic inference, and lead to the well-described long-branch attraction (LBA) artifact. Here, we define an objective criterion for assessing lineage evolutionary rate heterogeneity among predefined lineages: the result of a likelihood ratio test between a model in which the lineages evolve at the same rate (homogeneous model) and a model in which different lineage rates are allowed (heterogeneous model). We implement this criterion in the algorithm Locus Specific Sequence Subsampling (LS³), aimed at reducing the effects of LBA in multi-gene datasets. For each gene, LS³ sequentially removes the fastest-evolving taxon of the ingroup and tests for lineage rate homogeneity until all lineages have uniform evolutionary rates. The sequences excluded from the homogeneously evolving taxon subset are flagged as potentially problematic. The software implementation provides the user with the possibility to remove the flagged sequences for generating a new concatenated alignment. We tested LS³ with simulations and two real datasets containing LBA artifacts: a nucleotide dataset regarding the position of Glires within mammals and an amino-acid dataset concerning the position of nematodes within bilaterians. The initially incorrect phylogenies were corrected in all cases upon removing data flagged by LS³.
Affiliation(s)
- Carlos J Rivera-Rivera
- Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland; Institute of Genetics and Genomics in Geneva (iGE3), Geneva, Switzerland
- Juan I Montoya-Burgos
- Department of Genetics and Evolution, University of Geneva, Geneva, Switzerland; Institute of Genetics and Genomics in Geneva (iGE3), Geneva, Switzerland

16
Barrett CF, Baker WJ, Comer JR, Conran JG, Lahmeyer SC, Leebens-Mack JH, Li J, Lim GS, Mayfield-Jones DR, Perez L, Medina J, Pires JC, Santos C, Wm Stevenson D, Zomlefer WB, Davis JI. Plastid genomes reveal support for deep phylogenetic relationships and extensive rate variation among palms and other commelinid monocots. New Phytol 2016; 209:855-70. [PMID: 26350789 DOI: 10.1111/nph.13617] [Citation(s) in RCA: 126] [Impact Index Per Article: 15.8] [Received: 05/21/2015] [Accepted: 07/23/2015] [Indexed: 05/03/2023]
Abstract
Despite progress based on multilocus, phylogenetic studies of the palms (order Arecales, family Arecaceae), uncertainty remains in resolution/support among major clades and for the placement of the palms among the commelinid monocots. Palms and related commelinids represent a classic case of substitution rate heterogeneity that has not been investigated in the genomic era. To address questions of relationships, support and rate variation among palms and commelinid relatives, 39 plastomes representing the palms and related family Dasypogonaceae were generated via genome skimming and integrated within a monocot-wide matrix for phylogenetic and molecular evolutionary analyses. Support was strong for 'deep' relationships among the commelinid orders, among the five palm subfamilies, and among tribes of the subfamily Coryphoideae. Additionally, there was extreme heterogeneity in the plastid substitution rates across the commelinid orders indicated by model based analyses, with c. 22 rate shifts, and significant departure from a global clock. To date, this study represents the most comprehensively sampled matrix of plastomes assembled for monocot angiosperms, providing genome-scale support for phylogenetic relationships of monocot angiosperms, and lays the phylogenetic groundwork for comparative analyses of the drivers and correlates of such drastic differences in substitution rates across a diverse and significant clade.
Affiliation(s)
- Craig F Barrett
- Department of Biological Sciences, California State University, Los Angeles, CA, 90032, USA
- Division of Plant and Soil Sciences, West Virginia University, Morgantown, WV, 26506, USA
- Jason R Comer
- Department of Plant Biology, University of Georgia, Athens, GA, 30602, USA
- John G Conran
- Department of Genetics and Evolution, School of Biological Sciences, University of Adelaide, Adelaide, 5005, Australia
- Sean C Lahmeyer
- Herbarium, The Huntington Library, Art Collection, and Botanical Gardens, San Marino, CA, 91108, USA
- Jeff Li
- Graduate Program in Genetics, Genomics, and Bioinformatics, University of California, Riverside, CA, 92521, USA
- Gwynne S Lim
- L. H. Bailey Hortorium and Plant Biology Section, Cornell University, Ithaca, NY, 14853, USA
- Dustin R Mayfield-Jones
- Donald Danforth Plant Science Center, St Louis, MO, 63132, USA
- Division of Biological Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA
- Leticia Perez
- Department of Biological Sciences, California State University, Los Angeles, CA, 90032, USA
- Jesus Medina
- Department of Biological Sciences, California State University, Los Angeles, CA, 90032, USA
- J Chris Pires
- Division of Biological Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, 65211, USA
- Cristian Santos
- Department of Biological Sciences, California State University, Los Angeles, CA, 90032, USA
- Dennis Wm Stevenson
- Pfizer Laboratory of Molecular Systematics, New York Botanical Garden, Bronx, NY, 10458, USA
- Wendy B Zomlefer
- Herbarium, The Huntington Library, Art Collection, and Botanical Gardens, San Marino, CA, 91108, USA
- Jerrold I Davis
- L. H. Bailey Hortorium and Plant Biology Section, Cornell University, Ithaca, NY, 14853, USA

17
Abstract
The molecular clock has played an important role in biological research, both as a description of the evolutionary process and as a tool for inferring evolutionary timescales. Genomic data have provided valuable insights into the molecular clock, allowing the patterns and causes of evolutionary rate variation to be characterized in increasing detail. I explain how genome sequences offer exciting opportunities for estimating the timescale of the Tree of Life. I describe the different approaches that have been used to deal with the computational and statistical challenges encountered in molecular clock analyses of genomic data. Finally, I offer a perspective on the future of molecular clocks, highlighting some of the key limitations and the most promising research directions.
Affiliation(s)
- Simon Y W Ho
- School of Biological Sciences, University of Sydney, Sydney, NSW, Australia.

18
Librado P, Vieira FG, Sánchez-Gracia A, Kolokotronis SO, Rozas J. Mycobacterial phylogenomics: an enhanced method for gene turnover analysis reveals uneven levels of gene gain and loss among species and gene families. Genome Biol Evol 2014; 6:1454-65. [PMID: 24904011 PMCID: PMC4079203 DOI: 10.1093/gbe/evu117] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022] Open
Abstract
Species of the genus Mycobacterium differ in several features, from geographic ranges, and degree of pathogenicity, to ecological and host preferences. The recent availability of several fully sequenced genomes for a number of these species enabled the comparative study of the genetic determinants of this wide lifestyle diversity. Here, we applied two complementary phylogenetic-based approaches using information from 19 Mycobacterium genomes to obtain a more comprehensive view of the evolution of this genus. First, we inferred the phylogenetic relationships using two new approaches, one based on a Mycobacterium-specific amino acid substitution matrix and the other on a gene content dissimilarity matrix. Then, we utilized our recently developed gain-and-death stochastic models to study gene turnover dynamics in this genus in a maximum-likelihood framework. We uncovered a scenario that differs markedly from traditional 16S rRNA data and improves upon recent phylogenomic approaches. We also found that the rates of gene gain and death are high and unevenly distributed both across species and across gene families, further supporting the utility of the new models of rate heterogeneity applied in a phylogenetic context. Finally, the functional annotation of the most expanded or contracted gene families revealed that the transposable elements and the fatty acid metabolism-related gene families are the most important drivers of gene content evolution in Mycobacterium.
Affiliation(s)
- Pablo Librado
- Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Filipe G Vieira
- Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain; Department of Integrative Biology, University of California, Berkeley
- Alejandro Sánchez-Gracia
- Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Sergios-Orestis Kolokotronis
- Department of Biological Sciences, Fordham University; Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York
- Julio Rozas
- Departament de Genètica and Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain

19
Abstract
The genomes of related species contain valuable information on the history of the considered taxa. Great apes in particular exhibit variation of evolutionary patterns along their genomes. However, the great ape data also bring new challenges, such as the presence of incomplete lineage sorting and ancestral shared polymorphisms. Previous methods for genome-scale analysis are restricted to very few individuals or cannot disentangle the contribution of mutation rates and fixation biases. This represents a limitation both for the understanding of these forces as well as for the detection of regions affected by selection. Here, we present a new model designed to estimate mutation rates and fixation biases from genetic variation within and between species. We relax the assumption of instantaneous substitutions, modeling substitutions as mutational events followed by a gradual fixation. Hence, we straightforwardly account for shared ancestral polymorphisms and incomplete lineage sorting. We analyze genome-wide synonymous site alignments of human, chimpanzee, and two orangutan species. From each taxon, we include data from several individuals. We estimate mutation rates and GC-biased gene conversion intensity. We find that both mutation rates and biased gene conversion vary with GC content. We also find lineage-specific differences, with weaker fixation biases in orangutan species, suggesting a reduced historical effective population size. Finally, our results are consistent with directional selection acting on coding sequences in relation to exonic splicing enhancers.
Affiliation(s)
- Nicola De Maio
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria

20
Forest F. Calibrating the Tree of Life: fossils, molecules and evolutionary timescales. Ann Bot 2009; 104:789-94. [PMID: 19666901 PMCID: PMC2749537 DOI: 10.1093/aob/mcp192] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 04/09/2009] [Revised: 06/11/2009] [Accepted: 07/09/2009] [Indexed: 05/09/2023]
Abstract
BACKGROUND: Molecular dating has gained ever-increasing interest since the molecular clock hypothesis was proposed in the 1960s. Molecular dating provides detailed temporal frameworks for divergence events in phylogenetic trees, allowing diverse evolutionary questions to be addressed. The key aspect of the molecular clock hypothesis, namely that differences in DNA or protein sequence between two species are proportional to the time elapsed since they diverged, was soon shown to be untenable. Other approaches were proposed to take into account rate heterogeneity among lineages, but the calibration process, by which relative times are transformed into absolute ages, has received little attention until recently. New methods have now been proposed to resolve potential sources of error associated with the calibration of phylogenetic trees, particularly those involving use of the fossil record. SCOPE AND CONCLUSIONS: The use of the fossil record as a source of independent information in the calibration process is the main focus of this paper; other sources of calibration information are also discussed. Particularly error-prone aspects of fossil calibration are identified, such as fossil dating, the phylogenetic placement of the fossil and the incompleteness of the fossil record. Methods proposed to tackle one or more of these potential error sources are discussed (e.g. fossil cross-validation, prior distribution of calibration points and confidence intervals on the fossil record). In conclusion, the fossil record remains the most reliable source of information for the calibration of phylogenetic trees, although associated assumptions and potential bias must be taken into account.
Affiliation(s)
- Félix Forest
- Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3DS, UK.

21
Wróbel B, Torres-Puente M, Jiménez N, Bracho MA, García-Robles I, Moya A, González-Candelas F. Analysis of the overdispersed clock in the short-term evolution of hepatitis C virus: Using the E1/E2 gene sequences to infer infection dates in a single source outbreak. Mol Biol Evol 2006; 23:1242-53. [PMID: 16585120 PMCID: PMC7542578 DOI: 10.1093/molbev/msk012] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Accepted: 03/23/2006] [Indexed: 02/07/2023] Open
Abstract
The assumption of a molecular clock for dating events from sequence information is often frustrated by the presence of heterogeneity among evolutionary rates due, among other factors, to positively selected sites. In this work, our goal is to explore methods to estimate infection dates from sequence analysis. One such method, based on site stripping for clock detection, was proposed to unravel the clocklike molecular evolution in sequences showing high variability of evolutionary rates and in the presence of positive selection. Other alternatives imply accommodating heterogeneity in evolutionary rates at various levels, without eliminating any information from the data. Here we present the analysis of a data set of hepatitis C virus (HCV) sequences from 24 patients infected by a single individual with known dates of infection. We first used a simple criterion of relative substitution rate for site removal prior to a regression analysis. Time was regressed on maximum likelihood pairwise evolutionary distances between the sequences sampled from the source individual and infected patients. We show that it is indeed the fastest evolving sites that disturb the molecular clock and that these sites correspond to positively selected codons. The high computational efficiency of the regression analysis allowed us to compare the site-stripping scheme with random removal of sites. We demonstrate that removing the fast-evolving sites significantly increases the accuracy of estimation of infection times based on a single substitution rate. However, the time-of-infection estimations improved substantially when a more sophisticated and computationally demanding Bayesian method was used. This method was used with the same data set but keeping all the sequence positions in the analysis. 
Consequently, despite the distortion introduced by positive selection on evolutionary rates, it is possible to obtain quite accurate estimates of infection dates, a result of especial relevance for molecular epidemiology studies.
Affiliation(s)
- Borys Wróbel
- Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, Valencia, Spain.