1
|
Benefits and Limits of Phasing Alleles for Network Inference of Allopolyploid Complexes. Syst Biol 2024:syae024. [PMID: 38733563 DOI: 10.1093/sysbio/syae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Indexed: 05/13/2024] Open
Abstract
Accurately reconstructing the reticulate histories of polyploids remains a central challenge for understanding plant evolution. Although phylogenetic networks can provide insights into relationships among polyploid lineages, inferring networks may be hindered by the complexities of homology determination in polyploid taxa. We use simulations to show that phasing alleles from allopolyploid individuals can improve phylogenetic network inference under the multispecies coalescent by obtaining the true network with fewer loci compared to haplotype consensus sequences or sequences with heterozygous bases represented as ambiguity codes. Phased allelic data can also improve divergence time estimates for networks, which is helpful for evaluating allopolyploid speciation hypotheses and proposing mechanisms of speciation. To achieve these outcomes in empirical data, we present a novel pipeline that leverages a recently developed phasing algorithm to reliably phase alleles from polyploids. This pipeline is especially appropriate for target enrichment data, where depth of coverage is typically high enough to phase entire loci. We provide an empirical example in the North American Dryopteris fern complex that demonstrates insights from phased data as well as the challenges of network inference. We establish that our pipeline (PATÉ: Phased Alleles from Target Enrichment data) is capable of recovering a high proportion of phased loci from both diploids and polyploids. These data may improve network estimates compared to using haplotype consensus assemblies by accurately inferring the direction of gene flow, but statistical non-identifiability of phylogenetic networks poses a barrier to inferring the evolutionary history of reticulate complexes.
|
2
|
Estimation of species divergence times in presence of cross-species gene flow. Syst Biol 2023; 72:820-836. [PMID: 36961245 PMCID: PMC10405360 DOI: 10.1093/sysbio/syad015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2021] [Accepted: 03/22/2023] [Indexed: 03/25/2023] Open
Abstract
Cross-species introgression can have significant impacts on phylogenomic reconstruction of species divergence events. Here, we used simulations to show how the presence of even a small amount of introgression can bias divergence time estimates when gene flow is ignored in the analysis. Using advances in analytical methods under the multispecies coalescent (MSC) model, we demonstrate that by accounting for incomplete lineage sorting and introgression using large phylogenomic data sets this problem can be avoided. The multispecies-coalescent-with-introgression (MSci) model is capable of accurately estimating both divergence times and ancestral effective population sizes, even when only a single diploid individual per species is sampled. We characterize some general expectations for biases in divergence time estimation under three different scenarios: 1) introgression between sister species, 2) introgression between non-sister species, and 3) introgression from an unsampled (i.e., ghost) outgroup lineage. We also conducted simulations under the isolation-with-migration (IM) model and found that the MSci model assuming episodic gene flow was able to accurately estimate species divergence times despite high levels of continuous gene flow. We estimated divergence times under the MSC and MSci models from two published empirical datasets with previous evidence of introgression, one of 372 target-enrichment loci from baobabs (Adansonia), and another of 1000 transcriptome loci from 14 species of the tomato relative, Jaltomata. The empirical analyses not only confirm our findings from simulations, demonstrating that the MSci model can reliably estimate divergence times but also show that divergence time estimation under the MSC can be robust to the presence of small amounts of introgression in empirical datasets with extensive taxon sampling. [divergence time; gene flow; hybridization; introgression; MSci model; multispecies coalescent].
|
3
|
Phylogenomic analyses provide insights into primate evolution. Science 2023; 380:913-924. [PMID: 37262173 DOI: 10.1126/science.abn6919] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 12/16/2021] [Accepted: 01/26/2023] [Indexed: 06/03/2023]
Abstract
Comparative analysis of primate genomes within a phylogenetic context is essential for understanding the evolution of human genetic architecture and primate diversity. We present such a study of 50 primate species spanning 38 genera and 14 families, including 27 genomes first reported here, with many from previously less well represented groups, the New World monkeys and the Strepsirrhini. Our analyses reveal heterogeneous rates of genomic rearrangement and gene evolution across primate lineages. Thousands of genes under positive selection in different lineages play roles in the nervous, skeletal, and digestive systems and may have contributed to primate innovations and adaptations. Our study reveals that many key genomic innovations occurred in the Simiiformes ancestral node and may have had an impact on the adaptive radiation of the Simiiformes and human evolution.
|
4
|
A first complete phylogenomic hypothesis for diploid blueberries (Vaccinium section Cyanococcus). American Journal of Botany 2022; 109:1596-1606. [PMID: 36109839 DOI: 10.1002/ajb2.16065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/08/2022] [Revised: 06/08/2022] [Accepted: 08/23/2022] [Indexed: 06/15/2023]
Abstract
PREMISE The true blueberries (Vaccinium sect. Cyanococcus; Ericaceae), endemic to North America, have been intensively studied for over a century. However, with species estimates ranging from nine to 24 and much confusion regarding species boundaries, this ecologically and economically valuable group remains inadequately understood at a basic evolutionary and taxonomic level. As a first step toward understanding the evolutionary history and taxonomy of this species complex, we present the first phylogenomic hypothesis of the known diploid blueberries. METHODS We used flow cytometry to verify the ploidy of putative diploid taxa and a target-enrichment approach to obtain a genomic data set for phylogenetic analyses. RESULTS Despite evidence of gene flow, we found that a primary phylogenetic signal is present. Monophyly for all morphospecies was recovered, with two notable exceptions: one sample of V. boreale was consistently nested in the V. myrtilloides clade and V. caesariense was nested in the V. fuscatum clade. One diploid taxon, Vaccinium pallidum, is implicated as having a homoploid hybrid origin. CONCLUSIONS This foundational study represents the first attempt to elucidate evolutionary relationships of the true blueberries of North America with a phylogenomic approach and sets the stage for multiple avenues of future study such as a taxonomic revision of the group, the verification of a homoploid hybrid taxon, and the study of polyploid lineages within the context of a diploid phylogeny.
|
5
|
Population genomic structure in Goodman's mouse lemur reveals long-standing separation of Madagascar's Central Highlands and eastern rainforests. Mol Ecol 2022; 31:4901-4918. [PMID: 35880414 DOI: 10.1111/mec.16632] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/19/2020] [Revised: 06/25/2022] [Accepted: 07/08/2022] [Indexed: 11/28/2022]
Abstract
Madagascar's Central Highlands are largely composed of grasslands, interspersed with patches of forest. The historical perspective was that Madagascar's grasslands had anthropogenic origins, but emerging evidence suggests that grasslands were a component of the pre-human Central Highlands vegetation. Consequently, there is now vigorous debate regarding the extent to which these grasslands have expanded due to anthropogenic pressures. Here, we shed light on the temporal dynamics of Madagascar's vegetative composition by conducting a population genomic investigation of Goodman's mouse lemur (Microcebus lehilahytsara; Cheirogaleidae). These small-bodied primates occur both in Madagascar's eastern rainforests and in the Central Highlands, making them a valuable indicator species. Population divergences among forest-dwelling mammals will reflect changes to their habitat, including fragmentation, whereas patterns of post-divergence gene flow can reveal formerly wooded migration corridors. To explore these patterns, we used RADseq data to infer population genetic structure, demographic models of post-divergence gene flow, and population size change through time. The results offer evidence that open habitats are an ancient component of the Central Highlands, and that wide-spread forest fragmentation occurred naturally during a period of decreased precipitation near the Last Glacial Maximum. Models of gene flow suggest that migration across the Central Highlands has been possible from the Pleistocene through the recent Holocene via riparian corridors. Though our findings support the hypothesis that Central Highland grasslands predate human arrival, we also find evidence for human-mediated population declines. This highlights the extent to which species imminently threatened by human-mediated deforestation may already be vulnerable from paleoclimatic conditions.
|
6
|
The mutationathon highlights the importance of reaching standardization in estimates of pedigree-based germline mutation rates. eLife 2022; 11:73577. [PMID: 35018888 PMCID: PMC8830884 DOI: 10.7554/elife.73577] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 09/02/2021] [Accepted: 01/11/2022] [Indexed: 11/13/2022] Open
Abstract
In the past decade, several studies have estimated the human per-generation germline mutation rate using large pedigrees. More recently, estimates for various nonhuman species have been published. However, methodological differences among studies in detecting germline mutations and estimating mutation rates make direct comparisons difficult. Here, we describe the many different steps involved in estimating pedigree-based mutation rates, including sampling, sequencing, mapping, variant calling, filtering, and appropriately accounting for false-positive and false-negative rates. For each step, we review the different methods and parameter choices that have been used in the recent literature. Additionally, we present the results from a ‘Mutationathon,’ a competition organized among five research labs to compare germline mutation rate estimates for a single pedigree of rhesus macaques. We report almost a twofold variation in the final estimated rate among groups using different post-alignment processing, calling, and filtering criteria, and provide details into the sources of variation across studies. Though the difference among estimates is not statistically significant, this discrepancy emphasizes the need for standardized methods in mutation rate estimations and the difficulty in comparing rates from different studies. Finally, this work aims to provide guidelines for computational and statistical benchmarks for future studies interested in identifying germline mutations from pedigrees.
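As a rough illustration of the correction step this abstract describes, the sketch below uses a form of the per-generation rate calculation that is common in pedigree-based studies: candidate de novo mutations are scaled down by the false-discovery rate and divided by twice the callable genome size, itself scaled by one minus the false-negative rate. The function name, parameter names, and input numbers are hypothetical and are not taken from the study.

```python
import math


def per_generation_mutation_rate(n_candidates, callable_sites, fdr=0.0, fnr=0.0):
    """Estimate a per-site, per-generation germline mutation rate for one trio.

    Assumed formula (a common convention, not necessarily the one used by
    every group in the abstract above):

        mu = n_candidates * (1 - fdr) / (2 * callable_sites * (1 - fnr))

    n_candidates   -- candidate de novo mutations that passed filtering
    callable_sites -- haploid count of sites where a mutation could be called
    fdr            -- estimated false-discovery rate among candidates
    fnr            -- estimated false-negative rate within callable sites
    """
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    if not 0.0 <= fdr <= 1.0:
        raise ValueError("fdr must lie in [0, 1]")
    if not 0.0 <= fnr < 1.0:
        raise ValueError("fnr must lie in [0, 1)")

    # Expected number of true mutations among the candidates.
    true_mutations = n_candidates * (1.0 - fdr)
    # The factor of 2 accounts for the two haploid genomes of the offspring.
    effective_sites = 2.0 * callable_sites * (1.0 - fnr)
    return true_mutations / effective_sites


if __name__ == "__main__":
    # Illustrative inputs only: 30 candidates over 1 Gb of callable sequence.
    uncorrected = per_generation_mutation_rate(30, 1_000_000_000)
    corrected = per_generation_mutation_rate(30, 1_000_000_000, fdr=0.10, fnr=0.05)
    print(f"uncorrected: {uncorrected:.3e}")
    print(f"corrected:   {corrected:.3e}")
```

Because the candidate count, the callable-site count, and both error rates all enter the estimate multiplicatively, different filtering choices at any of these steps can plausibly shift the final rate, which is the kind of between-group variation the abstract reports.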
|
7
|
Pedigree-based and phylogenetic methods support surprising patterns of mutation rate and spectrum in the gray mouse lemur. Heredity (Edinb) 2021; 127:233-244. [PMID: 34272504 PMCID: PMC8322134 DOI: 10.1038/s41437-021-00446-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 05/20/2020] [Revised: 05/25/2021] [Accepted: 05/26/2021] [Indexed: 02/06/2023] Open
Abstract
Mutations are the raw material on which evolution acts, and knowledge of their frequency and genomic distribution is crucial for understanding how evolution operates at both long and short timescales. At present, the rate and spectrum of de novo mutations have been directly characterized in relatively few lineages. Our study provides the first direct mutation-rate estimate for a strepsirrhine (i.e., the lemurs and lorises), which comprises nearly half of the primate clade. Using high-coverage linked-read sequencing for a focal quartet of gray mouse lemurs (Microcebus murinus), we estimated the mutation rate to be among the highest calculated for a mammal at 1.52 × 10⁻⁸ (95% credible interval: 1.28 × 10⁻⁸ to 1.78 × 10⁻⁸) mutations/site/generation. Further, we found an unexpectedly low count of paternal mutations, and only a modest overrepresentation of mutations at CpG sites. Despite the surprising nature of these results, we found both the rate and spectrum to be robust to the manipulation of a wide range of computational filtering criteria. We also sequenced a technical replicate to estimate a false-negative and false-positive rate for our data and show that any point estimate of a de novo mutation rate should be considered with a large degree of uncertainty. For validation, we conducted an independent analysis of context-dependent substitution types for gray mouse lemur and five additional primate species for which de novo mutation rates have also been estimated. These comparisons revealed general consistency of the mutation spectrum between the pedigree-based and the substitution-rate analyses for all species compared.
|
8
|
The challenge and promise of estimating the de novo mutation rate from whole-genome comparisons among closely related individuals. Mol Ecol 2021; 30:6087-6100. [PMID: 34062029 DOI: 10.1111/mec.16007] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/18/2021] [Revised: 04/22/2021] [Accepted: 05/26/2021] [Indexed: 12/20/2022]
Abstract
Germline mutations are the raw material for natural selection, driving species evolution and the generation of earth's biodiversity. Without this driver of genetic diversity, life on earth would stagnate. Yet, it is a double-edged sword. An excess of mutations can have devastating effects on fitness and population viability. It is therefore one of the great challenges of molecular ecology to determine the rate and mechanisms by which these mutations accrue across the tree of life. Advances in high-throughput sequencing technologies are providing new opportunities for characterizing the rates and mutational spectra within species and populations, thus informing essential evolutionary parameters such as the timing of speciation events, the intricacies of historical demography, and the degree to which lineages are subject to the burdens of mutational load. Here, we will focus on both the challenge and promise of whole-genome comparisons among parents and their offspring from known pedigrees for the detection of germline mutations as they arise in a single generation. The potential of these studies is high, but the field is still in its infancy and much uncertainty remains. Namely, the technical challenges are daunting: pedigree-based genome comparisons are essentially searching for needles in a haystack, given the very low signal-to-noise ratio. Despite the challenges, we predict that rapidly developing methods for whole-genome comparisons hold great promise for integrating empirically derived estimates of de novo mutation rates and mutation spectra across many molecular ecological applications.
|
9
|
Gene-rich UV sex chromosomes harbor conserved regulators of sexual development. Science Advances 2021; 7(27):eabh2488. [PMID: 34193417 PMCID: PMC8245031 DOI: 10.1126/sciadv.abh2488] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 02/24/2021] [Accepted: 05/14/2021] [Indexed: 05/19/2023]
Abstract
Nonrecombining sex chromosomes, like the mammalian Y, often lose genes and accumulate transposable elements, a process termed degeneration. The correlation between suppressed recombination and degeneration is clear in animal XY systems, but the absence of recombination is confounded with other asymmetries between the X and Y. In contrast, UV sex chromosomes, like those found in bryophytes, experience symmetrical population genetic conditions. Here, we generate nearly gapless female and male chromosome-scale reference genomes of the moss Ceratodon purpureus to test for degeneration in the bryophyte UV sex chromosomes. We show that the moss sex chromosomes evolved over 300 million years ago and expanded via two chromosomal fusions. Although the sex chromosomes exhibit weaker purifying selection than autosomes, we find that suppressed recombination alone is insufficient to drive degeneration. Instead, the U and V sex chromosomes harbor thousands of broadly expressed genes, including numerous key regulators of sexual development across land plants.
|
10
|
A target enrichment probe set for resolving the flagellate land plant tree of life. Applications in Plant Sciences 2021; 9:e11406. [PMID: 33552748 PMCID: PMC7845764 DOI: 10.1002/aps3.11406] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 05/25/2020] [Accepted: 11/05/2020] [Indexed: 05/08/2023]
Abstract
PREMISE New sequencing technologies facilitate the generation of large-scale molecular data sets for constructing the plant tree of life. We describe a new probe set for target enrichment sequencing to generate nuclear sequence data to build phylogenetic trees with any flagellate land plants, including hornworts, liverworts, mosses, lycophytes, ferns, and all gymnosperms. METHODS We leveraged existing transcriptome and genome sequence data to design the GoFlag 451 probes, a set of 56,989 probes for target enrichment sequencing of 451 exons that are found in 248 single-copy or low-copy nuclear genes across flagellate plant lineages. RESULTS Our results indicate that target enrichment using the GoFlag451 probe set can provide large nuclear data sets that can be used to resolve relationships among both distantly and closely related taxa across the flagellate land plants. We also describe the GoFlag 408 probes, an optimized probe set covering 408 of the 451 exons from the GoFlag 451 probe set that is commercialized by RAPiD Genomics. CONCLUSIONS A target enrichment approach using the new probe set provides a relatively low-cost solution to obtain large-scale nuclear sequence data for inferring phylogenetic relationships across flagellate land plants.
|
11
|
A target enrichment probe set for resolving the flagellate land plant tree of life. Applications in Plant Sciences 2021. [PMID: 33552748 DOI: 10.1101/2020.05.29.124081] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/08/2023]
Abstract
PREMISE New sequencing technologies facilitate the generation of large-scale molecular data sets for constructing the plant tree of life. We describe a new probe set for target enrichment sequencing to generate nuclear sequence data to build phylogenetic trees with any flagellate land plants, including hornworts, liverworts, mosses, lycophytes, ferns, and all gymnosperms. METHODS We leveraged existing transcriptome and genome sequence data to design the GoFlag 451 probes, a set of 56,989 probes for target enrichment sequencing of 451 exons that are found in 248 single-copy or low-copy nuclear genes across flagellate plant lineages. RESULTS Our results indicate that target enrichment using the GoFlag451 probe set can provide large nuclear data sets that can be used to resolve relationships among both distantly and closely related taxa across the flagellate land plants. We also describe the GoFlag 408 probes, an optimized probe set covering 408 of the 451 exons from the GoFlag 451 probe set that is commercialized by RAPiD Genomics. CONCLUSIONS A target enrichment approach using the new probe set provides a relatively low-cost solution to obtain large-scale nuclear sequence data for inferring phylogenetic relationships across flagellate land plants.
|
12
|
Cryptic Patterns of Speciation in Cryptic Primates: Microendemic Mouse Lemurs and the Multispecies Coalescent. Syst Biol 2020; 70:203-218. [PMID: 32642760 DOI: 10.1093/sysbio/syaa053] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 10/24/2019] [Revised: 06/13/2020] [Accepted: 06/23/2020] [Indexed: 12/21/2022] Open
Abstract
Mouse lemurs (Microcebus) are a radiation of morphologically cryptic primates distributed throughout Madagascar for which the number of recognized species has exploded in the past two decades. This taxonomic revision has prompted understandable concern that there has been substantial oversplitting in the mouse lemur clade. Here, we investigate mouse lemur diversity in a region in northeastern Madagascar with high levels of microendemism and predicted habitat loss. We analyzed RADseq data with multispecies coalescent (MSC) species delimitation methods for two pairs of sister lineages that include three named species and an undescribed lineage previously identified to have divergent mtDNA. Marked differences in effective population sizes, levels of gene flow, patterns of isolation-by-distance, and species delimitation results were found among the two pairs of lineages. Whereas all tests support the recognition of the presently undescribed lineage as a separate species, the species-level distinction of two previously described species, M. mittermeieri and M. lehilahytsara, is not supported, a result that is particularly striking when using the genealogical discordance index (gdi). Nonsister lineages occur sympatrically in two of the localities sampled for this study, despite an estimated divergence time of less than 1 Ma. This suggests rapid evolution of reproductive isolation in the focal lineages and in the mouse lemur clade generally. The divergence time estimates reported here are based on the MSC calibrated with pedigree-based mutation rates and are considerably more recent than previously published fossil-calibrated relaxed-clock estimates. We discuss the possible explanations for this discrepancy, noting that there are theoretical justifications for preferring the MSC estimates in this case. [Cryptic species; effective population size; microendemism; multispecies coalescent; speciation; species delimitation.].
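For readers unfamiliar with the gdi mentioned above, a minimal sketch follows. It assumes the commonly cited definition gdi = 1 − exp(−2τ/θ), where τ is the divergence time and θ the population-size parameter of the focal lineage under the MSC, both in expected substitutions per site. The cutoffs of 0.2 and 0.7 are widely used heuristics, not values reported in this abstract, and the function names are hypothetical.

```python
import math


def gdi(tau, theta):
    """Return the index under the assumed definition gdi = 1 - exp(-2 * tau / theta).

    tau   -- divergence time of the focal lineage pair (substitutions per site)
    theta -- population-size parameter of the focal lineage (4 * N * mu)
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if theta <= 0:
        raise ValueError("theta must be positive")
    return 1.0 - math.exp(-2.0 * tau / theta)


def interpret_gdi(value, lower=0.2, upper=0.7):
    """Map a gdi value to a heuristic category (thresholds are assumptions)."""
    if value < lower:
        return "single species"
    if value > upper:
        return "distinct species"
    return "ambiguous"


if __name__ == "__main__":
    # Illustrative parameter values only, not estimates from the study.
    for tau, theta in [(0.0001, 0.004), (0.001, 0.002), (0.004, 0.002)]:
        value = gdi(tau, theta)
        print(f"tau={tau}, theta={theta}: gdi={value:.3f} ({interpret_gdi(value)})")
```

Under this definition the index depends only on the ratio of divergence time to population size, so a recently diverged lineage with a large effective population size scores low even when other delimitation tests favor splitting.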
|
13
|
Carey SB, Jenkins J, Lovell JT, Maumus F, Sreedasyam A, Payton AC, Shu S, Tiley GP, Fernandez-Pozo N, Barry K, Chen C, Wang M, Lipzen A, Daum C, Saski CA, McBreen JC, Conrad RE, Kollar LM, Olsson S, Huttunen S, Landis JB, Burleigh JG, Wickett NJ, Johnson MG, Rensing SA, Grimwood J, Schmutz J, McDaniel SF. The Ceratodon purpureus genome uncovers structurally complex, gene rich sex chromosomes. [DOI: 10.1101/2020.07.03.163634] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/06/2023]
Abstract
Non-recombining sex chromosomes, like the mammalian Y, often lose genes and accumulate transposable elements, a process termed degeneration [1,2]. The correlation between suppressed recombination and degeneration is clear in animal XY systems [1,2], but the absence of recombination is confounded with other asymmetries between the X and Y. In contrast, UV sex chromosomes, like those found in bryophytes, experience symmetrical population genetic conditions [3,4]. Here we test for degeneration in the bryophyte UV sex chromosome system through genomic comparisons with new female and male chromosome-scale reference genomes of the moss Ceratodon purpureus. We show that the moss sex chromosomes evolved over 300 million years ago and expanded via two chromosomal fusions. Although the sex chromosomes show signs of weaker purifying selection than autosomes, we find suppressed recombination alone is insufficient to drive gene loss on sex-specific chromosomes. Instead, the U and V sex chromosomes harbor thousands of broadly-expressed genes, including numerous key regulators of sexual development across land plants.
|
14
|
Comparative Genomic Analysis of the Pheromone Receptor Class 1 Family (V1R) Reveals Extreme Complexity in Mouse Lemurs (Genus, Microcebus) and a Chromosomal Hotspot across Mammals. Genome Biol Evol 2020; 12:3562-3579. [PMID: 31555816 PMCID: PMC6944220 DOI: 10.1093/gbe/evz200] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 09/08/2019] [Indexed: 12/14/2022] Open
Abstract
Sensory gene families are of special interest for both what they can tell us about molecular evolution and what they imply as mediators of social communication. The vomeronasal type-1 receptors (V1Rs) have often been hypothesized as playing a fundamental role in driving or maintaining species boundaries given their likely function as mediators of intraspecific mate choice, particularly in nocturnal mammals. Here, we employ a comparative genomic approach for revealing patterns of V1R evolution within primates, with a special focus on the small-bodied nocturnal mouse and dwarf lemurs of Madagascar (genera Microcebus and Cheirogaleus, respectively). By doubling the existing genomic resources for strepsirrhine primates (i.e. the lemurs and lorises), we find that the highly speciose and morphologically cryptic mouse lemurs have experienced an elaborate proliferation of V1Rs that we argue is functionally related to their capacity for rapid lineage diversification. Contrary to a previous study that found equivalent degrees of V1R diversity in diurnal and nocturnal lemurs, our study finds a strong correlation between nocturnality and V1R elaboration, with nocturnal lemurs showing elaborate V1R repertoires and diurnal lemurs showing less diverse repertoires. Recognized subfamilies among V1Rs show unique signatures of diversifying positive selection, as might be expected if they have each evolved to respond to specific stimuli. Furthermore, a detailed syntenic comparison of mouse lemurs with mouse (genus Mus) and other mammalian outgroups shows that orthologous mammalian subfamilies, predicted to be of ancient origin, tend to cluster in a densely populated region across syntenic chromosomes that we refer to as a V1R "hotspot."
|
15
|
Assessing the Performance of Ks Plots for Detecting Ancient Whole Genome Duplications. Genome Biol Evol 2018; 10:2882-2898. [PMID: 30239709 PMCID: PMC6225891 DOI: 10.1093/gbe/evy200] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Accepted: 09/14/2018] [Indexed: 02/06/2023] Open
Abstract
Genomic data have provided evidence of previously unknown ancient whole genome duplications (WGDs) and highlighted the role of WGDs in the evolution of many eukaryotic lineages. Ancient WGDs often are detected by examining distributions of synonymous substitutions per site (Ks) within a genome, or “Ks plots.” For example, WGDs can be detected from Ks plots by using univariate mixture models to identify peaks in Ks distributions. We performed gene family simulation experiments to evaluate the effects of different Ks estimation methods and mixture models on our ability to detect ancient WGDs from Ks plots. The simulation experiments, which accounted for variation in substitution rates and gene duplication and loss rates across gene families, tested the effects of WGD age and gene retention rates following WGD on inferring WGDs from Ks plots. Our simulations reveal limitations of Ks plot analyses. Strict interpretations of mixture model analyses often overestimate the number of WGD events, and Ks plot analyses typically fail to detect WGDs when ≤10% of the duplicated genes are retained following the WGD. However, WGDs can accurately be characterized over an intermediate range of Ks. The simulation results are supported by empirical analyses of transcriptomic data, which also suggest that biases in gene retention likely affect our ability to detect ancient WGDs. Although our results indicate mixture model results should be interpreted with great caution, using node-averaged Ks estimates and applying more appropriate mixture models can improve the accuracy of detecting WGDs.
|
16
|
Comparison of the Chinese bamboo partridge and red Junglefowl genome sequences highlights the importance of demography in genome evolution. BMC Genomics 2018; 19:336. [PMID: 29739321 PMCID: PMC5941490 DOI: 10.1186/s12864-018-4711-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/15/2017] [Accepted: 04/23/2018] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Recent large-scale whole genome sequencing efforts in birds have elucidated broad patterns of avian phylogeny and genome evolution. However, despite the great interest in economically important phasianids like Gallus gallus (Red Junglefowl, the progenitor of the chicken), we know little about the genomes of closely related species. Gallus gallus is highly sexually dichromatic and polygynous, but its sister genus, Bambusicola, is smaller, sexually monomorphic, and monogamous with biparental care. We sequenced the genome of Bambusicola thoracicus (Chinese Bamboo Partridge) using a single insert library to test hypotheses about genome evolution in galliforms. Selection acting at the phenotypic level could result in more evidence of positive selection in the Gallus genome than in Bambusicola. However, the historical range size of Bambusicola was likely smaller than Gallus, and demographic effects could lead to higher rates of nonsynonymous substitution in Bambusicola than in Gallus. RESULTS We generated a genome assembly suitable for evolutionary analyses. We examined the impact of selection on coding regions by examining shifts in the average nonsynonymous to synonymous rate ratio (dN/dS) and the proportion of sites subject to episodic positive selection. We observed elevated dN/dS in Bambusicola relative to Gallus, which is consistent with our hypothesis that demographic effects may be important drivers of genome evolution in Bambusicola. We also demonstrated that alignment error can greatly inflate estimates of the number of genes that experienced episodic positive selection and heterogeneity in dN/dS. However, overall patterns of molecular evolution were robust to alignment uncertainty. Bambusicola thoracicus has higher estimates of heterozygosity than Gallus gallus, possibly due to migration events over the past 100,000 years. 
CONCLUSIONS Our results emphasized the importance of demographic processes in generating the patterns of variation between Bambusicola and Gallus. We also demonstrated that genome assemblies generated using a single library can provide valuable insights into avian evolutionary history and found that it is important to account for alignment uncertainty in evolutionary inferences from draft genomes.
|
17
|
|
18
|
Evaluating and Characterizing Ancient Whole-Genome Duplications in Plants with Gene Count Data. Genome Biol Evol 2016; 8:1023-37. [PMID: 26988251 PMCID: PMC4860690 DOI: 10.1093/gbe/evw058] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 12/16/2022] Open
Abstract
Whole-genome duplications (WGDs) have helped shape the genomes of land plants, and recent evidence suggests that the genomes of all angiosperms have experienced at least two ancient WGDs. In plants, WGDs are often followed by rapid fractionation, in which many homeologous gene copies are lost. Thus, it can be extremely difficult to identify, let alone characterize, ancient WGDs. In this study, we use a new maximum likelihood estimator to test for evidence of ancient WGDs in land plants and to estimate the fraction of new gene copies that are retained following a WGD using gene count data, that is, the number of gene copies in gene families. We identified evidence of many putative ancient WGDs in land plants and found that genome fractionation rates vary tremendously among ancient WGDs. Analyses of WGDs within Brassicales also indicate that background gene duplication and loss rates vary across land plants, and that different gene families have different probabilities of being retained following a WGD. Although our analyses are largely robust to errors in duplication and loss rates and the choice of priors, simulations indicate that this method can have trouble detecting multiple WGDs that occur on the same branch, especially when the gene retention rates for ancient WGDs are very low. The simulations also suggest that we should carefully evaluate evidence for some ancient plant WGD hypotheses.
|
19
|
Erratum to: The relationship of recombination rate, genome structure, and patterns of molecular evolution across angiosperms. BMC Evol Biol 2015; 15:244. [PMID: 26553259 PMCID: PMC4640390 DOI: 10.1186/s12862-015-0525-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/27/2015] [Accepted: 10/28/2015] [Indexed: 11/19/2022] Open
|
20
|
Phylogenetics and diversification of morning glories (tribe Ipomoeeae, Convolvulaceae) based on whole plastome sequences. American Journal of Botany 2014; 101:92-103. [PMID: 24375828 DOI: 10.3732/ajb.1300207] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 05/04/2023]
Abstract
PREMISE OF THE STUDY Morning glories are an emerging model system, and resolving phylogenetic relationships is critical for understanding their evolution. Phylogenetic studies demonstrated that the largest morning glory genus, Ipomoea, is not monophyletic, and nine other genera are derived from within Ipomoea. Therefore, systematic research is focused on the monophyletic tribe Ipomoeeae (ca. 650-900 species). We used whole plastomes to infer relationships across Ipomoeeae. METHODS Whole plastomes were sequenced for 29 morning glory species, representing major lineages. Phylogenies were estimated using alignments of 82 plastid genes and whole plastomes. Divergence times were estimated using three fossil calibration points. Finally, evolution of root architecture, flower color, and ergot alkaloid presence was examined. KEY RESULTS Phylogenies estimated from both data sets had nearly identical topologies. Phylogenetic results are generally consistent with prior phylogenetic hypotheses. Higher-level relationships with weak support in previous studies were recovered here with strong support. Molecular dating analysis suggests a late Eocene divergence time for the Ipomoeeae. The two clades within the tribe, Argyreiinae and Astripomoeinae, diversified at similar times. The reconstructed most recent common ancestor of the Ipomoeeae had blue flowers, an association with ergot-producing fungi, and either tuberous or fibrous roots. CONCLUSIONS Phylogenetic results provide confidence in relationships among Ipomoeeae lineages. Divergence time estimation results provide a temporal context for diversification of morning glories. Ancestral character reconstructions support previous findings that morning glory morphology is evolutionarily labile. Taken together, our study provides strong resolution of the morning glory phylogeny, which is broadly applicable to the evolution and ecology of these fascinating species.
|