1
Diamantidis D, Fan WTL, Birkner M, Wakeley J. Bursts of coalescence within population pedigrees whenever big families occur. Genetics 2024; 227:iyae030. [PMID: 38408329 DOI: 10.1093/genetics/iyae030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 01/23/2024] [Accepted: 02/18/2024] [Indexed: 02/28/2024]
Abstract
We consider a simple diploid population-genetic model with potentially high variability of offspring numbers among individuals. Specifically, against a backdrop of Wright-Fisher reproduction and no selection, there is an additional probability that a big family occurs, meaning that a pair of individuals has a number of offspring on the order of the population size. We study how the pedigree of the population generated under this model affects the ancestral genetic process of a sample of size two at a single autosomal locus without recombination. Our population model is of the type for which multiple-merger coalescent processes have been described. We prove that the conditional distribution of the pairwise coalescence time given the random pedigree converges to a limit law as the population size tends to infinity. This limit law may or may not be the usual exponential distribution of the Kingman coalescent, depending on the frequency of big families. But because it includes the number and times of big families, it differs from the usual multiple-merger coalescent models. The usual multiple-merger coalescent models are seen as describing the ancestral process marginal to, or averaging over, the pedigree. In the limiting ancestral process conditional on the pedigree, the intervals between big families can be modeled using the Kingman coalescent but each big family causes a discrete jump in the probability of coalescence. Analogous results should hold for larger samples and other population models. We illustrate these results with simulations and additional analysis, highlighting their implications for inference and understanding of multilocus data.
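As an informal illustration of the last structural point in this abstract (a sketch, not the authors' construction): if time is measured in coalescent units so that pairwise coalescence occurs at rate 1 between big families, and the i-th big family at time t_i makes the pair coalesce with some probability p_i, then the conditional survival function is exp(-t) multiplied by (1 - p_i) for every big family with t_i <= t. The function name, the unit rate, and the per-event probabilities p_i are assumptions made for this example only.

```python
import math


def pairwise_survival(t, big_family_times, jump_probs):
    """P(T > t | pedigree) under the illustrative assumptions above.

    t                -- time in coalescent units (hypothetical scaling, rate 1)
    big_family_times -- times t_i at which big families occur in the pedigree
    jump_probs       -- assumed coalescence probability p_i at each big family
    """
    # Kingman-like exponential decay between big families.
    survival = math.exp(-t)
    # Each big family at or before time t multiplies survival by (1 - p_i),
    # which is the discrete jump in the probability of coalescence.
    for t_i, p_i in zip(big_family_times, jump_probs):
        if t_i <= t:
            survival *= 1.0 - p_i
    return survival
```

With no big families the function reduces to the exponential survival function of the Kingman coalescent; each big family in the pedigree introduces a downward step at its own time.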
Affiliation(s)
- Wai-Tong Louis Fan
  - Department of Mathematics, Indiana University, Bloomington, IN 47405, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Matthias Birkner
  - Institut für Mathematik, Johannes-Gutenberg-Universität, 55099 Mainz, Germany
- John Wakeley
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
2
Sabin S, Morales-Arce AY, Pfeifer SP, Jensen JD. The impact of frequently neglected model violations on bacterial recombination rate estimation: a case study in Mycobacterium canettii and Mycobacterium tuberculosis. G3 (Bethesda, Md.) 2022; 12:jkac055. [PMID: 35253851 PMCID: PMC9073693 DOI: 10.1093/g3journal/jkac055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2021] [Accepted: 02/28/2022] [Indexed: 12/04/2022]
Abstract
Mycobacterium canettii is a causative agent of tuberculosis in humans, along with the members of the Mycobacterium tuberculosis complex. Frequently used as an outgroup to the M. tuberculosis complex in phylogenetic analyses, M. canettii is thought to offer the best proxy for the progenitor species that gave rise to the complex. Here, we leverage whole-genome sequencing data and biologically relevant population genomic models to compare the evolutionary dynamics driving variation in the recombining M. canettii with that in the nonrecombining M. tuberculosis complex, and discuss differences in observed genomic diversity in the light of expected levels of Hill-Robertson interference. In doing so, we highlight the methodological challenges of estimating recombination rates through traditional population genetic approaches using sequences called from populations of microorganisms and evaluate the likely mis-inference that arises owing to a neglect of common model violations including purifying selection, background selection, progeny skew, and population size change. In addition, we compare performance when full within-host polymorphism data are utilized, versus the more common approach of basing analyses on within-host consensus sequences.
Affiliation(s)
- Susanna Sabin
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Ana Y Morales-Arce
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Susanne P Pfeifer
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Jeffrey D Jensen
  - Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
3
Vendrami DLJ, Peck LS, Clark MS, Eldon B, Meredith M, Hoffman JI. Sweepstake reproductive success and collective dispersal produce chaotic genetic patchiness in a broadcast spawner. Science Advances 2021; 7:eabj4713. [PMID: 34516767 PMCID: PMC8442859 DOI: 10.1126/sciadv.abj4713] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/14/2021] [Accepted: 07/22/2021] [Indexed: 06/13/2023]
Abstract
A long-standing paradox of marine populations is chaotic genetic patchiness (CGP), temporally unstable patterns of genetic differentiation that occur below the geographic scale of effective dispersal. Several mechanisms are hypothesized to explain CGP including natural selection, spatiotemporal fluctuations in larval source populations, self-recruitment, and sweepstake reproduction. Discriminating among them is extremely difficult but is fundamental to understanding how marine organisms reproduce and disperse. Here, we report a notable example of CGP in the Antarctic limpet, an unusually tractable system where multiple confounding explanations can be discounted. Using population genomics, temporally replicated sampling, surface drifters, and forward genetic simulations, we show that CGP likely arises from an extreme sweepstake event together with collective larval dispersal, while selection appears to be unimportant. Our results illustrate the importance of neutral demographic forces in natural populations and have important implications for understanding the recruitment dynamics, population connectivity, local adaptation, and resilience of marine populations.
Affiliation(s)
- David L. J. Vendrami
  - Department of Animal Behaviour, Bielefeld University, Postfach 100131, 33501 Bielefeld, Germany
- Lloyd S. Peck
  - British Antarctic Survey, High Cross, Madingley Road, Cambridge CB3 0ET, UK
- Melody S. Clark
  - British Antarctic Survey, High Cross, Madingley Road, Cambridge CB3 0ET, UK
- Bjarki Eldon
  - Leibniz Institute for Evolution and Biodiversity Research, Museum für Naturkunde, 10115 Berlin, Germany
- Michael Meredith
  - British Antarctic Survey, High Cross, Madingley Road, Cambridge CB3 0ET, UK
- Joseph I. Hoffman
  - Department of Animal Behaviour, Bielefeld University, Postfach 100131, 33501 Bielefeld, Germany
  - British Antarctic Survey, High Cross, Madingley Road, Cambridge CB3 0ET, UK
4
Harris RB, Jensen JD. Considering Genomic Scans for Selection as Coalescent Model Choice. Genome Biol Evol 2020; 12:871-877. [PMID: 32396636 PMCID: PMC7313662 DOI: 10.1093/gbe/evaa093] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Accepted: 05/06/2020] [Indexed: 12/17/2022]
Abstract
First inspired by the seminal work of Lewontin and Krakauer (1973. Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics 74(1):175-195.) and Maynard Smith and Haigh (1974. The hitch-hiking effect of a favourable gene. Genet Res. 23(1):23-35.), genomic scans for positive selection remain a widely utilized tool in modern population genomic analysis. Yet, the relative frequency and genomic impact of selective sweeps have remained a contentious point in the field for decades, largely owing to an inability to accurately identify their presence and quantify their effects, with current methodologies generally being characterized by low true-positive rates and/or high false-positive rates under many realistic demographic models. Most of these approaches are based on Wright-Fisher assumptions and the Kingman coalescent and generally rely on detecting outlier regions which do not conform to these neutral expectations. However, previous theoretical results have demonstrated that selective sweeps are well characterized by an alternative class of model known as the multiple-merger coalescent. Taken together, this suggests the possibility of not simply identifying regions which reject the Kingman coalescent, but rather explicitly testing the relative fit of a genomic window to the multiple-merger coalescent. We describe the advantages of such an approach, which owe to the branching structure differentiating selective and neutral models, and demonstrate improved power under certain demographic scenarios relative to a commonly used approach. However, regions of the demographic parameter space continue to exist in which neither this approach nor existing methodologies have sufficient power to detect selective sweeps.
5
Inferring Demography and Selection in Organisms Characterized by Skewed Offspring Distributions. Genetics 2019; 211:1019-1028. [PMID: 30651284 DOI: 10.1534/genetics.118.301684] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 10/11/2018] [Accepted: 01/15/2019] [Indexed: 01/01/2023]
Abstract
The recent increase in time-series population genomic data from experimental, natural, and ancient populations has been accompanied by a promising growth in methodologies for inferring demographic and selective parameters from such data. However, these methods have largely presumed that the populations of interest are well-described by the Kingman coalescent. In reality, many groups of organisms, including viruses, marine organisms, and some plants, protists, and fungi, typified by high variance in progeny number, may be best characterized by multiple-merger coalescent models. Estimation of population genetic parameters under Wright-Fisher assumptions for these organisms may thus be prone to serious mis-inference. We propose a novel method for the joint inference of demography and selection under the Ψ-coalescent model, termed Multiple-Merger Coalescent Approximate Bayesian Computation, or MMC-ABC. We first demonstrate mis-inference under the Kingman, and then exhibit the superior performance of MMC-ABC under conditions of skewed offspring distributions. In order to highlight the utility of this approach, we reanalyzed previously published drug-selection lines of influenza A virus. We jointly inferred the extent of progeny-skew inherent to viral replication and identified putative drug-resistance mutations.
6
Coalescent Processes with Skewed Offspring Distributions and Nonequilibrium Demography. Genetics 2017; 208:323-338. [PMID: 29127263 DOI: 10.1534/genetics.117.300499] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 05/12/2017] [Accepted: 10/30/2017] [Indexed: 11/18/2022]
Abstract
Nonequilibrium demography impacts coalescent genealogies, leaving detectable, well-studied signatures of variation. However, similar genomic footprints are also expected under models of large reproductive skew, posing a serious problem when trying to make inference. Furthermore, current approaches consider only one of the two processes at a time, neglecting any genomic signal that could arise from their simultaneous effects, preventing the possibility of jointly inferring parameters relating to both offspring distribution and population history. Here, we develop an extended Moran model with exponential population growth, and demonstrate that the underlying ancestral process converges to a time-inhomogeneous psi-coalescent. However, by applying a nonlinear change of time scale, analogous to that used for the Kingman coalescent, we find that the ancestral process can be rescaled to its time-homogeneous analog, allowing the process to be simulated quickly and efficiently. Furthermore, we derive analytical expressions for the expected site-frequency spectrum under the time-inhomogeneous psi-coalescent, and develop an approximate-likelihood framework for the joint estimation of the coalescent and growth parameters. By means of extensive simulation, we demonstrate that both can be estimated accurately from whole-genome data. In addition, not accounting for demography can lead to serious biases in the inferred coalescent model, with broad implications for genomic studies ranging from ecology to conservation biology. Finally, we use our method to analyze sequence data from Japanese sardine populations, and find evidence of high variation in individual reproductive success, but few signs of a recent demographic expansion.
7
Molina C, Earn DJD. On selection in finite populations. J Math Biol 2017; 76:645-678. [PMID: 28664222 DOI: 10.1007/s00285-017-1151-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/13/2016] [Revised: 05/25/2017] [Indexed: 10/19/2022]
Abstract
Two major forces shaping evolution are drift and selection. The standard models of neutral drift, the Wright-Fisher (WF) and Moran processes, can be extended to include selection. However, these standard models are not always applicable in practice, and, even without selection, many other drift models make very different predictions. For example, "generalised Wright-Fisher" models (so-called because their first two conditional moments agree with those of the WF process) can yield wildly different absorption times from WF. Additionally, evolutionary stability in finite populations depends only on fixation probabilities, which can be evaluated under less restrictive assumptions than those required to estimate fixation times or more complex population-genetic quantities. We therefore distill the notion of a selection process into a broad class of finite-population, mutationless models of drift and selection (including the WF and Moran processes). We characterize when selection favours fixation of one strategy over another, for any selection process, which allows us to derive finite-population conditions for evolutionary stability independent of the selection process. In applications, the precise details of the selection process are seldom known, yet by exploiting these new theoretical results it is now possible to make rigorously justifiable inferences about fixation of traits.
Affiliation(s)
- Chai Molina
  - Department of Mathematics and Statistics, McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4K1, Canada
- David J D Earn
  - Department of Mathematics and Statistics, McMaster University, 1280 Main Street West, Hamilton, ON, L8S 4K1, Canada
8
On the importance of skewed offspring distributions and background selection in virus population genetics. Heredity (Edinb) 2016; 117:393-399. [PMID: 27649621 DOI: 10.1038/hdy.2016.58] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 04/12/2016] [Accepted: 06/08/2016] [Indexed: 12/16/2022]
Abstract
Many features of virus populations make them excellent candidates for population genetic study, including a very high rate of mutation, high levels of nucleotide diversity, exceptionally large census population sizes, and frequent positive selection. However, these attributes also mean that special care must be taken in population genetic inference. For example, highly skewed offspring distributions, frequent and severe population bottleneck events associated with infection and compartmentalization, and strong purifying selection all affect the distribution of genetic variation but are often not taken into account. Here, we draw particular attention to multiple-merger coalescent events and background selection, discuss potential misinference associated with these processes, and highlight potential avenues for better incorporating them into future population genetic analyses.
9
Inference Methods for Multiple Merger Coalescents. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
10
Árnason E, Halldórsdóttir K. Nucleotide variation and balancing selection at the Ckma gene in Atlantic cod: analysis with multiple merger coalescent models. PeerJ 2015; 3:e786. [PMID: 25755922 PMCID: PMC4349156 DOI: 10.7717/peerj.786] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 10/09/2014] [Accepted: 02/03/2015] [Indexed: 01/11/2023]
Abstract
High-fecundity organisms, such as Atlantic cod, can withstand substantial natural selection and the entailing genetic load of replacing alleles at a number of loci due to their excess reproductive capacity. High-fecundity organisms may reproduce by sweepstakes, leading to a highly skewed, heavy-tailed offspring distribution. Under such reproduction the Kingman coalescent of binary mergers breaks down and multiple-merger coalescent models are more appropriate. Here we study nucleotide variation at the Ckma (Creatine Kinase Muscle type A) gene in Atlantic cod. The gene shows extreme differentiation between the North (Canada, Greenland, Iceland, Norway, Barents Sea) and the South (Faroe Islands, North-, Baltic-, Celtic-, and Irish Seas) with FST > 0.8 between regions whereas neutral loci show no differentiation. This is evidence of natural selection. The protein sequence is conserved by purifying selection whereas silent and non-coding sites show extreme differentiation. The unfolded site-frequency spectrum has three modes, a mode at singleton sites and two high-frequency modes at opposite frequencies representing divergent branches of the gene genealogy, which is evidence for balancing selection. Analysis with multiple-merger coalescent models can account for the high frequency of singleton sites and indicates reproductive sweepstakes. Coalescent time scales vary with population size and with the inverse of variance in offspring number. Parameter estimates using multiple-merger coalescent models show that time scales are faster than under the Kingman coalescent.
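The scaling stated in this abstract can be made concrete for the simplest textbook case. As a sketch, assume a haploid exchangeable (Cannings) population of constant size N in which each individual's offspring number ν has mean 1 and variance σ². These symbols and the haploid setting are introduced here for illustration and are not taken from the paper. The per-generation probability c_N that two lineages share a parent, and the resulting coalescent time scale T_N in generations, are then:

```latex
c_N = \frac{\mathbb{E}\left[\nu_1(\nu_1 - 1)\right]}{N - 1} = \frac{\sigma^2}{N - 1},
\qquad
T_N = \frac{1}{c_N} = \frac{N - 1}{\sigma^2} \approx \frac{N}{\sigma^2}.
```

Under these assumptions a larger variance in offspring number shortens the time scale, in line with the faster time scales reported for multiple-merger models.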
Affiliation(s)
- Einar Árnason
  - Institute of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
- Katrín Halldórsdóttir
  - Institute of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
11
Li J, Zeng Y, Shen D, Xia G, Huang Y, Huang Y, Chang J, Huang J, Wang Z. Development of SSR Markers in Hickory (Carya cathayensis Sarg.) and Their Transferability to Other Species of Carya. Curr Genomics 2014; 15:357-79. [PMID: 25435799 PMCID: PMC4245696 DOI: 10.2174/138920291505141106103734] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/04/2014] [Revised: 08/25/2014] [Accepted: 09/11/2014] [Indexed: 12/29/2022]
Abstract
Hickory (Carya cathayensis Sarg.), an important nut-producing species in Southeastern China, has high economic value, but so far no cultivar has been bred within the species, although it is mostly propagated by seed and some elite individuals have been found. It has recently been found that this species has a certain rate of apomixis, and poor knowledge of its genetic background has hindered development of a feasible breeding strategy. Here we release, for the first time, SSR (simple sequence repeat) markers developed in this species and report their transferability to three other species of the same genus, Carya. A total of 311 pairs of SSR primers in hickory were developed based on sequenced cDNAs of a fruit development-associated cDNA library and RNA-seq data of developing female floral buds and could be used to distinguish hickory, C. hunanensis Cheng et R. H. Chang ex R. H. Chang et Lu, C. illinoensis K. Koch (pecan) and C. dabieshanensis M. C. Liu et Z. J. Li, but they were monomorphic in both hickory and C. hunanensis although multiple alleles have been identified in all four species. A transferability rate of 63.02% was observed between hickory and pecan, and the markers can be applied to study genetic diversity of accessions in pecan. When used in C. dabieshanensis, the markers revealed a number of alleles per locus ranging from 2 to 4, observed heterozygosity from 0 to 0.6667 and expected heterozygosity from 0.333 to 0.8667, which supports the existence of C. dabieshanensis as a separate species different from hickory and indicates that there is potential for selection and breeding in this species.
Affiliation(s)
- Juan Li
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
- Yanru Zeng
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
- Dengfeng Shen
  - College of Biological Sciences and Technology, Beijing Forestry University, Beijing 100083, China
- Guohua Xia
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
- Yinzhi Huang
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
- Youjun Huang
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
- Jun Chang
  - Research Institute of Subtropical Forestry, Chinese Academy of Forestry, Fuyang 311400, China
- Jianqin Huang
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
- Zhengjia Wang
  - The Nurturing Station for the State Key Laboratory of Subtropical Silviculture, Zhejiang A & F University, Lin'an 311300, China
12
Tellier A, Lemaire C. Coalescence 2.0: a multiple branching of recent theoretical developments and their applications. Mol Ecol 2014; 23:2637-52. [DOI: 10.1111/mec.12755] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.7] [Received: 01/23/2014] [Revised: 04/08/2014] [Accepted: 04/13/2014] [Indexed: 02/01/2023]
Affiliation(s)
- Aurélien Tellier
  - Section of Population Genetics; Center of Life and Food Sciences Weihenstephan; Technische Universität München; 85354 Freising Germany
- Christophe Lemaire
  - LUNAM; UMR1345 Institut de Recherche en Horticulture et Semences; Université d'Angers; SFR 4207 QUASAV 49045 Angers France
  - INRA; UMR1345 Institut de Recherche en Horticulture et Semences; 49071 Beaucouzé France
  - AgroCampus-Ouest; UMR1345 Institut de Recherche en Horticulture et Semences; 49045 Angers France
13
Abstract
We study the population genetics of two neutral alleles under reversible mutation in a model that features a skewed offspring distribution, called the Λ-Fleming-Viot process. We describe the shape of the equilibrium allele frequency distribution as a function of the model parameters. We show that the mutation rates can be uniquely identified from this equilibrium distribution, but the form of the offspring distribution cannot itself always be so identified. We introduce an estimator for the mutation rate that is consistent, independent of the form of reproductive skew. We also introduce a two-allele infinite-sites version of the Λ-Fleming-Viot process, and we use it to study how reproductive skew influences standing genetic diversity in a population. We derive asymptotic formulas for the expected number of segregating sites as a function of sample size and offspring distribution. We find that the Wright-Fisher model minimizes the equilibrium genetic diversity, for a given mutation rate and variance effective population size, compared to all other Λ-processes.
14
Birkner M, Blath J, Eldon B. Statistical properties of the site-frequency spectrum associated with lambda-coalescents. Genetics 2013; 195:1037-53. [PMID: 24026094 PMCID: PMC3813835 DOI: 10.1534/genetics.113.156612] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 05/26/2013] [Accepted: 08/28/2013] [Indexed: 11/18/2022]
Abstract
Statistical properties of the site-frequency spectrum associated with Λ-coalescents are our objects of study. In particular, we derive recursions for the expected value, variance, and covariance of the spectrum, extending earlier results of Fu (1995) for the classical Kingman coalescent. Estimating coalescent parameters introduced by certain Λ-coalescents for data sets too large for full-likelihood methods is our focus. The recursions for the expected values we obtain can be used to find the parameter values that give the best fit to the observed frequency spectrum. The expected values are also used to approximate the probability a (derived) mutation arises on a branch subtending a given number of leaves (DNA sequences), allowing us to apply a pseudolikelihood inference to estimate coalescence parameters associated with certain subclasses of Λ-coalescents. The properties of the pseudolikelihood approach are investigated on simulated as well as real mtDNA data sets for the high-fecundity Atlantic cod (Gadus morhua). Our results for two subclasses of Λ-coalescents show that one can distinguish these subclasses from the Kingman coalescent, as well as between the Λ-subclasses, even for a moderate (maybe a few hundred) sample size.
Affiliation(s)
- Matthias Birkner
  - Institut für Mathematik, Johannes-Gutenberg-Universität, 55099 Mainz, Germany
- Jochen Blath
  - Institut für Mathematik, Technische Universität Berlin, 10623 Berlin, Germany
- Bjarki Eldon
  - Institut für Mathematik, Technische Universität Berlin, 10623 Berlin, Germany
15
An ancestral recombination graph for diploid populations with skewed offspring distribution. Genetics 2012; 193:255-90. [PMID: 23150600 DOI: 10.1534/genetics.112.144329] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022]
Abstract
A large offspring-number diploid biparental multilocus population model of Moran type is our object of study. At each time step, a pair of diploid individuals drawn uniformly at random contributes offspring to the population. The number of offspring can be large relative to the total population size. Similar "heavily skewed" reproduction mechanisms have been recently considered by various authors (cf. e.g., Eldon and Wakeley 2006, 2008) and reviewed by Hedgecock and Pudovkin (2011). Each diploid parental individual contributes exactly one chromosome to each diploid offspring, and hence ancestral lineages can coalesce only when in distinct individuals. A separation-of-timescales phenomenon is thus observed. A result of Möhle (1998) is extended to obtain convergence of the ancestral process to an ancestral recombination graph necessarily admitting simultaneous multiple mergers of ancestral lineages. The usual ancestral recombination graph is obtained as a special case of our model when the parents contribute only one offspring to the population each time. Due to diploidy and large offspring numbers, novel effects appear. For example, the marginal genealogy at each locus admits simultaneous multiple mergers in up to four groups, and different loci remain substantially correlated even as the recombination rate grows large. Thus, genealogies for loci far apart on the same chromosome remain correlated. Correlation in coalescence times for two loci is derived and shown to be a function of the coalescence parameters of our model. Extending the observations by Eldon and Wakeley (2008), predictions of linkage disequilibrium are shown to be functions of the reproduction parameters of our model, in addition to the recombination rate. 
Correlations in ratios of coalescence times between loci can be high, even when the recombination rate is high and sample size is large, in large offspring-number populations, as suggested by simulations, hinting at how to distinguish between different population models.
16
Coop G, Ralph P. Patterns of neutral diversity under general models of selective sweeps. Genetics 2012; 192:205-24. [PMID: 22714413 PMCID: PMC3430537 DOI: 10.1534/genetics.112.141861] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.3] [Received: 05/08/2012] [Accepted: 06/01/2012] [Indexed: 11/18/2022]
Abstract
Two major sources of stochasticity in the dynamics of neutral alleles result from resampling of finite populations (genetic drift) and the random genetic background of nearby selected alleles on which the neutral alleles are found (linked selection). There is now good evidence that linked selection plays an important role in shaping polymorphism levels in a number of species. One of the best-investigated models of linked selection is the recurrent full-sweep model, in which newly arisen selected alleles fix rapidly. However, the bulk of selected alleles that sweep into the population may not be destined for rapid fixation. Here we develop a general model of recurrent selective sweeps in a coalescent framework, one that generalizes the recurrent full-sweep model to the case where selected alleles do not sweep to fixation. We show that in a large population, only the initial rapid increase of a selected allele affects the genealogy at partially linked sites, which under fairly general assumptions are unaffected by the subsequent fate of the selected allele. We also apply the theory to a simple model to investigate the impact of recurrent partial sweeps on levels of neutral diversity and find that for a given reduction in diversity, the impact of recurrent partial sweeps on the frequency spectrum at neutral sites is determined primarily by the frequencies rapidly achieved by the selected alleles. Consequently, recurrent sweeps of selected alleles to low frequencies can have a profound effect on levels of diversity but can leave the frequency spectrum relatively unperturbed. In fact, the limiting coalescent model under a high rate of sweeps to low frequency is identical to the standard neutral model. The general model of selective sweeps we describe goes some way toward providing a more flexible framework to describe genomic patterns of diversity than is currently available.
Affiliation(s)
- Graham Coop
- Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, California 95616, USA.
|
17
|
Abstract
We analyze the dynamics of two alternative alleles in a simple model of a population that allows for large family sizes in the distribution of offspring number. This population model was first introduced by Eldon and Wakeley, who described the backward-time genealogical relationships among sampled individuals, assuming neutrality. We study the corresponding forward-time dynamics of allele frequencies, with or without selection. We derive a continuum approximation, analogous to Kimura's diffusion approximation, and we describe three distinct regimes of behavior that correspond to distinct regimes in the coalescent processes of Eldon and Wakeley. We demonstrate that the effect of selection is strongly amplified in the Eldon-Wakeley model, compared to the Wright-Fisher model with the same variance effective population size. Remarkably, an advantageous allele can even be guaranteed to fix in the Eldon-Wakeley model, despite the presence of genetic drift. We compute the selection coefficient required for such behavior in populations of Pacific oysters, based on estimates of their family sizes. Our analysis underscores that populations with the same effective population size may nevertheless experience radically different forms of genetic drift, depending on the reproductive mechanism, with significant consequences for the resulting allele dynamics.
|
18
|
Abstract
To model deviations from selectively neutral genetic variation caused by different forms of selection, it is necessary to first understand patterns of neutral variation. Best understood is neutral genetic variation at a single locus. But, as is well known, additional insights can be gained by investigating multiple loci. The resulting patterns reflect the degree of association (linkage) between loci and provide information about the underlying multilocus gene genealogies. The statistical properties of two-locus gene genealogies have been intensively studied for populations of constant size, as well as for simple demographic histories such as exponential population growth and single bottlenecks. By contrast, the combined effect of recombination and sustained demographic fluctuations is poorly understood. Addressing this issue, we study a two-locus Wright-Fisher model of a population subject to recurrent bottlenecks. We derive coalescent approximations for the covariance of the times to the most recent common ancestor at two loci in samples of two chromosomes. This covariance reflects the degree of association and thus linkage disequilibrium between these loci. We find, first, that an effective population-size approximation describes the numerically observed association between two loci provided that recombination occurs either much faster or much more slowly than the population-size fluctuations. Second, when recombination occurs frequently between but rarely within bottlenecks, we observe that the association of gene histories becomes independent of physical distance over a certain range of distances. Third, we show that in this case, a commonly used measure of linkage disequilibrium, σ_d² (closely related to r²), fails to capture the long-range association between two loci. The reason is that constituent terms, each reflecting the long-range association, cancel. Fourth, we analyze a limiting case in which the long-range association can be described in terms of a Xi coalescent allowing for simultaneous multiple mergers of ancestral lines.
|
19
|
Generalized population models and the nature of genetic drift. Theor Popul Biol 2011; 80:80-99. [DOI: 10.1016/j.tpb.2011.06.004] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Received: 09/14/2010] [Revised: 06/07/2011] [Accepted: 06/08/2011] [Indexed: 11/17/2022]
|
20
|
Barton NH, Kelleher J, Etheridge AM. A new model for extinction and recolonization in two dimensions: quantifying phylogeography. Evolution 2010; 64:2701-15. [PMID: 20408876 DOI: 10.1111/j.1558-5646.2010.01019.x] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Indexed: 11/28/2022]
Abstract
Classical models of gene flow fail in three ways: they cannot explain large-scale patterns; they predict much more genetic diversity than is observed; and they assume that loosely linked genetic loci evolve independently. We propose a new model that deals with these problems. Extinction events kill some fraction of individuals in a region. These are replaced by offspring from a small number of parents, drawn from the preexisting population. This model of evolution forwards in time corresponds to a backwards model, in which ancestral lineages jump to a new location if they are hit by an event, and may coalesce with other lineages that are hit by the same event. We derive an expression for the identity in allelic state, and show that, over scales much larger than the largest event, this converges to the classical value derived by Wright and Malécot. However, rare events that cover large areas cause low genetic diversity, large-scale patterns, and correlations in ancestry between unlinked loci.
Affiliation(s)
- Nicholas H Barton
- Institute of Evolutionary Biology, University of Edinburgh, King's Buildings, West Mains Road, United Kingdom.
|
21
|
Yuan X, Zhang J, Wang Y. Mutual information and linkage disequilibrium based SNP association study by grouping case-control. Genes Genomics 2011. [DOI: 10.1007/s13258-010-0094-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
|
22
|
Eldon B. Structured coalescent processes from a modified Moran model with large offspring numbers. Theor Popul Biol 2009; 76:92-104. [DOI: 10.1016/j.tpb.2009.05.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 02/06/2009] [Revised: 04/30/2009] [Accepted: 05/04/2009] [Indexed: 10/20/2022]
|