1
Guerrero Montero J, Blythe RA. Self-contained Beta-with-Spikes approximation for inference under a Wright-Fisher model. Genetics 2023; 225:iyad092. [PMID: 37226886 PMCID: PMC10550310 DOI: 10.1093/genetics/iyad092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 03/10/2023] [Accepted: 05/10/2023] [Indexed: 05/26/2023]
Abstract
We construct a reliable estimation method for evolutionary parameters within the Wright-Fisher model, which describes changes in allele frequencies due to selection and genetic drift, from time-series data. Such data exist for biological populations, for example via artificial evolution experiments, and for the cultural evolution of behavior, such as linguistic corpora that document historical usage of different words with similar meanings. Our method of analysis builds on a Beta-with-Spikes approximation to the distribution of allele frequencies predicted by the Wright-Fisher model. We introduce a self-contained scheme for estimating parameters in the approximation, and demonstrate its robustness with synthetic data, especially in the strong-selection and near-extinction regimes where previous approaches fail. We further apply the method to allele frequency data for baker's yeast (Saccharomyces cerevisiae), finding a significant signal of selection in cases where independent evidence supports such a conclusion. We further demonstrate the possibility of detecting time points at which evolutionary parameters change in the context of a historical spelling reform in the Spanish language.
Affiliation(s)
- Juan Guerrero Montero
  - SUPA, School of Physics and Astronomy, University of Edinburgh, Edinburgh, EH9 3FD, UK
- Richard A Blythe
  - Corresponding author: SUPA, School of Physics and Astronomy, University of Edinburgh, Edinburgh EH9 3FD, UK.
2
Boitard S, Liaubet L, Paris C, Fève K, Dehais P, Bouquet A, Riquet J, Mercat MJ. Whole-genome sequencing of cryopreserved resources from French Large White pigs at two distinct sampling times reveals strong signatures of convergent and divergent selection between the dam and sire lines. Genet Sel Evol 2023; 55:13. [PMID: 36864379 PMCID: PMC9979506 DOI: 10.1186/s12711-023-00789-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/21/2022] [Accepted: 02/15/2023] [Indexed: 03/04/2023]
Abstract
BACKGROUND: Numerous genomic scans for positive selection have been performed in livestock species within the last decade, but often a detailed characterization of the detected regions (gene or trait under selection, timing of selection events) is lacking. Cryopreserved resources stored in reproductive or DNA gene banks offer a great opportunity to improve this characterization by providing direct access to recent allele frequency dynamics, thereby differentiating between signatures from recent breeding objectives and those related to more ancient selection constraints. Improved characterization can also be achieved by using next-generation sequencing data, which helps to narrow the size of the detected regions while reducing the number of associated candidate genes.
METHODS: We estimated genetic diversity and detected signatures of recent selection in French Large White pigs by sequencing the genomes of 36 animals from three distinct cryopreserved samples: two recent samples from dam (LWD) and sire (LWS) lines, which have diverged since 1995 and were selected under partly different objectives, and an older sample from 1977 prior to the divergence.
RESULTS: French LWD and LWS lines have lost approximately 5% of the SNPs that segregated in the 1977 ancestral population. Thirty-eight genomic regions under recent selection were detected in these lines and the corresponding selection events were further classified as convergent between lines (18 regions), divergent between lines (10 regions), specific to the dam line (6 regions) or specific to the sire line (4 regions). Several biological functions were found to be significantly enriched among the genes included in these regions: body size, body weight and growth regardless of the category; early life survival and calcium metabolism more specifically in the signatures in the dam line; and lipid and glycogen metabolism more specifically in the signatures in the sire line. Recent selection on IGF2 was confirmed and several other regions were linked to a single candidate gene (ARHGAP10, BMPR1B, GNA14, KATNA1, LPIN1, PKP1, PTH, SEMA3E or ZC3HAV1, among others).
CONCLUSIONS: These results illustrate that sequencing the genome of animals at several recent time points generates considerable insight into the traits, genes and variants under recent selection in a population. This approach could be applied to other livestock populations, e.g. by exploiting the rich biological resources stored in cryobanks.
Affiliation(s)
- Simon Boitard
  - CBGP, CIRAD, INRAE, Institut Agro, IRD, Université de Montpellier, Montferrier-sur-Lez, France
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Laurence Liaubet
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Cyriel Paris
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Katia Fève
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Patrice Dehais
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Alban Bouquet
  - IFIP Institut du porc/Alliance R & D, Le Rheu, France
- Juliette Riquet
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
3
Wang J. MLNe: Simulating and Estimating Effective Size and Migration Rate from Temporal Changes in Allele Frequencies. J Hered 2022; 113:563-567. [PMID: 35932284 DOI: 10.1093/jhered/esac039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 08/03/2022] [Indexed: 11/12/2022]
Abstract
In studies of molecular ecology, conservation biology and evolutionary biology, the current or recent effective size (Ne) of a population is frequently estimated from the marker genotype data of two or more temporally spaced samples of individuals taken from the population. Despite the development of numerous Bayesian, likelihood and moment estimators, only a couple of them can use both temporally and spatially spaced samples of individuals to estimate jointly the effective size (Ne) of and the migration rate (m) into a population. In this note I describe new implementations of these joint estimators of Ne and m in the software MLNe, which runs on multiple platforms (Windows, Mac, Linux) with or without a graphical user interface (GUI), has an integrated simulation module to simulate genotype data for investigating the impacts of various factors (such as sample size and sampling interval) on estimation precision and accuracy, and exploits both Message Passing Interface (MPI) and OpenMP for parallel computations using multiple cores and nodes to speed up analysis. The program does not require data pre-processing and accepts multiple formats of a file of original genotype data and a file of parameters as input. The GUI facilitates data and parameter inputs and produces publication-quality output graphs, while the non-GUI version of the software is convenient for batch analysis of multiple datasets, as in simulations. MLNe will help advance the analysis of temporal genetic marker data for estimating Ne of and m between populations, which are important parameters for the conservation management of natural and managed populations. MLNe can be downloaded free from the website http://www.zsl.org/science/research/software/.
Affiliation(s)
- Jinliang Wang
  - Institute of Zoology, Zoological Society of London, London NW1 4RY, United Kingdom
4
Nadachowska-Brzyska K, Konczal M, Babik W. Navigating the temporal continuum of effective population size. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13740] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Affiliation(s)
- Wieslaw Babik
  - Jagiellonian University in Kraków Faculty of Biology Institute of Environmental Sciences Kraków Poland
5
Pérez de Rosas AR, Restelli MF, García BA. Spatio-temporal genetic structure in populations of the Chagas' disease vector Triatoma infestans from Argentina. J ZOOL SYST EVOL RES 2021. [DOI: 10.1111/jzs.12552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Alicia Raquel Pérez de Rosas
  - Instituto de Investigaciones en Ciencias de la Salud (INICSA) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) and Cátedra de Bioquímica y Biología Molecular Facultad de Ciencias Médicas Universidad Nacional de Córdoba Córdoba Argentina
- María Florencia Restelli
  - Instituto de Investigaciones en Ciencias de la Salud (INICSA) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) and Cátedra de Bioquímica y Biología Molecular Facultad de Ciencias Médicas Universidad Nacional de Córdoba Córdoba Argentina
- Beatriz Alicia García
  - Instituto de Investigaciones en Ciencias de la Salud (INICSA) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) and Cátedra de Bioquímica y Biología Molecular Facultad de Ciencias Médicas Universidad Nacional de Córdoba Córdoba Argentina
6
Kidner J, Theodorou P, Engler JO, Taubert M, Husemann M. A brief history and popularity of methods and tools used to estimate micro-evolutionary forces. Ecol Evol 2021; 11:13723-13743. [PMID: 34707813 PMCID: PMC8525119 DOI: 10.1002/ece3.8076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2020] [Revised: 07/12/2021] [Accepted: 08/12/2021] [Indexed: 11/30/2022]
Abstract
Population genetics is a field of research that predates the current generations of sequencing technology. Approaches that were established before massively parallel sequencing methods have been adapted to these new marker systems (in some cases involving the development of new methods) that allow genome-wide estimates of the four major micro-evolutionary forces: mutation, gene flow, genetic drift, and selection. Nevertheless, classic population genetic markers are still commonly used, and a plethora of analysis methods and programs is available for these and for high-throughput sequencing (HTS) data. These methods employ diverse theoretical and statistical frameworks, with varying degrees of success, to estimate similar evolutionary parameters, making it difficult to get a concise overview of the available approaches. Presently, reviews on this topic generally focus on a particular class of methods to estimate one or two evolutionary parameters. Here, we provide a brief history of methods and a comprehensive list of available programs for estimating micro-evolutionary forces. We furthermore analyzed their usage within the research community based on popularity (citation bias) and discuss the implications of this bias for the software community. We found that a few programs received the majority of citations, with program success being independent of both the parameters estimated and the computing platform. The only deviation from a model of exponential growth in the number of citations was found for the presence of a graphical user interface (GUI). Interestingly, no relationship was found with the impact factor of the journals in which the tools were published, suggesting accessibility might be more important than visibility.
Affiliation(s)
- Jonathan Kidner
  - General Zoology Institute for Biology Martin Luther University Halle-Wittenberg Halle (Saale) Germany
- Panagiotis Theodorou
  - General Zoology Institute for Biology Martin Luther University Halle-Wittenberg Halle (Saale) Germany
- Jan O Engler
  - Terrestrial Ecology Unit Department of Biology Ghent University Ghent Belgium
- Martin Taubert
  - Aquatic Geomicrobiology Institute for Biodiversity Friedrich Schiller University Jena Jena Germany
- Martin Husemann
  - General Zoology Institute for Biology Martin Luther University Halle-Wittenberg Halle (Saale) Germany
  - Centrum für Naturkunde University of Hamburg Hamburg Germany
7
Lynch M, Ho WC. The Limits to Estimating Population-Genetic Parameters with Temporal Data. Genome Biol Evol 2021; 12:443-455. [PMID: 32181820 PMCID: PMC7197491 DOI: 10.1093/gbe/evaa056] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Accepted: 03/13/2020] [Indexed: 11/14/2022]
Abstract
The ability to obtain genome-wide sequences of very large numbers of individuals from natural populations raises questions about optimal sampling designs and the limits to extracting information on key population-genetic parameters from temporal-survey data. Methods are introduced for evaluating whether observed temporal fluctuations in allele frequencies are consistent with the hypothesis of random genetic drift, and expressions for the expected sampling variances for the relevant statistics are given in terms of sample sizes and numbers. Estimation methods and aspects of statistical reliability are also presented for the mean and temporal variance of selection coefficients. For nucleotide sites that pass the test of neutrality, the current effective population size can be estimated by a method of moments, and expressions for its sampling variance provide insight into the degree to which such methodology can yield meaningful results under alternative sampling schemes. Finally, some caveats are raised regarding the use of the temporal covariance of allele-frequency change to infer selection. Taken together, these results provide a statistical view of the limits to population-genetic inference in even the simplest case of a closed population.
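For orientation, a moment-based temporal estimate of effective population size can be sketched as follows. This uses the classical Nei-Tajima standardized variance Fc with a commonly used correction for sampling noise, which is not necessarily the estimator derived in the paper. The function name `temporal_ne`, the biallelic loci, and the diploid sample sizes `n0` and `n1` are assumptions made for illustration.

```python
def temporal_ne(freqs_t0, freqs_t1, generations, n0, n1):
    """Moment-based temporal estimate of Ne (illustrative sketch).

    freqs_t0, freqs_t1: allele frequencies per biallelic locus at the two
    sampling times. n0, n1: numbers of diploid individuals sampled
    (assumption). Returns float('inf') when the observed change is no
    larger than expected from sampling noise alone.
    """
    fc_terms = []
    for x, y in zip(freqs_t0, freqs_t1):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip loci fixed for the same allele in both samples
            fc_terms.append((x - y) ** 2 / denom)
    if not fc_terms:
        raise ValueError("no informative loci")
    fc = sum(fc_terms) / len(fc_terms)
    # Subtract the variance contributed by finite sample sizes.
    drift = fc - 1 / (2 * n0) - 1 / (2 * n1)
    if drift <= 0:
        return float("inf")
    return generations / (2 * drift)
```

For example, a single locus moving from 0.5 to 0.6 over 10 generations, with 50 diploid individuals sampled each time, gives Fc = 0.04 and an estimate of 250. The infinite return value reflects the situation in which drift cannot be distinguished from sampling noise.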
Affiliation(s)
- Michael Lynch
  - Biodesign Center for Mechanisms of Evolution, Arizona State University
- Wei-Chin Ho
  - Biodesign Center for Mechanisms of Evolution, Arizona State University
8
Tsuzuki Y, Sato MP, Matsuo A, Suyama Y, Ohara M. Genetic consequences of habitat fragmentation in a perennial plant Trillium camschatcense are subjected to its slow-paced life history. POPUL ECOL 2021. [DOI: 10.1002/1438-390x.12093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Affiliation(s)
- Yoichi Tsuzuki
  - Graduate School of Environmental Science Hokkaido University Sapporo Hokkaido Japan
- Mitsuhiko P. Sato
  - Kawatabi Field Science Center Graduate School of Agricultural Science, Tohoku University Osaki Miyagi Japan
- Ayumi Matsuo
  - Kawatabi Field Science Center Graduate School of Agricultural Science, Tohoku University Osaki Miyagi Japan
- Yoshihisa Suyama
  - Kawatabi Field Science Center Graduate School of Agricultural Science, Tohoku University Osaki Miyagi Japan
- Masashi Ohara
  - Graduate School of Environmental Science Hokkaido University Sapporo Hokkaido Japan
9
Nadachowska-Brzyska K, Dutoit L, Smeds L, Kardos M, Gustafsson L, Ellegren H. Genomic inference of contemporary effective population size in a large island population of collared flycatchers (Ficedula albicollis). Mol Ecol 2021; 30:3965-3973. [PMID: 34145933 DOI: 10.1111/mec.16025] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/15/2020] [Revised: 05/28/2021] [Accepted: 06/01/2021] [Indexed: 12/30/2022]
Abstract
Due to its central importance to many aspects of evolutionary biology and population genetics, the long-term effective population size (Ne) has been estimated for numerous species and populations. However, estimating contemporary Ne is difficult and in practice this parameter is often unknown. In principle, contemporary Ne can be estimated using either analyses of temporal changes in allele frequencies, or the extent of linkage disequilibrium (LD) between unlinked markers. We applied these approaches to estimate contemporary Ne of a relatively recently founded island population of collared flycatchers (Ficedula albicollis). We sequenced the genomes of 85 birds sampled in 1993 and 2015, and applied several temporal methods to estimate Ne at a few thousand (4000-7000). The approach based on LD provided higher estimates of Ne (20,000-32,000) and was associated with high variance, often resulting in infinite Ne. We conclude that whole-genome sequencing data offers new possibilities to estimate high (>1000) contemporary Ne, but also note that such estimates remain challenging, in particular for LD-based methods for contemporary Ne estimation.
Affiliation(s)
- Krystyna Nadachowska-Brzyska
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
  - Institute of Environmental Sciences, Jagiellonian University, Kraków, Poland
- Ludovic Dutoit
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
  - Department of Zoology, University of Otago, Dunedin, New Zealand
- Linnéa Smeds
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Martin Kardos
  - Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Seattle, WA, USA
- Lars Gustafsson
  - Department of Animal Ecology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
10
Hui TYJ, Brenas JH, Burt A. Contemporary Ne estimation using temporally spaced data with linked loci. Mol Ecol Resour 2021; 21:2221-2230. [PMID: 33950582 PMCID: PMC8518636 DOI: 10.1111/1755-0998.13412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Revised: 04/23/2021] [Accepted: 04/27/2021] [Indexed: 11/30/2022]
Abstract
The contemporary effective population size Ne is important in many disciplines including population genetics, conservation science and pest management. One of the most popular methods of estimating this quantity uses temporal changes in allele frequency due to genetic drift. A significant assumption of the existing methods is the independence among loci while constructing confidence intervals (CI), which restricts the types of species or genetic data applicable to the methods. Although genetic linkage does not bias point Ne estimates, applying these methods to linked loci can yield unreliable CI that are far too narrow. We extend the current methods to enable the use of many linked loci to produce precise contemporary Ne estimates, while preserving the targeted CI width and coverage. This is achieved by deriving the covariance of changes in allele frequency at linked loci in the face of recombination and sampling errors, such that the extra sampling variance due to between‐locus correlation is properly handled. Extensive simulations are used to verify the new method. We apply the method to two temporally spaced genomic data sets of Anopheles mosquitoes collected from a cluster of villages in Burkina Faso between 2012 and 2014. With over 33,000 linked loci considered, the Ne estimate for Anopheles coluzzii is 9,242 (95% CI 5,702–24,282), and for Anopheles gambiae it is 4,826 (95% CI 3,602–7,353).
Affiliation(s)
- Tin-Yu J Hui
  - Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, UK
- Jon Haël Brenas
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Saffron Walden, UK
- Austin Burt
  - Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, UK
11
Boitard S, Paris C, Sevane N, Servin B, Bazi-Kabbaj K, Dunner S. Gene Banks as Reservoirs to Detect Recent Selection: The Example of the Asturiana de los Valles Bovine Breed. Front Genet 2021; 12:575405. [PMID: 33633776 PMCID: PMC7901938 DOI: 10.3389/fgene.2021.575405] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/23/2020] [Accepted: 01/05/2021] [Indexed: 11/13/2022]
Abstract
Gene banks, framed within the efforts for conserving animal genetic resources to ensure the adaptability of livestock production systems to population growth, income, and climate change challenges, have emerged as invaluable resources for biodiversity and scientific research. Allele frequency trajectories over the last few generations contain rich information about the selection history of populations, which cannot be obtained from classical selection scan approaches based on present time data only. Here we apply a new statistical approach taking advantage of genomic time series and a state-of-the-art statistic (nSL) based on present time data to disentangle both old and recent signatures of selection in the Asturiana de los Valles cattle breed. This local Spanish breed, native to Asturias and originally multipurpose, has been selected for beef production over the last few generations. With the use of SNP chip and whole-genome sequencing (WGS) data, we detect candidate regions under selection reflecting the effort of breeders to produce economically valuable beef individuals, e.g., by improving carcass and meat traits with genes such as MSTN, FLRT2, CRABP2, ZNF215, RBPMS2, OAZ2, or ZNF609, while maintaining the ability to thrive under a semi-intensive production system, with the selection of immune (GIMAP7, GIMAP4, GIMAP8, and TICAM1) or olfactory receptor (OR2D2, OR2D3, OR10A4, and OR6A2) genes. This kind of information will allow us to take advantage of the invaluable resources provided by gene bank collections from local less competitive breeds, enabling the livestock industry to exploit the different mechanisms fine-tuned by natural and human-driven selection on different populations to improve productivity.
Affiliation(s)
- Simon Boitard
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet-Tolosan, France
- Cyriel Paris
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet-Tolosan, France
- Natalia Sevane
  - Dpto. Animal Production, Facultad de Veterinaria, Universidad Complutense de Madrid, Madrid, Spain
- Bertrand Servin
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet-Tolosan, France
- Kenza Bazi-Kabbaj
  - GABI, INRAE, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
  - SIGENAE, INRA, Jouy-en-Josas, France
- Susana Dunner
  - Dpto. Animal Production, Facultad de Veterinaria, Universidad Complutense de Madrid, Madrid, Spain
12
Mating System in a Native Norway Spruce (Picea abies [L.] KARST.) Stand: Relatedness and Effective Pollen Population Size Show an Association with the Germination Percentage of Single Tree Progenies. DIVERSITY 2020. [DOI: 10.3390/d12070266] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Norway spruce differs little in neutral genetic markers among populations and provenances often reported, but in terms of putative adaptive traits and their candidate genes, some clear differences have been observed. This has previously been shown for crown morphotypes. Stands with mostly narrow crown shapes are adapted to high elevation conditions, but these stands are scattered, and the forest area is often occupied by planted stands with predominantly broad crowned morphotypes. This raises questions on whether this differentiation can remain despite gene flow, and on the level of gene flow between natural and planted stands growing in close neighbourhood. The locally adapted stands are a valuable seed source, the progeny of which is expected to have high genetic quality and germination ability. The presented case study is useful for spruce plantation by demonstrating evaluation of these expectations. Immigrant pollen and seeds from planted trees could be maladaptive and may alter the genetic composition of the progeny. This motivated us to study single tree progenies in a locally adapted stand with narrow crowned trees in a partial mast year at nuclear genomic simple sequence repeat (SSR) markers. Spruce is a typical open-pollinated conifer tree species with very low selfing rates, which were also observed in our study (s = 0.3–2.1%) and could be explained by efficient cross-pollination and postzygotic early embryo abortion, common in conifers. The estimated high amount of immigrant pollen found in the pooled seed lot (70.2–91.5%) is likely to influence the genetic composition of the seedlings. Notably, for individual mother trees located in the centre of the stand, up to 50% of the pollen was characterised as local. Seeds from these trees are therefore considered to retain most of the adaptive variance of the stand. 
Germination percentage varied greatly between half-sib families (3.6–61.9%) and was negatively correlated with relatedness and positively with effective pollen population size of the respective families. As pollen mostly originated from outside the stand and no family structures in the stand itself were found, germination differences can likely be explained by diversity differences in the individual pollen cloud.
13
Furlan EM, Gruber B, Attard CRM, Wager RNE, Kerezsy A, Faulks LK, Beheregaray LB, Unmack PJ. Assessing the benefits and risks of translocations in depauperate species: A theoretical framework with an empirical validation. J Appl Ecol 2020. [DOI: 10.1111/1365-2664.13581] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/27/2022]
Affiliation(s)
- Elise M. Furlan
  - Institute for Applied Ecology University of Canberra Canberra ACT Australia
- Bernd Gruber
  - Institute for Applied Ecology University of Canberra Canberra ACT Australia
- Catherine R. M. Attard
  - Molecular Ecology Laboratory College of Science and Engineering Flinders University Adelaide SA Australia
- Robert N. E. Wager
  - Bush Heritage Australia Melbourne Vic. Australia
  - “Rockatoo” Esk Qld Australia
- Adam Kerezsy
  - Bush Heritage Australia Melbourne Vic. Australia
  - DrFishContracting Lake Cargelligo NSW Australia
- Leanne K. Faulks
  - Sugadaira Research Station Mountain Science Center University of Tsukuba Tsukuba Japan
- Luciano B. Beheregaray
  - Molecular Ecology Laboratory College of Science and Engineering Flinders University Adelaide SA Australia
- Peter J. Unmack
  - Institute for Applied Ecology University of Canberra Canberra ACT Australia
14
Inference of Selection from Genetic Time Series Using Various Parametric Approximations to the Wright-Fisher Model. G3-GENES GENOMES GENETICS 2019; 9:4073-4086. [PMID: 31597676 PMCID: PMC6893182 DOI: 10.1534/g3.119.400778] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/21/2023]
Abstract
Detecting genomic regions under selection is an important objective of population genetics. Typical analyses for this goal are based on exploiting genetic diversity patterns in present time data, but rapid advances in DNA sequencing have increased the availability of time series genomic data. A common approach to analyze such data is to model the temporal evolution of an allele frequency as a Markov chain. Based on this principle, several methods have been proposed to infer selection intensity. One of their differences lies in how they model the transition probabilities of the Markov chain. Using the Wright-Fisher model is a natural choice, but its computational cost is prohibitive for large population sizes, so approximations to this model based on parametric distributions have been proposed. Here, we compared the performance of some of these approximations with respect to their power to detect selection and their estimation of the selection coefficient. We developed a new generic Hidden Markov Model likelihood calculator and applied it to genetic time series simulated under various evolutionary scenarios. The Beta with spikes approximation, which combines discrete fixation probabilities with a continuous Beta distribution, was found to perform consistently better than the others. This distribution provides an almost perfect fit to the Wright-Fisher model in terms of selection inference, for a computational cost that does not increase with population size. We further evaluated this model for population sizes not accessible to the Wright-Fisher model and illustrated its performance on a dataset of two divergently selected chicken populations.
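As a rough illustration of why exact Wright-Fisher transition probabilities become costly, the sketch below builds the full one-generation transition matrix of the Markov chain for a small population. It assumes a haploid population of size `n` and a relative fitness of `1 + s` for the focal allele; the function name `wf_transition_matrix` and this parameterization are illustrative assumptions, not the likelihood calculator described in the abstract.

```python
from math import comb


def wf_transition_matrix(n, s=0.0):
    """Exact one-generation Wright-Fisher transition matrix (sketch).

    Entry [i][j] is the probability of going from i to j copies of the
    focal allele in a haploid population of size n, where the allele has
    relative fitness 1 + s (assumed parameterization, requires s > -1).
    """
    matrix = []
    for i in range(n + 1):
        x = i / n
        # Expected frequency after selection, before drift.
        p = x * (1 + s) / (1 + s * x)
        # Drift: binomial resampling of n copies with success probability p.
        row = [comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(n + 1)]
        matrix.append(row)
    return matrix
```

The matrix has (n + 1)^2 entries, so both memory and the cost of each likelihood step grow quadratically with population size. States 0 and n are absorbing (loss and fixation), which is the feature the spikes in a Beta-with-spikes approximation are meant to capture.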
15
Tian-Bi YNT, Konan JNK, Sangaré A, Ortega-Abboud E, Utzinger J, N'Goran EK, Jarne P. Spatio-temporal population genetic structure, relative to demographic and ecological characteristics, in the freshwater snail Biomphalaria pfeifferi in Man, western Côte d'Ivoire. Genetica 2018; 147:33-45. [PMID: 30498954 DOI: 10.1007/s10709-018-0049-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/15/2017] [Accepted: 11/22/2018] [Indexed: 11/30/2022]
Abstract
Combining the analysis of spatial and temporal variation when investigating population structure enhances our capacity for unravelling the biotic and abiotic factors responsible for microevolutionary change. This work aimed at measuring the spatial and temporal genetic structure of populations of the freshwater snail Biomphalaria pfeifferi (the intermediate host of the trematode Schistosoma mansoni) in relation to the mating system (self-fertilization), demography, parasite prevalence and some ecological parameters. Snail populations were sampled four times in seven human-water contact sites in the Man region, western Côte d'Ivoire, and their variability was measured at five microsatellite loci. Limited genetic diversity and high selfing rates were observed in the populations studied. We failed to reveal an effect of demographic and ecological parameters on within-population diversity, perhaps because too few populations were sampled. A strong spatial genetic differentiation was detected among populations. The temporal differentiation within populations was high in most populations, though lower than the spatial differentiation. All estimates of effective population size were lower than seven, suggesting a strong effect of genetic drift. However, the genetic drift was compensated by high gene flow. The genetic structure within and among populations reflected that observed in other selfing snail species, relying on high selfing rates, low effective population sizes, environmental stochasticity and high gene flow.
Affiliation(s)
- Yves-Nathan T Tian-Bi
- Laboratoire de Génétique, Unité de Formation et de Recherche Biosciences, Université Félix Houphouët-Boigny, 22 BP 1106, Abidjan 22, Côte d'Ivoire.
- Centre Suisse de Recherches Scientifiques en Côte d'Ivoire, 01 BP 1303, Abidjan 01, Côte d'Ivoire.
| | - Jean-Noël K Konan
- Centre National de Recherche Agronomique, Adiopodoumé KM 17, route de Dabou, 01 BP 1740, Abidjan 01, Côte d'Ivoire
| | - Abdourahamane Sangaré
- Centre National de Recherche Agronomique, Adiopodoumé KM 17, route de Dabou, 01 BP 1740, Abidjan 01, Côte d'Ivoire
| | - Enrique Ortega-Abboud
- Centre d'Ecologie Fonctionnelle et Evolutive, UMR 5175, CNRS, IRD, Université de Montpellier, Université Paul Valéry Montpellier, EPHE, 1919 route de Mende, 34293, Montpellier Cedex 5, France
| | - Jürg Utzinger
- Swiss Tropical and Public Health Institute, P.O. Box, 4002, Basel, Switzerland
- University of Basel, P.O. Box, 4003, Basel, Switzerland
| | - Eliézer K N'Goran
- Centre Suisse de Recherches Scientifiques en Côte d'Ivoire, 01 BP 1303, Abidjan 01, Côte d'Ivoire
- Laboratoire de Zoologie-Biologie Animale, Unité de Recherche et de Formation Parasitologie et Ecologie Parasitaire, Unité de Formation et de Recherche Biosciences, Université Félix Houphouët-Boigny, 22 BP 582, Abidjan 22, Côte d'Ivoire
| | - Philippe Jarne
- Centre d'Ecologie Fonctionnelle et Evolutive, UMR 5175, CNRS, IRD, Université de Montpellier, Université Paul Valéry Montpellier, EPHE, 1919 route de Mende, 34293, Montpellier Cedex 5, France
| |
|
16
|
Hunter ME, Johnson NA, Smith BJ, Davis MC, Butterfield JSS, Snow RW, Hart KM. Cytonuclear discordance in the Florida Everglades invasive Burmese python (Python bivittatus) population reveals possible hybridization with the Indian python (P. molurus). Ecol Evol 2018; 8:9034-9047. [PMID: 30271564 PMCID: PMC6157680 DOI: 10.1002/ece3.4423] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/02/2017] [Revised: 06/08/2018] [Accepted: 06/22/2018] [Indexed: 12/04/2022] Open
Abstract
The invasive Burmese python (Python bivittatus) has been reproducing in the Florida Everglades since the 1980s. These giant constrictor snakes have caused a precipitous decline in small mammal populations in southern Florida following escapes or releases from the commercial pet trade. To better understand the invasion pathway and genetic composition of the population, two mitochondrial (mtDNA) loci across 1,398 base pairs were sequenced on 426 snakes and 22 microsatellites were assessed on 389 snakes. Concatenated mtDNA sequences produced six haplotypes with an average nucleotide and haplotype diversity of π = 0.002 and h = 0.097, respectively. Samples collected in Florida from morphologically identified P. bivittatus snakes were similar to published cytochrome oxidase 1 and cytochrome b sequences from both P. bivittatus and Python molurus and were highly divergent (genetic distances of 5.4% and 4.3%, respectively). The average number of microsatellite alleles and expected heterozygosity were N_A = 5.50 and H_E = 0.60, respectively. Nuclear Bayesian assignment tests supported two genetically distinct groups and an admixed group, not geographically differentiated. The effective population size (N_E = 315.1) was lower than expected for a population this large, but reflected the low genetic diversity overall. The patterns of genetic diversity between mtDNA and microsatellites were disparate, indicating nuclear introgression of separate mtDNA lineages corresponding to cytonuclear discordance. The introgression likely occurred prior to the invasion, but genetic information on the native range and commercial trade is needed for verification. Our finding that the Florida python population is comprised of distinct lineages suggests greater standing variation for adaptation and the potential for broader areas of suitable habitat in the invaded range.
Affiliation(s)
- Margaret E. Hunter
- U.S. Geological Survey, Wetland and Aquatic Research Center, Gainesville, Florida
| | - Nathan A. Johnson
- U.S. Geological Survey, Wetland and Aquatic Research Center, Gainesville, Florida
| | - Brian J. Smith
- Wetland and Aquatic Research Center, Cherokee Nation Technologies, Davie, Florida
| | - Michelle C. Davis
- U.S. Geological Survey, Wetland and Aquatic Research Center, Gainesville, Florida
| | | | - Ray W. Snow
- U.S. National Park Service, Everglades National Park, Homestead, Florida
| | - Kristen M. Hart
- U.S. Geological Survey, Wetland and Aquatic Research Center, Davie, Florida
| |
|
17
|
Tataru P, Simonsen M, Bataillon T, Hobolth A. Statistical Inference in the Wright-Fisher Model Using Allele Frequency Data. Syst Biol 2018; 66:e30-e46. [PMID: 28173553 PMCID: PMC5837693 DOI: 10.1093/sysbio/syw056] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 12/04/2015] [Revised: 05/31/2016] [Accepted: 06/06/2016] [Indexed: 11/14/2022] Open
Abstract
The Wright–Fisher model provides an elegant mathematical framework for understanding allele frequency data. In particular, the model can be used to infer the demographic history of species and identify loci under selection. A crucial quantity for inference under the Wright–Fisher model is the distribution of allele frequencies (DAF). Despite the apparent simplicity of the model, the calculation of the DAF is challenging. We review and discuss strategies for approximating the DAF, and how these are used in methods that perform inference from allele frequency data. Various evolutionary forces can be incorporated in the Wright–Fisher model, and we consider these in turn. We begin our review with the basic bi-allelic Wright–Fisher model where random genetic drift is the only evolutionary force. We then consider mutation, migration, and selection. In particular, we compare diffusion-based and moment-based methods in terms of accuracy, computational efficiency, and analytical tractability. We conclude with a brief overview of the multi-allelic process with a general mutation model.
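As a rough illustration of the moment-based strategy mentioned in the abstract above, the following Python sketch matches a Beta distribution to the mean and variance of the allele frequency under pure drift. It assumes a diploid population of constant size N (2N gene copies) with no mutation, migration or selection, and it ignores the probability mass that accumulates at frequencies 0 and 1. The function names are illustrative and are not taken from the review.

```python
def drift_moments(x0, n_diploid, generations):
    """Mean and variance of the allele frequency after pure drift.

    Assumes a neutral Wright-Fisher population of n_diploid individuals
    (2 * n_diploid gene copies) starting at frequency x0.
    """
    decay = (1.0 - 1.0 / (2.0 * n_diploid)) ** generations
    return x0, x0 * (1.0 - x0) * (1.0 - decay)


def beta_from_moments(mean, variance):
    """Return (alpha, beta) of the Beta distribution with these two moments."""
    if not 0.0 < variance < mean * (1.0 - mean):
        raise ValueError("variance must lie strictly between 0 and mean*(1-mean)")
    scale = mean * (1.0 - mean) / variance - 1.0
    return mean * scale, (1.0 - mean) * scale


# Hypothetical example: start at frequency 0.3, N = 100, 10 generations.
mean, var = drift_moments(0.3, 100, 10)
alpha, beta = beta_from_moments(mean, var)
print(f"mean={mean:.3f} var={var:.5f} alpha={alpha:.2f} beta={beta:.2f}")
```

A plain Beta fit of this kind is cheap to evaluate but degrades once fixation or loss becomes likely, since the true distribution then carries point masses at the boundaries.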
Affiliation(s)
- Paula Tataru
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Maria Simonsen
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Asger Hobolth
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| |
|
18
|
Inferring sex-specific demographic history from SNP data. PLoS Genet 2018; 14:e1007191. [PMID: 29385127 PMCID: PMC5809101 DOI: 10.1371/journal.pgen.1007191] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/26/2017] [Revised: 02/12/2018] [Accepted: 01/08/2018] [Indexed: 12/04/2022] Open
Abstract
The relative female and male contributions to demography are of great importance to better understand the history and dynamics of populations. While earlier studies relied on uniparental markers to investigate sex-specific questions, the increasing amount of sequence data now enables us to take advantage of tens to hundreds of thousands of independent loci from autosomes and the X chromosome. Here, we develop a novel method to estimate effective sex ratios or ESR (defined as the female proportion of the effective population) from allele count data for each branch of a rooted tree topology that summarizes the history of the populations of interest. Our method relies on Kimura’s time-dependent diffusion approximation for genetic drift, and is based on a hierarchical Bayesian model to integrate over the allele frequencies along the branches. We show via simulations that parameters are inferred robustly, even under scenarios that violate some of the model assumptions. Analyzing bovine SNP data, we infer a strongly female-biased ESR in both dairy and beef cattle, as expected from the underlying breeding scheme. Conversely, we observe a strongly male-biased ESR in early domestication times, consistent with an easier taming and management of cows, and/or introgression from wild auroch males, that would both cause a relative increase in male effective population size. In humans, analyzing a subsample of non-African populations, we find a male-biased ESR in Oceanians that may reflect complex marriage patterns in Aboriginal Australians. Because our approach relies on allele count data, it may be applied on a wide range of species. The history of populations and their social organization is often intricate due to breeding structures, migration patterns or population bottlenecks. Estimation of the female proportion of the effective population (sex ratio) is therefore important to better understand this underlying social structure and dynamics. 
This question has been mainly investigated so far by comparing genetic variation of mitochondrial DNA and the Y chromosome, two uniparentally inherited markers that reflect the demographic history of females and males, respectively. To overcome the intrinsic limitations of these genetic markers, and to take advantage of the increasing amount of sequence data, we propose a new approach that uses large numbers of independent polymorphisms from autosomes and the X chromosome to estimate sex ratios, throughout the history of populations. This method allows us to confirm a strongly female-biased sex ratio in modern dairy and beef cattle breeds. Yet, we find a strongly male-biased sex ratio during domestication times, consistent with an easier taming and management of cows, and/or introgression from wild auroch males. Analyzing human data from a sample of non-African populations, we find a male bias in Oceanians, possibly indicating complex marriage patterns among Aboriginal Australian groups.
|
19
|
Wang J, Santiago E, Caballero A. Prediction and estimation of effective population size. Heredity (Edinb) 2016; 117:193-206. [PMID: 27353047 PMCID: PMC5026755 DOI: 10.1038/hdy.2016.43] [Citation(s) in RCA: 169] [Impact Index Per Article: 21.1] [Received: 09/29/2015] [Revised: 05/03/2016] [Accepted: 05/16/2016] [Indexed: 12/19/2022] Open
Abstract
Effective population size (Ne) is a key parameter in population genetics. It has important applications in evolutionary biology, conservation genetics and plant and animal breeding, because it measures the rates of genetic drift and inbreeding and affects the efficacy of systematic evolutionary forces, such as mutation, selection and migration. We review the developments in predictive equations and estimation methodologies of effective size. In the prediction part, we focus on the equations for populations with different modes of reproduction, for populations under selection for unlinked or linked loci and for the specific applications to conservation genetics. In the estimation part, we focus on methods developed for estimating the current or recent effective size from molecular marker or sequence data. We discuss some underdeveloped areas in predicting and estimating Ne for future research.
Affiliation(s)
- J Wang
- Institute of Zoology, Zoological Society of London, London, UK
| | - E Santiago
- Departamento de Biología Funcional, Facultad de Biología, Universidad de Oviedo, Oviedo, Spain
| | - A Caballero
- Departamento de Bioquímica, Genética e Inmunología, Facultad de Biología, Universidad de Vigo, Vigo, Spain
| |
|
20
|
Vaughn JN, Li Z. Genomic Signatures of North American Soybean Improvement Inform Diversity Enrichment Strategies and Clarify the Impact of Hybridization. G3 (BETHESDA, MD.) 2016; 6:2693-705. [PMID: 27402364 PMCID: PMC5015928 DOI: 10.1534/g3.116.029215] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 04/11/2016] [Accepted: 06/14/2016] [Indexed: 11/18/2022]
Abstract
Crop improvement represents a long-running experiment in artificial selection on a complex trait, namely yield. How such selection relates to natural populations is unclear, but the analysis of domesticated populations could offer insights into the relative role of selection, drift, and recombination in all species facing major shifts in selective regimes. Because of the extreme autogamy exhibited by soybean (Glycine max), many "immortalized" genotypes of elite varieties spanning the last century have been preserved and characterized using ∼50,000 single nucleotide polymorphic (SNP) markers. Also due to autogamy, the history of North American soybean breeding can be roughly divided into pre- and posthybridization eras, allowing for direct interrogation of the role of recombination in improvement and selection. Here, we report on genome-wide characterization of the structure and history of North American soybean populations and the signature of selection in these populations. Supporting previous work, we find that maturity defines population structure. Though the diversity of North American ancestors is comparable to available landraces, prehybridization line selections resulted in a clonal structure that dominated early breeding and explains many of the reductions in diversity found in the initial generations of soybean hybridization. The rate of allele frequency change does not deviate sharply from neutral expectation, yet some regions bear hallmarks of strong selection, suggesting a highly variable range of selection strengths biased toward weak effects. We also discuss the importance of haplotypes as units of analysis when complex traits fall under novel selection regimes.
Affiliation(s)
- Justin N Vaughn
- Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia 30602; Department of Crop and Soil Science, University of Georgia, Athens, Georgia 30602
| | - Zenglu Li
- Center for Applied Genetic Technologies, University of Georgia, Athens, Georgia 30602; Department of Crop and Soil Science, University of Georgia, Athens, Georgia 30602
| |
|
21
|
Estimating the Effective Population Size from Temporal Allele Frequency Changes in Experimental Evolution. Genetics 2016; 204:723-735. [PMID: 27542959 PMCID: PMC5068858 DOI: 10.1534/genetics.116.191197] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 05/04/2016] [Accepted: 07/30/2016] [Indexed: 01/22/2023] Open
Abstract
The effective population size (Ne) is a major factor determining allele frequency changes in natural and experimental populations. Temporal methods provide a powerful and simple approach to estimate short-term Ne. They use allele frequency shifts between temporal samples to calculate the standardized variance, which is directly related to Ne. Here we focus on experimental evolution studies that often rely on repeated sequencing of samples in pools (Pool-seq). Pool-seq is cost-effective and often outperforms individual-based sequencing in estimating allele frequencies, but it is associated with atypical sampling properties: in addition to sampling individuals, sequencing DNA in pools leads to a second round of sampling, which increases the variance of allele frequency estimates. We propose a new estimator of Ne, which relies on allele frequency changes in temporal data and corrects for the variance in both sampling steps. In simulations, we obtain accurate Ne estimates, as long as the drift variance is not too small compared to the sampling and sequencing variance. In addition to genome-wide Ne estimates, we extend our method using a recursive partitioning approach to estimate Ne locally along the chromosome. Since the type I error is controlled, our method permits the identification of genomic regions that differ significantly in their Ne estimates. We present an application to Pool-seq data from experimental evolution with Drosophila and provide recommendations for whole-genome data. The estimator is computationally efficient and available as an R package at https://github.com/ThomasTaus/Nest.
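The link between the standardized variance and Ne described above can be sketched in Python with the classical temporal estimator. This is not the Pool-seq estimator proposed in the paper: the sketch assumes individual-based sampling of biallelic loci, uses the textbook Fc statistic, and corrects only for the two rounds of individual sampling. All function and parameter names are illustrative.

```python
def fc_statistic(p0, pt):
    """Mean standardized variance Fc over biallelic loci.

    p0 and pt are allele frequencies at the first and last time point.
    Loci fixed for the same allele at both time points are skipped.
    """
    terms = []
    for x, y in zip(p0, pt):
        denom = (x + y) / 2.0 - x * y
        if denom > 0.0:
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative loci")
    return sum(terms) / len(terms)


def temporal_ne(p0, pt, generations, s0, st):
    """Classical temporal Ne estimate from two samples.

    s0 and st are the numbers of diploid individuals sampled at each
    time point; their contribution to Fc is subtracted before inverting.
    """
    drift = fc_statistic(p0, pt) - 1.0 / (2.0 * s0) - 1.0 / (2.0 * st)
    if drift <= 0.0:
        return float("inf")  # sampling noise alone explains the change
    return generations / (2.0 * drift)


# Hypothetical single-locus example: 0.5 -> 0.6 over 10 generations,
# 100 individuals sampled each time.
print(round(temporal_ne([0.5], [0.6], 10, 100, 100), 1))  # 166.7
```

With pooled sequencing an additional variance term for read sampling would have to be subtracted as well, which is the correction the abstract refers to.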
|
22
|
A modified Wright-Fisher model that incorporates Ne: A variant of the standard model with increased biological realism and reduced computational complexity. J Theor Biol 2016; 393:218-28. [PMID: 26796316 DOI: 10.1016/j.jtbi.2016.01.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/13/2015] [Revised: 01/01/2016] [Accepted: 01/04/2016] [Indexed: 11/20/2022]
Abstract
The Wright-Fisher model is an important model in evolutionary biology and population genetics. It has been applied in numerous analyses of finite populations with discrete generations. It is recognised that real populations can behave, in some key aspects, as though their size is not the census size, N, but rather a smaller size, namely the effective population size, Ne. However, in the Wright-Fisher model, there is no distinction between the effective and census population sizes. Equivalently, we can say that in this model, Ne coincides with N. The Wright-Fisher model therefore lacks an important aspect of biological realism. Here, we present a method that allows Ne to be directly incorporated into the Wright-Fisher model. The modified model involves matrices whose size is determined by Ne. Thus apart from increased biological realism, the modified model also has reduced computational complexity, particularly so when Ne ≪ N. For complex problems, it may be hard or impossible to numerically analyse the most commonly-used approximation of the Wright-Fisher model that incorporates Ne, namely the diffusion approximation. An alternative approach is simulation. However, the simulations need to be sufficiently detailed that they yield an effective size that is different to the census size. Simulations may also be time consuming and have attendant statistical errors. The method presented in this work may then be the only alternative to simulations, when Ne differs from N. We illustrate the straightforward application of the method to some problems involving allele fixation and the determination of the equilibrium site frequency spectrum. We then apply the method to the problem of fixation when three alleles are segregating in a population. This latter problem is significantly more complex than a two allele problem and since the diffusion equation cannot be numerically solved, the only other way Ne can be incorporated into the analysis is by simulation. 
We have achieved good accuracy in all cases considered. In summary, the present work extends the realism and tractability of an important model of evolutionary biology and population genetics.
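To make the role of matrix size concrete, here is a minimal Python sketch of the standard (unmodified) neutral Wright-Fisher transition matrix for two alleles and its use in a fixation calculation. It does not reproduce the modified model of the paper; it only assumes a population of n gene copies with binomial resampling, so the matrix has (n + 1) × (n + 1) entries, which is why a matrix sized by Ne rather than N is cheaper when Ne ≪ N. The function names are illustrative.

```python
from math import comb


def wf_transition_matrix(n):
    """Row i gives the distribution of next-generation copy number
    when the current generation holds i copies out of n."""
    rows = []
    for i in range(n + 1):
        p = i / n
        rows.append([comb(n, j) * p**j * (1 - p) ** (n - j)
                     for j in range(n + 1)])
    return rows


def fixation_probability(n, start_copies, generations=500):
    """Probability mass on fixation after iterating the chain.

    For small n, 500 generations is far more than the chain needs
    to be absorbed at 0 or n copies.
    """
    matrix = wf_transition_matrix(n)
    dist = [0.0] * (n + 1)
    dist[start_copies] = 1.0
    for _ in range(generations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n + 1))
                for j in range(n + 1)]
    return dist[n]


# Neutral theory predicts start_copies / n; hypothetical example with n = 10.
print(round(fixation_probability(10, 3), 6))  # 0.3
```

Each generation costs on the order of n² operations in this form, so shrinking the matrix dimension has a direct effect on run time.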
|