1
Guerrero Montero J, Blythe RA. Self-contained Beta-with-Spikes approximation for inference under a Wright-Fisher model. Genetics 2023; 225:iyad092. [PMID: 37226886 PMCID: PMC10550310 DOI: 10.1093/genetics/iyad092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 03/10/2023] [Accepted: 05/10/2023] [Indexed: 05/26/2023]
Abstract
We construct a reliable estimation method for evolutionary parameters within the Wright-Fisher model, which describes changes in allele frequencies due to selection and genetic drift, from time-series data. Such data exist for biological populations, for example via artificial evolution experiments, and for the cultural evolution of behavior, such as linguistic corpora that document historical usage of different words with similar meanings. Our method of analysis builds on a Beta-with-Spikes approximation to the distribution of allele frequencies predicted by the Wright-Fisher model. We introduce a self-contained scheme for estimating parameters in the approximation, and demonstrate its robustness with synthetic data, especially in the strong-selection and near-extinction regimes where previous approaches fail. We further apply the method to allele frequency data for baker's yeast (Saccharomyces cerevisiae), finding a significant signal of selection in cases where independent evidence supports such a conclusion. Finally, we demonstrate the possibility of detecting time points at which evolutionary parameters change in the context of a historical spelling reform in the Spanish language.
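As a rough illustration of the Wright-Fisher dynamics this abstract refers to, the sketch below simulates one allele-frequency trajectory under selection and drift. It is not the Beta-with-Spikes approximation from the paper. The function name, the haploid setting, and the fitness parametrization (relative fitness 1 + s) are assumptions made for this example only.

```python
import random


def wright_fisher_trajectory(n_individuals, s, x0, generations, seed=0):
    """Simulate one allele-frequency trajectory (hypothetical helper).

    Assumes a haploid population of fixed size in which the focal allele
    has relative fitness 1 + s. Each generation applies a deterministic
    selection step followed by binomial resampling (genetic drift).
    """
    rng = random.Random(seed)
    x = x0
    trajectory = [x]
    for _ in range(generations):
        # Selection: expected frequency after weighting by fitness.
        p = x * (1 + s) / (1 + s * x)
        # Drift: binomial sampling written as a sum of Bernoulli draws.
        count = sum(rng.random() < p for _ in range(n_individuals))
        x = count / n_individuals
        trajectory.append(x)
    return trajectory


traj = wright_fisher_trajectory(n_individuals=200, s=0.05, x0=0.1, generations=100)
print(len(traj), traj[0])
```

Frequencies 0 and 1 are absorbing in this sketch: once the allele is lost or fixed, the trajectory stays there. Those boundary states are what the "spikes" in the approximation's name correspond to.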
Affiliation(s)
- Juan Guerrero Montero
  - SUPA, School of Physics and Astronomy, University of Edinburgh, Edinburgh EH9 3FD, UK
- Richard A Blythe
  - Corresponding author: SUPA, School of Physics and Astronomy, University of Edinburgh, Edinburgh EH9 3FD, UK
2
Boitard S, Liaubet L, Paris C, Fève K, Dehais P, Bouquet A, Riquet J, Mercat MJ. Whole-genome sequencing of cryopreserved resources from French Large White pigs at two distinct sampling times reveals strong signatures of convergent and divergent selection between the dam and sire lines. Genet Sel Evol 2023; 55:13. [PMID: 36864379 PMCID: PMC9979506 DOI: 10.1186/s12711-023-00789-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/21/2022] [Accepted: 02/15/2023] [Indexed: 03/04/2023]
Abstract
BACKGROUND Numerous genomic scans for positive selection have been performed in livestock species within the last decade, but often a detailed characterization of the detected regions (gene or trait under selection, timing of selection events) is lacking. Cryopreserved resources stored in reproductive or DNA gene banks offer a great opportunity to improve this characterization by providing direct access to recent allele frequency dynamics, thereby differentiating between signatures from recent breeding objectives and those related to more ancient selection constraints. Improved characterization can also be achieved by using next-generation sequencing data, which helps narrow the size of the detected regions while reducing the number of associated candidate genes.
METHODS We estimated genetic diversity and detected signatures of recent selection in French Large White pigs by sequencing the genomes of 36 animals from three distinct cryopreserved samples: two recent samples from dam (LWD) and sire (LWS) lines, which had diverged from 1995 and were selected under partly different objectives, and an older sample from 1977 prior to the divergence.
RESULTS French LWD and LWS lines have lost approximately 5% of the SNPs that segregated in the 1977 ancestral population. Thirty-eight genomic regions under recent selection were detected in these lines and the corresponding selection events were further classified as convergent between lines (18 regions), divergent between lines (10 regions), specific to the dam line (6 regions) or specific to the sire line (4 regions). Several biological functions were found to be significantly enriched among the genes included in these regions: body size, body weight and growth regardless of the category; early life survival and calcium metabolism more specifically in the signatures in the dam line; and lipid and glycogen metabolism more specifically in the signatures in the sire line. Recent selection on IGF2 was confirmed and several other regions were linked to a single candidate gene (ARHGAP10, BMPR1B, GNA14, KATNA1, LPIN1, PKP1, PTH, SEMA3E or ZC3HAV1, among others).
CONCLUSIONS These results illustrate that sequencing the genome of animals at several recent time points generates considerable insight into the traits, genes and variants under recent selection in a population. This approach could be applied to other livestock populations, e.g. by exploiting the rich biological resources stored in cryobanks.
Affiliation(s)
- Simon Boitard
  - CBGP, CIRAD, INRAE, Institut Agro, IRD, Université de Montpellier, Montferrier-sur-Lez, France; GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Laurence Liaubet
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Cyriel Paris
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Katia Fève
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Patrice Dehais
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
- Alban Bouquet
  - IFIP Institut du porc/Alliance R & D, Le Rheu, France
- Juliette Riquet
  - GenPhySE, INRAE, INP, Université de Toulouse, Castanet-Tolosan, France
3
Mathieson I, Terhorst J. Direct detection of natural selection in Bronze Age Britain. Genome Res 2022; 32:2057-2067. [PMID: 36316157 PMCID: PMC9808619 DOI: 10.1101/gr.276862.122] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/24/2022] [Accepted: 08/29/2022] [Indexed: 11/04/2022]
Abstract
We developed a novel method for efficiently estimating time-varying selection coefficients from genome-wide ancient DNA data. In simulations, our method accurately recovers selective trajectories and is robust to misspecification of population size. We applied it to a large data set of ancient and present-day human genomes from Britain and identified seven loci with genome-wide significant evidence of selection in the past 4500 yr. Almost all of them can be related to increased vitamin D or calcium levels, suggesting strong selective pressure on these or related phenotypes. However, the strength of selection on individual loci varied substantially over time, suggesting that cultural or environmental factors moderated the genetic response. Of 28 complex anthropometric and metabolic traits, skin pigmentation was the only one with significant evidence of polygenic selection, further underscoring the importance of phenotypes related to vitamin D. Our approach illustrates the power of ancient DNA to characterize selection in human populations and illuminates the recent evolutionary history of Britain.
Affiliation(s)
- Iain Mathieson
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Jonathan Terhorst
  - Department of Statistics, University of Michigan, Ann Arbor, Michigan 48109, USA
4
Delpuech E, Aliakbari A, Labrune Y, Fève K, Billon Y, Gilbert H, Riquet J. Identification of genomic regions affecting production traits in pigs divergently selected for feed efficiency. Genet Sel Evol 2021; 53:49. [PMID: 34126920 PMCID: PMC8201702 DOI: 10.1186/s12711-021-00642-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/27/2020] [Accepted: 05/28/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Feed efficiency is a major driver of the sustainability of pig production systems. Understanding the biological mechanisms that underlie these agronomic traits is important for both environmental questions and farm economics. This study aimed at identifying genomic regions that affect residual feed intake (RFI) and other production traits in two pig lines divergently selected for RFI over nine generations (LRFI, low RFI; HRFI, high RFI).
RESULTS We built a dataset of 570,447 single nucleotide polymorphisms (SNPs) in 2426 pigs with records for 24 production traits after both imputation and prediction of genotypes using pedigree information. Genome-wide association studies (GWAS) were performed including both lines (global-GWAS) or each line independently (LRFI-GWAS and HRFI-GWAS). Forty-five chromosomal regions were detected in the global-GWAS, whereas 28 and 42 regions were detected in the HRFI-GWAS and LRFI-GWAS, respectively. Among these 45 regions, only 13 were shared between at least two analyses, and only one was common to all three GWAS, although it affected different traits. Among the five quantitative trait loci (QTL) detected for RFI, two were close to QTL for meat quality traits and two pinpointed novel genomic regions that harbor candidate genes involved in cell proliferation and differentiation processes of gastrointestinal tissues or in lipid metabolism-related signaling pathways. In most cases, different QTL regions were detected between the three designs, which suggests a strong impact of the dataset structure on the detection power and could be due to the changes in allelic frequencies during the establishment of lines.
CONCLUSIONS In addition to efficiently detecting known and new QTL regions for feed efficiency, the combination of GWAS carried out per line or simultaneously using all individuals highlighted chromosomal regions that affect production traits and presented significant changes in allelic frequencies across generations. Further analyses are needed to estimate whether these regions correspond to traces of selection or result from genetic drift.
Affiliation(s)
- Emilie Delpuech
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France
- Amir Aliakbari
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France
- Yann Labrune
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France
- Katia Fève
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France
- Hélène Gilbert
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France
- Juliette Riquet
  - GenPhySE, Université de Toulouse, INRAE, ENVT, 31320, Castanet-Tolosan, France
5
Boitard S, Paris C, Sevane N, Servin B, Bazi-Kabbaj K, Dunner S. Gene Banks as Reservoirs to Detect Recent Selection: The Example of the Asturiana de los Valles Bovine Breed. Front Genet 2021; 12:575405. [PMID: 33633776 PMCID: PMC7901938 DOI: 10.3389/fgene.2021.575405] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/23/2020] [Accepted: 01/05/2021] [Indexed: 11/13/2022]
Abstract
Gene banks, framed within the efforts for conserving animal genetic resources to ensure the adaptability of livestock production systems to population growth, income, and climate change challenges, have emerged as invaluable resources for biodiversity and scientific research. Allele frequency trajectories over the last few generations contain rich information about the selection history of populations, which cannot be obtained from classical selection scan approaches based on present-time data only. Here we apply a new statistical approach taking advantage of genomic time series and a state-of-the-art statistic (nSL) based on present-time data to disentangle both old and recent signatures of selection in the Asturiana de los Valles cattle breed. This local Spanish breed, native to Asturias and originally multipurpose, has been selected for beef production over the last few generations. With the use of SNP chip and whole-genome sequencing (WGS) data, we detect candidate regions under selection reflecting the effort of breeders to produce economically valuable beef individuals, e.g., by improving carcass and meat traits with genes such as MSTN, FLRT2, CRABP2, ZNF215, RBPMS2, OAZ2, or ZNF609, while maintaining the ability to thrive under a semi-intensive production system, with the selection of immune (GIMAP7, GIMAP4, GIMAP8, and TICAM1) or olfactory receptor (OR2D2, OR2D3, OR10A4, and OR6A2) genes. This kind of information will allow us to take advantage of the invaluable resources provided by gene bank collections from local, less competitive breeds, enabling the livestock industry to exploit the different mechanisms fine-tuned by natural and human-driven selection on different populations to improve productivity.
Affiliation(s)
- Simon Boitard
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet-Tolosan, France
- Cyriel Paris
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet-Tolosan, France
- Natalia Sevane
  - Dpto. Animal Production, Facultad de Veterinaria, Universidad Complutense de Madrid, Madrid, Spain
- Bertrand Servin
  - GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet-Tolosan, France
- Kenza Bazi-Kabbaj
  - GABI, INRAE, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France; SIGENAE, INRA, Jouy-en-Josas, France
- Susana Dunner
  - Dpto. Animal Production, Facultad de Veterinaria, Universidad Complutense de Madrid, Madrid, Spain
6
He Z, Dai X, Beaumont M, Yu F. Detecting and Quantifying Natural Selection at Two Linked Loci from Time Series Data of Allele Frequencies with Forward-in-Time Simulations. Genetics 2020; 216:521-541. [PMID: 32826299 PMCID: PMC7536848 DOI: 10.1534/genetics.120.303463] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/13/2019] [Accepted: 08/15/2020] [Indexed: 12/16/2022]
Abstract
Recent advances in DNA sequencing techniques have made it possible to monitor genomes in great detail over time. This improvement provides an opportunity for us to study natural selection based on time-series samples of genomes while accounting for the effect of genetic recombination and local linkage information. Such time series genomic data allow for more accurate estimation of population genetic parameters and hypothesis testing on the recent action of natural selection. In this work, we develop a novel Bayesian statistical framework for inferring natural selection at a pair of linked loci by capitalising on the temporal aspect of DNA data with the additional flexibility of modeling the sampled chromosomes that contain unknown alleles. Our approach is built on a hidden Markov model where the underlying process is a two-locus Wright-Fisher diffusion with selection, which enables us to explicitly model genetic recombination and local linkage. The posterior probability distribution for selection coefficients is computed by applying the particle marginal Metropolis-Hastings algorithm, which allows us to efficiently calculate the likelihood. We evaluate the performance of our Bayesian inference procedure through extensive simulations, showing that our approach can deliver accurate estimates of selection coefficients, and that the addition of genetic recombination and local linkage brings about significant improvement in the inference of natural selection. We also illustrate the utility of our method on real data with an application to ancient DNA data associated with white spotting patterns in horses.
Affiliation(s)
- Zhangyi He
  - School of Mathematics, University of Bristol, BS8 1UG, United Kingdom
- Xiaoyang Dai
  - School of Biological Sciences, University of Bristol, BS8 1TQ, United Kingdom
- Mark Beaumont
  - School of Biological Sciences, University of Bristol, BS8 1TQ, United Kingdom
- Feng Yu
  - School of Mathematics, University of Bristol, BS8 1UG, United Kingdom