1
Lukaszewicz M, Salia OI, Hohenlohe PA, Buzbas EO. Approximate Bayesian computational methods to estimate the strength of divergent selection in population genomics models. Journal of Computational Mathematics and Data Science 2024; 10:100091. [PMID: 38616846 PMCID: PMC11014422 DOI: 10.1016/j.jcmds.2024.100091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Statistical estimation of parameters in large models of evolutionary processes is often too computationally inefficient to pursue using exact model likelihoods, even with single-nucleotide polymorphism (SNP) data, which offers a way to reduce the size of genetic data while retaining relevant information. Approximate Bayesian Computation (ABC), used to perform statistical inference about parameters of large models, takes advantage of simulations to bypass direct evaluation of model likelihoods. We develop a mechanistic model to simulate forward-in-time divergent selection with variable migration rates, modes of reproduction (sexual, asexual), and length and number of migration-selection cycles. We investigate the computational feasibility of ABC to perform statistical inference and study the quality of estimates of the position of loci under selection and the strength of selection. To expand the parameter space of positions under selection, we enhance the model by implementing an outlier scan on summarized observed data. We evaluate the usefulness of summary statistics well known to capture the strength of selection, and assess their informativeness under divergent selection. We also evaluate the effect of genetic drift with respect to an idealized deterministic model with single-locus selection. We discuss the role of the recombination rate as a confounding factor in estimating the strength of divergent selection, and emphasize its importance in the breakdown of linkage disequilibrium (LD). We identify the part of the model's parameter space in which we recover a strong signal for estimating selection, and determine whether population differentiation-based summary statistics or LD-based summary statistics perform well in estimating selection.
Affiliation(s)
- Martyna Lukaszewicz
- Institute for Interdisciplinary Data Sciences (IIDS), University of Idaho, Moscow, ID, United States of America
- Department of Mathematics and Statistical Science, University of Idaho, Moscow, ID, United States of America
- Department of Biological Sciences, University of Idaho, Moscow, ID, United States of America
- Ousseini Issaka Salia
- Institute for Interdisciplinary Data Sciences (IIDS), University of Idaho, Moscow, ID, United States of America
- Institute for Modeling Collaboration and Innovation (IMCI), University of Idaho, Moscow, ID, United States of America
- Department of Mathematics and Statistical Science, University of Idaho, Moscow, ID, United States of America
- Department of Biological Sciences, University of Idaho, Moscow, ID, United States of America
- Department of Horticulture, Washington State University, Pullman, WA, United States of America
- Paul A. Hohenlohe
- Institute for Interdisciplinary Data Sciences (IIDS), University of Idaho, Moscow, ID, United States of America
- Institute for Modeling Collaboration and Innovation (IMCI), University of Idaho, Moscow, ID, United States of America
- Department of Mathematics and Statistical Science, University of Idaho, Moscow, ID, United States of America
- Department of Biological Sciences, University of Idaho, Moscow, ID, United States of America
- Erkan O. Buzbas
- Institute for Interdisciplinary Data Sciences (IIDS), University of Idaho, Moscow, ID, United States of America
- Institute for Modeling Collaboration and Innovation (IMCI), University of Idaho, Moscow, ID, United States of America
- Department of Mathematics and Statistical Science, University of Idaho, Moscow, ID, United States of America
2
Thia JA, Korhonen PK, Young ND, Gasser RB, Umina PA, Yang Q, Edwards O, Walsh T, Hoffmann AA. The redlegged earth mite draft genome provides new insights into pesticide resistance evolution and demography in its invasive Australian range. J Evol Biol 2023; 36:381-398. [PMID: 36573922 PMCID: PMC10107102 DOI: 10.1111/jeb.14144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 10/13/2022] [Accepted: 11/03/2022] [Indexed: 12/28/2022]
Abstract
Genomic data provide valuable insights into pest management issues such as resistance evolution, historical patterns of pest invasions and ongoing population dynamics. We assembled the first reference genome for the redlegged earth mite, Halotydeus destructor (Tucker, 1925), to investigate adaptation to pesticide pressures and demography in its invasive Australian range using whole-genome pool-seq data from regionally distributed populations. Our reference genome comprises 132 autosomal contigs, with a total length of 48.90 Mb. We observed a large complex of ace genes, which has presumably evolved from a long history of organophosphate selection in H. destructor and may contribute towards organophosphate resistance through copy number variation, target-site mutations and structural variants. In the putative ancestral H. destructor ace gene, we identified three target-site mutations (G119S, A201S and F331Y) segregating in organophosphate-resistant populations. Additionally, we identified two new para sodium channel gene mutations (L925I and F1020Y) that may contribute to pyrethroid resistance. Regional structuring observed in population genomic analyses indicates that gene flow in H. destructor does not homogenize populations across large geographic distances. However, our demographic analyses were equivocal on the magnitude of gene flow; the short invasion history of H. destructor makes it difficult to distinguish scenarios of complete isolation vs. ongoing migration. Nonetheless, we identified clear signatures of reduced genetic diversity and smaller inferred effective population sizes in eastern vs. western populations, which is consistent with the stepping-stone invasion pathway of this pest in Australia. These new insights will inform development of diagnostic genetic markers of resistance, further investigation into the multifaceted organophosphate resistance mechanism and predictive modelling of resistance evolution and spread.
Affiliation(s)
- Joshua A Thia
- Bio21 Institute, School of BioSciences, The University of Melbourne, Melbourne, Victoria, Australia
- Pasi K Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Melbourne, Victoria, Australia
- Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Melbourne, Victoria, Australia
- Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Melbourne, Victoria, Australia
- Qiong Yang
- Bio21 Institute, School of BioSciences, The University of Melbourne, Melbourne, Victoria, Australia
- Owain Edwards
- Land and Water, CSIRO, Floreat, Western Australia, Australia
- Tom Walsh
- CSIRO, Black Mountain Laboratories, Canberra, Australian Capital Territory, Australia; Applied BioSciences, Macquarie University, Sydney, New South Wales, Australia
- Ary A Hoffmann
- Bio21 Institute, School of BioSciences, The University of Melbourne, Melbourne, Victoria, Australia
3
North HL, McGaughran A, Jiggins CD. Insights into invasive species from whole-genome resequencing. Mol Ecol 2021; 30:6289-6308. [PMID: 34041794 DOI: 10.1111/mec.15999] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/18/2020] [Revised: 03/12/2021] [Accepted: 04/30/2021] [Indexed: 12/12/2022]
Abstract
Studies of invasive species can simultaneously inform management strategies and quantify rapid evolution in the wild. The role of genomics in invasion science is increasingly recognised, and the growing availability of reference genomes for invasive species is paving the way for whole-genome resequencing studies in a wide range of systems. Here, we survey the literature to assess the application of whole-genome resequencing data in invasion biology. For some applications, such as the reconstruction of invasion routes in time and space, sequencing the whole genome of many individuals can increase the accuracy of existing methods. In other cases, population genomic approaches such as haplotype analysis can permit entirely new questions to be addressed and new technologies applied. To date whole-genome resequencing has only been used in a handful of invasive systems, but these studies have confirmed the importance of processes such as balancing selection and hybridization in allowing invasive species to reuse existing adaptations and rapidly overcome the challenges of a foreign ecosystem. The use of genomic data does not constitute a paradigm shift per se, but by leveraging new theory, tools, and technologies, population genomics can provide unprecedented insight into basic and applied aspects of invasion science.
Affiliation(s)
- Henry L North
- Department of Zoology, University of Cambridge, Cambridge, UK
- Angela McGaughran
- Te Aka Mātuatua/School of Science, University of Waikato, Hamilton, New Zealand
- Chris D Jiggins
- Department of Zoology, University of Cambridge, Cambridge, UK
4
Bourgeois YXC, Warren BH. An overview of current population genomics methods for the analysis of whole-genome resequencing data in eukaryotes. Mol Ecol 2021; 30:6036-6071. [PMID: 34009688 DOI: 10.1111/mec.15989] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 12/30/2020] [Revised: 04/26/2021] [Accepted: 05/11/2021] [Indexed: 01/01/2023]
Abstract
Characterizing the population history of a species and identifying loci underlying local adaptation is crucial in functional ecology, evolutionary biology, conservation and agronomy. The constant improvement of high-throughput sequencing techniques has facilitated the production of whole genome data in a wide range of species. Population genomics now provides tools to better integrate selection into a historical framework, and take into account selection when reconstructing demographic history. However, this improvement has come with a profusion of analytical tools that can confuse and discourage users. Such confusion limits the amount of information effectively retrieved from complex genomic data sets, and impairs the diffusion of the most recent analytical tools into fields such as conservation biology. It may also lead to redundancy among methods. To address these issues, we propose an overview of more than 100 state-of-the-art methods that can deal with whole genome data. We summarize the strategies they use to infer demographic history and selection, and discuss some of their limitations. A website listing these methods is available at www.methodspopgen.com.
Affiliation(s)
- Ben H Warren
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, UA, CP 51, Paris, France
5
Fraïsse C, Popovic I, Mazoyer C, Spataro B, Delmotte S, Romiguier J, Loire É, Simon A, Galtier N, Duret L, Bierne N, Vekemans X, Roux C. DILS: Demographic inferences with linked selection by using ABC. Mol Ecol Resour 2021; 21:2629-2644. [PMID: 33448666 DOI: 10.1111/1755-0998.13323] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/18/2020] [Revised: 12/09/2020] [Accepted: 12/21/2020] [Indexed: 01/21/2023]
Abstract
We present DILS, a deployable statistical analysis platform for conducting demographic inferences with linked selection from population genomic data using an Approximate Bayesian Computation framework. DILS takes as input single-population or two-population data sets (multilocus fasta sequences) and performs three types of analyses in a hierarchical manner, identifying: (a) the best demographic model to study the importance of gene flow and population size change on the genetic patterns of polymorphism and divergence, (b) the best genomic model to determine whether the effective size Ne and migration rate N.m are heterogeneously distributed along the genome (implying linked selection) and (c) loci in genomic regions most associated with barriers to gene flow. Also available via a Web interface, an objective of DILS is to facilitate collaborative research in speciation genomics. Here, we show the performance and limitations of DILS by using simulations and finally apply the method to published data on a divergence continuum composed of 28 pairs of Mytilus mussel populations/species.
Affiliation(s)
- Christelle Fraïsse
- Institute of Science and Technology Austria, Klosterneuburg, Austria; Univ. Lille, CNRS, UMR 8198 - Evo-Eco-Paleo, Lille, France
- Iva Popovic
- School of Biological Sciences, University of Queensland, St Lucia, Qld, Australia
- Bruno Spataro
- Laboratoire de Biologie et Biométrie Évolutive CNRS UMR 5558, Université Claude Bernard, Lyon, France
- Stéphane Delmotte
- Laboratoire de Biologie et Biométrie Évolutive CNRS UMR 5558, Université Claude Bernard, Lyon, France
- Étienne Loire
- Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), UMR, ASTRE, Montpellier, France
- Alexis Simon
- ISEM, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Nicolas Galtier
- ISEM, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Laurent Duret
- Laboratoire de Biologie et Biométrie Évolutive CNRS UMR 5558, Université Claude Bernard, Lyon, France
- Nicolas Bierne
- ISEM, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
- Camille Roux
- Univ. Lille, CNRS, UMR 8198 - Evo-Eco-Paleo, Lille, France
6
Sanchez T, Cury J, Charpiat G, Jay F. Deep learning for population size history inference: Design, comparison and combination with approximate Bayesian computation. Mol Ecol Resour 2020; 21:2645-2660. [DOI: 10.1111/1755-0998.13224] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 01/17/2020] [Revised: 06/19/2020] [Accepted: 07/02/2020] [Indexed: 12/28/2022]
Affiliation(s)
- Théophile Sanchez
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Saclay, Orsay, France
- Jean Cury
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Saclay, Orsay, France
- Guillaume Charpiat
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Saclay, Orsay, France
- Flora Jay
- Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Saclay, Orsay, France