1
Smith CCR, Patterson G, Ralph PL, Kern AD. Estimation of spatial demographic maps from polymorphism data using a neural network. bioRxiv 2024:2024.03.15.585300. [PMID: 38559192 PMCID: PMC10980082 DOI: 10.1101/2024.03.15.585300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
A fundamental goal in population genetics is to understand how variation is arrayed over natural landscapes. From first principles we know that common features such as heterogeneous population densities and source-sink dynamics of dispersal should shape genetic variation over space; however, there are few tools currently available that can deal with these ubiquitous complexities. Geographically referenced single nucleotide polymorphism (SNP) data are increasingly accessible, presenting an opportunity to study genetic variation across geographic space in myriad species. We present a new inference method that uses geo-referenced SNPs and a deep neural network to estimate spatially heterogeneous maps of population density and dispersal rate. Our neural network trains on simulated input and output pairings, where the input consists of genotypes and sampling locations generated from a continuous-space population genetic simulator, and the output is a map of the true demographic parameters. We benchmark our tool against existing methods and discuss qualitative differences between the different approaches; in particular, our program is unique because it infers the magnitude of both dispersal and density as well as their variation over the landscape, and it does so using SNP data. Similar methods are constrained to estimating relative migration rates, or require identity-by-descent blocks as input. We applied our tool to empirical data from North American grey wolves, for which it estimated mostly reasonable demographic parameters, but was affected by incomplete spatial sampling. Genetics-based methods like ours complement other, direct methods for estimating past and present demography, and we believe they will serve as valuable tools for applications in conservation, ecology, and evolutionary biology. An open-source software package implementing our method is available from https://github.com/kr-colab/mapNN.
Affiliation(s)
- Chris C. R. Smith
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Gilia Patterson
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Peter L. Ralph
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Andrew D. Kern
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
2
Champer SE, Chae B, Haller BC, Champer J, Messer PW. Resource-explicit interactions in spatial population models. bioRxiv 2024:2024.01.13.575512. [PMID: 38293045 PMCID: PMC10827080 DOI: 10.1101/2024.01.13.575512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Continuous-space population models can yield significantly different results from their panmictic counterparts when assessing evolutionary, ecological, or population-genetic processes. However, the computational burden of spatial models is typically much greater than that of panmictic models due to the overhead of determining which individuals interact with one another and how strongly they interact. Though these calculations are necessary to model local competition that regulates the population density, they can lead to prohibitively long runtimes. Here, we present a novel modeling method in which the resources available to a population are abstractly represented as an additional layer of the simulation. Instead of interacting directly with one another, individuals interact indirectly via this resource layer. We find that this method closely matches other spatial models, yet can dramatically increase the speed of the model, allowing the simulation of much larger populations. Additionally, models structured in this manner exhibit other desirable characteristics, including more realistic spatial dynamics near the edge of the simulated area, and an efficient route for modeling more complex heterogeneous landscapes.
Affiliation(s)
- Samuel E. Champer
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853
- Bryan Chae
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853
- Benjamin C. Haller
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853
- Jackson Champer
  - Center for Bioinformatics, School of Life Sciences, Center for Life Sciences, Peking University, Beijing, China 100871
- Philipp W. Messer
  - Department of Computational Biology, Cornell University, Ithaca, NY 14853
3
Smith CCR, Tittes S, Ralph PL, Kern AD. Dispersal inference from population genetic variation using a convolutional neural network. Genetics 2023; 224:iyad068. [PMID: 37052957 PMCID: PMC10213498 DOI: 10.1093/genetics/iyad068] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/08/2023] [Revised: 02/08/2023] [Accepted: 04/07/2023] [Indexed: 04/14/2023] Open
Abstract
The geographic nature of biological dispersal shapes patterns of genetic variation over landscapes, making it possible to infer properties of dispersal from genetic variation data. Here, we present an inference tool that uses geographically distributed genotype data in combination with a convolutional neural network to estimate a critical population parameter: the mean per-generation dispersal distance. Using extensive simulation, we show that our deep learning approach is competitive with or outperforms state-of-the-art methods, particularly at small sample sizes. In addition, we evaluate varying nuisance parameters during training (including population density, demographic history, habitat size, and sampling area) and show that this strategy is effective for estimating dispersal distance when other model parameters are unknown. Whereas competing methods depend on information about local population density or accurate inference of identity-by-descent tracts, our method uses only single-nucleotide-polymorphism data and the spatial scale of sampling as input. Strikingly, and unlike other methods, our method does not use the geographic coordinates of the genotyped individuals. These features make our method, which we call "disperseNN," a potentially valuable new tool for estimating dispersal distance in nonmodel systems with whole genome data or reduced representation data. We apply disperseNN to 12 different species with publicly available data, yielding reasonable estimates for most species. Importantly, our method estimated consistently larger dispersal distances than mark-recapture calculations in the same species, which may be due to the limited geographic sampling area covered by some mark-recapture studies. Thus genetic tools like ours complement direct methods for improving our understanding of dispersal.
Affiliation(s)
- Chris C R Smith
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Silas Tittes
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Peter L Ralph
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
- Andrew D Kern
  - Institute of Ecology and Evolution, University of Oregon, Eugene, OR 97403, USA
4
Smith TB, Weissman DB. Isolation by distance in populations with power-law dispersal. G3 (Bethesda) 2023; 13:jkad023. [PMID: 36718551 PMCID: PMC10085794 DOI: 10.1093/g3journal/jkad023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 01/07/2023] [Indexed: 02/01/2023]
Abstract
Limited dispersal of individuals between generations results in isolation by distance, in which individuals further apart in space tend to be less related. Classic models of isolation by distance assume that dispersal distances are drawn from a thin-tailed distribution and predict that the proportion of the genome that is identical by descent between a pair of individuals should decrease exponentially with the spatial separation between them. However, in many natural populations, individuals occasionally disperse over very long distances. In this work, we use mathematical analysis and coalescent simulations to study the effect of long-range (power-law) dispersal on patterns of isolation by distance. We find that it leads to power-law decay of identity-by-descent at large distances with the same exponent as dispersal. We also find that broad power-law dispersal produces another, shallow power-law decay of identity-by-descent at short distances. These results suggest that the distribution of long-range dispersal events could be estimated from sequencing large population samples taken from a wide range of spatial scales.
Affiliation(s)
- Tyler B Smith
  - Department of Physics, Emory University, Atlanta, Georgia 30322, USA
- Daniel B Weissman
  - Corresponding author: Department of Physics, Emory University, Atlanta, Georgia 30322, USA.
5
Fletcher RJ, Sefair JA, Kortessis N, Jaffe R, Holt RD, Robertson EP, Duncan SI, Marx AJ, Austin JD. Extending isolation by resistance to predict genetic connectivity. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
Affiliation(s)
- Robert J. Fletcher
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, USA
- Jorge A. Sefair
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, Arizona, USA
- Robert D. Holt
  - Department of Biology, University of Florida, Gainesville, Florida, USA
- Ellen P. Robertson
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, USA
- Sarah I. Duncan
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, USA
  - Department of Biology, Eckerd College, St. Petersburg, Florida, USA
- Andrew J. Marx
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, USA
- James D. Austin
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, USA
6
Steiner MC, Novembre J. Population genetic models for the spatial spread of adaptive variants: A review in light of SARS-CoV-2 evolution. PLoS Genet 2022; 18:e1010391. [PMID: 36137003 PMCID: PMC9498967 DOI: 10.1371/journal.pgen.1010391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
Theoretical population genetics has long studied the arrival and geographic spread of adaptive variants through the analysis of mathematical models of dispersal and natural selection. These models take on a renewed interest in the context of the COVID-19 pandemic, especially given the consequences that novel adaptive variants have had on the course of the pandemic as they have spread through global populations. Here, we review theoretical models for the spatial spread of adaptive variants and identify areas to be improved in future work, toward a better understanding of variants of concern in Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) evolution and other contemporary applications. As we describe, characteristics of pandemics such as COVID-19, including the impact of long-distance travel patterns and the overdispersion of lineages due to superspreading events, suggest new directions for improving upon existing population genetic models.
Affiliation(s)
- Margaret C. Steiner
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- John Novembre
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
  - Department of Ecology & Evolution, University of Chicago, Chicago, Illinois, United States of America
7
Chang CW, Fridman E, Mascher M, Himmelbach A, Schmid K. Physical geography, isolation by distance and environmental variables shape genomic variation of wild barley (Hordeum vulgare L. ssp. spontaneum) in the Southern Levant. Heredity (Edinb) 2022; 128:107-119. [PMID: 35017679 PMCID: PMC8814169 DOI: 10.1038/s41437-021-00494-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 08/28/2021] [Revised: 12/13/2021] [Accepted: 12/16/2021] [Indexed: 01/12/2023] Open
Abstract
Determining the extent of genetic variation that reflects local adaptation in crop-wild relatives is of interest for the purpose of identifying useful genetic diversity for plant breeding. We investigated the association of genomic variation with geographical and environmental factors in wild barley (Hordeum vulgare L. ssp. spontaneum) populations of the Southern Levant using genotyping by sequencing (GBS) of 244 accessions in the Barley 1K+ collection. The inference of population structure resulted in four genetic clusters that corresponded to eco-geographical habitats and a significant association between lower gene flow rates and geographical barriers, e.g. the Judaean Mountains and the Sea of Galilee. Redundancy analysis (RDA) revealed that spatial autocorrelation explained 45% and environmental variables explained 15% of total genomic variation. Only 4.5% of genomic variation was solely attributed to environmental variation if the component confounded with spatial autocorrelation was excluded. A synthetic environmental variable combining latitude, solar radiation, and accumulated precipitation explained the highest proportion of genomic variation (3.9%). When conditioned on population structure, soil water capacity was the most important environmental variable explaining 1.18% of genomic variation. Genome scans with outlier analysis and genome-environment association studies were conducted to identify adaptation signatures. RDA and outlier methods jointly detected selection signatures in the pericentromeric regions, which have reduced recombination, of the chromosomes 3H, 4H, and 5H. However, selection signatures mostly disappeared after correction for population structure. In conclusion, adaptation to the highly diverse environments of the Southern Levant over short geographical ranges had a limited effect on the genomic diversity of wild barley. This highlighted the importance of nonselective forces in genetic differentiation.
Affiliation(s)
- Eyal Fridman
  - Plant Sciences Institute, Agricultural Research Organization (ARO), The Volcani Center, Rishon LeZion, Israel
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland OT Gatersleben, Germany
- Axel Himmelbach
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland OT Gatersleben, Germany
- Karl Schmid
  - University of Hohenheim, Stuttgart, Germany.
8
Mulvaney JM, Matthee CA, Cherry MI. Species-landscape interactions drive divergent population trajectories in four forest-dependent Afromontane forest songbird species within a biodiversity hotspot in South Africa. Evol Appl 2021; 14:2680-2697. [PMID: 34815747 PMCID: PMC8591328 DOI: 10.1111/eva.13306] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2021] [Revised: 07/19/2021] [Accepted: 09/26/2021] [Indexed: 11/27/2022] Open
Abstract
Species confined to naturally fragmented habitats may exhibit intrinsic population complexity which may challenge interpretations of species response to anthropogenic landscape transformation. In South Africa, where native forests are naturally fragmented, forest-dependent birds have undergone range declines since 1992, most notably among insectivores. These insectivores appear sensitive to the quality of natural matrix habitats, and it is unknown whether transformation of the landscape matrix has disrupted gene flow in these species. We undertook a landscape genetics study of four forest-dependent insectivorous songbirds across southeast South Africa. Microsatellite data were used to conduct a priori optimization of landscape resistance surfaces (land cover, rivers and dams, and elevation) using cost-distances along least-cost pathway (LCP), and resistance distances (IBR). We detected pronounced declines in effective population sizes over the past two centuries for the endemic forest specialist Cossypha dichroa and Batis capensis, alongside recent gene flow disruption in B. capensis, C. dichroa and Pogonocichla stellata. Landscape resistance modelling showed that both native forest and dense thicket configuration facilitate gene flow in P. stellata, B. capensis and C. dichroa. Facultative dispersal of P. stellata through dense thicket likely aided resilience against historic landscape transformation, whereas combined forest-thicket degradation adversely affected the forest generalist B. capensis. By contrast, Phylloscopus ruficapilla appears least reliant upon landscape features to maintain gene flow and was least impacted by anthropogenic landscape transformation. Collectively, gene flow in all four species is improved at lower elevations, along river valleys, and riparian corridors, where native forest and dense thicket better persist. Consistent outperformance of LCP over IBR land-cover models for P. stellata, B. capensis and C. dichroa demonstrates the benefits of wildlife corridors for South African forest-dependent bird conservation, to ameliorate the extinction debts from past and present anthropogenic forest exploitation.
Affiliation(s)
- Jake M. Mulvaney
  - Department of Botany and Zoology, Stellenbosch University, Matieland, South Africa
- Conrad A. Matthee
  - Department of Botany and Zoology, Stellenbosch University, Matieland, South Africa
- Michael I. Cherry
  - Department of Botany and Zoology, Stellenbosch University, Matieland, South Africa
9
Marcus J, Ha W, Barber RF, Novembre J. Fast and flexible estimation of effective migration surfaces. eLife 2021; 10:61927. [PMID: 34328078 PMCID: PMC8324296 DOI: 10.7554/elife.61927] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/08/2020] [Accepted: 06/07/2021] [Indexed: 12/12/2022] Open
Abstract
Spatial population genetic data often exhibits ‘isolation-by-distance,’ where genetic similarity tends to decrease as individuals become more geographically distant. The rate at which genetic similarity decays with distance is often spatially heterogeneous due to variable population processes like genetic drift, gene flow, and natural selection. Petkova et al., 2016 developed a statistical method called Estimating Effective Migration Surfaces (EEMS) for visualizing spatially heterogeneous isolation-by-distance on a geographic map. While EEMS is a powerful tool for depicting spatial population structure, it can suffer from slow runtimes. Here, we develop a related method called Fast Estimation of Effective Migration Surfaces (FEEMS). FEEMS uses a Gaussian Markov Random Field model in a penalized likelihood framework that allows for efficient optimization and output of effective migration surfaces. Further, the efficient optimization facilitates the inference of migration parameters per edge in the graph, rather than per node (as in EEMS). With simulations, we show conditions under which FEEMS can accurately recover effective migration surfaces with complex gene-flow histories, including those with anisotropy. We apply FEEMS to population genetic data from North American gray wolves and show it performs favorably in comparison to EEMS, with solutions obtained orders of magnitude faster. Overall, FEEMS expands the ability of users to quickly visualize and interpret spatial structure in their data.
Affiliation(s)
- Joseph Marcus
  - Department of Human Genetics, University of Chicago, Chicago, United States
- Wooseok Ha
  - Department of Statistics, University of California, Berkeley, Berkeley, United States
- John Novembre
  - Department of Human Genetics, University of Chicago, Chicago, United States
  - Department of Ecology and Evolution, University of Chicago, Chicago, United States
10
Chafin TK, Zbinden ZD, Douglas MR, Martin BT, Middaugh CR, Gray MC, Ballard JR, Douglas ME. Spatial population genetics in heavily managed species: Separating patterns of historical translocation from contemporary gene flow in white-tailed deer. Evol Appl 2021; 14:1673-1689. [PMID: 34178112 PMCID: PMC8210790 DOI: 10.1111/eva.13233] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/27/2020] [Accepted: 03/10/2021] [Indexed: 01/16/2023] Open
Abstract
Approximately 100 years ago, unregulated harvest nearly eliminated white-tailed deer (Odocoileus virginianus) from eastern North America, which subsequently served to catalyze wildlife management as a national priority. An extensive stock-replenishment effort soon followed, with deer broadly translocated among states as a means of re-establishment. However, an unintended consequence was that natural patterns of gene flow became obscured and pretranslocation signatures of population structure were replaced. We applied cutting-edge molecular and biogeographic tools to disentangle genetic signatures of historical management from those reflecting spatially heterogeneous dispersal by evaluating 35,099 single nucleotide polymorphisms (SNPs) derived via reduced-representation genomic sequencing from 1143 deer sampled statewide in Arkansas. We then employed Simpson's diversity index to summarize ancestry assignments and visualize spatial genetic transitions. Using sub-sampled transects across these transitions, we tested clinal patterns across loci against theoretical expectations of their response under scenarios of re-colonization and restricted dispersal. Two salient results emerged: (A) Genetic signatures from historic translocations are demonstrably apparent; and (B) Geographic filters (major rivers; urban centers; highways) now act as inflection points for the distribution of this contemporary ancestry. These results yielded a statewide assessment of contemporary population structure in deer as driven by historic translocations as well as ongoing processes. In addition, the analytical framework employed herein to effectively decipher extant/historic drivers of deer distribution in Arkansas is also applicable for other biodiversity elements with similarly complex demographic histories.
Affiliation(s)
- Tyler K. Chafin
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
  - Present address: Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA
- Zachery D. Zbinden
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Marlis R. Douglas
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Bradley T. Martin
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- M. Cory Gray
  - Research Division, Arkansas Game and Fish Commission, Little Rock, AR, USA
- Michael E. Douglas
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
11
Castilla AR, Méndez-Vigo B, Marcer A, Martínez-Minaya J, Conesa D, Picó FX, Alonso-Blanco C. Ecological, genetic and evolutionary drivers of regional genetic differentiation in Arabidopsis thaliana. BMC Evol Biol 2020; 20:71. [PMID: 32571210 PMCID: PMC7310121 DOI: 10.1186/s12862-020-01635-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/13/2020] [Accepted: 06/01/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Disentangling the drivers of genetic differentiation is one of the cornerstones in evolution. This is because genetic diversity, and the way in which it is partitioned within and among populations across space, is an important asset for the ability of populations to adapt and persist in changing environments. We tested three major hypotheses accounting for genetic differentiation, namely isolation-by-distance (IBD), isolation-by-environment (IBE) and isolation-by-resistance (IBR), in the annual plant Arabidopsis thaliana across the Iberian Peninsula, the region with the largest genomic diversity. To that end, we sampled, genotyped with genome-wide SNPs, and analyzed 1772 individuals from 278 populations distributed across the Iberian Peninsula.
RESULTS: IBD, and to a lesser extent IBE, were the most important drivers of genetic differentiation in A. thaliana. In other words, dispersal limitation, genetic drift, and to a lesser extent local adaptation to environmental gradients, accounted for the within- and among-population distribution of genetic diversity. Analyses applied to the four Iberian genetic clusters, which represent the joint outcome of the long demographic and adaptive history of the species in the region, showed similar results except for one cluster, in which IBR (a function of landscape heterogeneity) was the most important driver of genetic differentiation. Using spatial hierarchical Bayesian models, we found that precipitation seasonality and topsoil pH chiefly accounted for the geographic distribution of genetic diversity in Iberian A. thaliana.
CONCLUSIONS: Overall, the interplay between the influence of precipitation seasonality on genetic diversity and the effect of restricted dispersal and genetic drift on genetic differentiation emerges as the major force underlying the evolutionary trajectory of Iberian A. thaliana.
Affiliation(s)
- Antonio R Castilla
  - Centre for Applied Ecology "Prof. Baeta Neves", InBIO, School of Agriculture, University of Lisbon, Lisbon, Portugal
  - Departamento de Ecología Integrativa, Estación Biológica de Doñana (EBD), Consejo Superior de Investigaciones Científicas (CSIC), Sevilla, Spain
- Belén Méndez-Vigo
  - Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
- Arnald Marcer
  - CREAF, Centre de Recerca Ecològica i Aplicacions Forestals, Bellaterra, E08193, Cerdanyola de Vallès, Catalonia, Spain
  - Universitat Autònoma de Barcelona, Bellaterra, E08193, Cerdanyola de Vallès, Catalonia, Spain
- David Conesa
  - Departament d'Estadística i Investigació Operativa, Universitat de València, Valencia, Spain
- F Xavier Picó
  - Departamento de Ecología Integrativa, Estación Biológica de Doñana (EBD), Consejo Superior de Investigaciones Científicas (CSIC), Sevilla, Spain.
- Carlos Alonso-Blanco
  - Departamento de Genética Molecular de Plantas, Centro Nacional de Biotecnología (CNB), Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain
12
Battey CJ, Ralph PL, Kern AD. Space is the Place: Effects of Continuous Spatial Structure on Analysis of Population Genetic Data. Genetics 2020; 215:193-214. [PMID: 32209569 PMCID: PMC7198281 DOI: 10.1534/genetics.120.303143] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 12/06/2019] [Accepted: 03/12/2020] [Indexed: 12/14/2022] Open
Abstract
Real geography is continuous, but standard models in population genetics are based on discrete, well-mixed populations. As a result, many methods of analyzing genetic data assume that samples are a random draw from a well-mixed population, but are applied to clustered samples from populations that are structured clinally over space. Here, we use simulations of populations living in continuous geography to study the impacts of dispersal and sampling strategy on population genetic summary statistics, demographic inference, and genome-wide association studies (GWAS). We find that most common summary statistics have distributions that differ substantially from those seen in well-mixed populations, especially when Wright's neighborhood size is < 100 and sampling is spatially clustered. "Stepping-stone" models reproduce some of these effects, but discretizing the landscape introduces artifacts that in some cases are exacerbated at higher resolutions. The combination of low dispersal and clustered sampling causes demographic inference from the site frequency spectrum to infer more turbulent demographic histories, but averaged results across multiple simulations revealed surprisingly little systematic bias. We also show that the combination of spatially autocorrelated environments and limited dispersal causes GWAS to identify spurious signals of genetic association with purely environmentally determined phenotypes, and that this bias is only partially corrected by regressing out principal components of ancestry. Last, we discuss the relevance of our simulation results for inference from genetic variation in real organisms.
Affiliation(s)
- C J Battey
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, Eugene, Oregon
- Peter L Ralph
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, Eugene, Oregon
- Andrew D Kern
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, Eugene, Oregon
13
Thomaz AT, He Q. When are populations not connected like a circuit? Identifying biases in gene flow from coalescent times. Mol Ecol Resour 2020; 19:1381-1384. [PMID: 31657534 DOI: 10.1111/1755-0998.13075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/19/2019] [Revised: 08/06/2019] [Accepted: 08/07/2019] [Indexed: 11/28/2022]
Abstract
Connectivity and movement patterns of populations are influenced by past and present environmental and biotic factors, which are reflected in genetic relatedness among populations. Methods that estimate the "commute time" between populations based on electrical resistance (i.e., isolation-by-resistance [IBR]) have been widely applied to either infer movement patterns directly from environmental factors or detect possible barriers to gene flow given empirical genetic relatedness. Yet, the commute time is only equivalent to the coalescence time between populations under symmetric migration with isotropic landscapes. Asymmetric gene flow is relatively common when populations are expanding, retreating, or with source-sink dynamics. In a From the Cover paper in this issue of Molecular Ecology Resources, Lundgren and Ralph (Molecular Ecology Resources, 19, 2019) describe a Bayesian method to infer bidirectional gene flow rates and population sizes without the assumption of symmetry. The method shows great accuracy in connectivity estimations under symmetric, as well as asymmetric gene flow scenarios where resistance methods fail. However, computational complexity limits the method to a few populations, preventing its application to finescale environmental maps. Also, as a discrete-deme static model, the inferred differences in gene flow rates are sensitive to population discretization and cannot be directly used to differentiate among processes (e.g., past expansion vs. current barrier). Here, we discuss scenarios where the new method can best be utilized and provide potential directions to identify the underlying processes causing deviations in gene flow patterns.
Affiliation(s)
- Andréa T Thomaz
  - Department of Zoology and Biodiversity Research Centre, University of British Columbia, Vancouver, BC, Canada
- Qixin He
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
14
Narum S, Kelley J, Sibbett B. Editorial 2020. Mol Ecol Resour 2020; 20:1-7. [DOI: 10.1111/1755-0998.13125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2019] [Accepted: 12/09/2019] [Indexed: 11/29/2022]
15
Bradburd GS, Ralph PL. Spatial Population Genetics: It's About Time. Annual Review of Ecology, Evolution, and Systematics 2019. [DOI: 10.1146/annurev-ecolsys-110316-022659] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Indexed: 11/09/2022]
Abstract
Many important questions about the history and dynamics of organisms have a geographical component: How many are there, and where do they live? How do they move and interbreed across the landscape? How were they moving a thousand years ago, and where were the ancestors of a particular individual alive today? Answers to these questions can have profound consequences for our understanding of history, ecology, and the evolutionary process. In this review, we discuss how geographic aspects of the distribution, movement, and reproduction of organisms are reflected in their pedigree across space and time. Because the structure of the pedigree is what determines patterns of relatedness in modern genetic variation, our aim is to thus provide intuition for how these processes leave an imprint in genetic data. We also highlight some current methods and gaps in the statistical toolbox of spatial population genetics.
Affiliation(s)
- Gideon S. Bradburd
  - Ecology, Evolutionary Biology, and Behavior Group, Department of Integrative Biology, Michigan State University, East Lansing, Michigan 48824, USA
- Peter L. Ralph
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, Eugene, Oregon 97403, USA
  - Department of Mathematics, University of Oregon, Eugene, Oregon 97403, USA