1
Dutheil JY. On the estimation of genome-average recombination rates. Genetics 2024; 227:iyae051. [PMID: 38565705 PMCID: PMC11232287 DOI: 10.1093/genetics/iyae051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 03/13/2024] [Accepted: 03/20/2024] [Indexed: 04/04/2024] Open
Abstract
The rate at which recombination events occur in a population is an indicator of its effective population size and the organism's reproduction mode. It determines the extent of linkage disequilibrium along the genome and, thereby, the efficacy of both purifying and positive selection. The population recombination rate can be inferred using models of genome evolution in populations. Classic methods based on the patterns of linkage disequilibrium provide the most accurate estimates, providing large sample sizes are used and the demography of the population is properly accounted for. Here, the capacity of approaches based on the sequentially Markov coalescent (SMC) to infer the genome-average recombination rate from as little as a single diploid genome is examined. SMC approaches provide highly accurate estimates even in the presence of changing population sizes, providing that (1) within genome heterogeneity is accounted for and (2) classic maximum-likelihood optimization algorithms are employed to fit the model. SMC-based estimates proved sensitive to gene conversion, leading to an overestimation of the recombination rate if conversion events are frequent. Conversely, methods based on the correlation of heterozygosity succeed in disentangling the rate of crossing over from that of gene conversion events, but only when the population size is constant and the recombination landscape homogeneous. These results call for a convergence of these two methods to obtain accurate and comparable estimates of recombination rates between populations.
Affiliation(s)
- Julien Y Dutheil
  - Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön 24306, Germany
2
Freudiger A, Jovanovic VM, Huang Y, Snyder-Mackler N, Conrad DF, Miller B, Montague MJ, Westphal H, Stadler PF, Bley S, Horvath JE, Brent LJN, Platt ML, Ruiz-Lambides A, Tung J, Nowick K, Ringbauer H, Widdig A. Taking identity-by-descent analysis into the wild: Estimating realized relatedness in free-ranging macaques. bioRxiv 2024:2024.01.09.574911. [PMID: 38260273 PMCID: PMC10802400 DOI: 10.1101/2024.01.09.574911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Biological relatedness is a key consideration in studies of behavior, population structure, and trait evolution. Except for parent-offspring dyads, pedigrees capture relatedness imperfectly. The number and length of DNA segments that are identical-by-descent (IBD) yield the most precise estimates of relatedness. Here, we leverage novel methods for estimating locus-specific IBD from low coverage whole genome resequencing data to demonstrate the feasibility and value of resolving fine-scaled gradients of relatedness in free-living animals. Using primarily 4-6× coverage data from a rhesus macaque (Macaca mulatta) population with available long-term pedigree data, we show that we can call the number and length of IBD segments across the genome with high accuracy even at 0.5× coverage. The resulting estimates demonstrate substantial variation in genetic relatedness within kin classes, leading to overlapping distributions between kin classes. They identify cryptic genetic relatives that are not represented in the pedigree and reveal elevated recombination rates in females relative to males, which allows us to discriminate maternal and paternal kin using genotype data alone. Our findings represent a breakthrough in the ability to understand the predictors and consequences of genetic relatedness in natural populations, contributing to our understanding of a fundamental component of population structure in the wild.
Affiliation(s)
- Annika Freudiger
  - Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
  - Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Vladimir M Jovanovic
  - Human Biology and Primate Evolution, Institut für Zoologie, Freie Universität Berlin, Berlin, Germany
  - Bioinformatics Solution Center, Freie Universität Berlin, Berlin, Germany
- Yilei Huang
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  - Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Noah Snyder-Mackler
  - Center for Evolution & Medicine, School of Life Sciences, Arizona State University, Tempe, USA
- Donald F Conrad
  - Division of Genetics, Oregon National Primate Research Center, Portland, Oregon, USA
- Brian Miller
  - Division of Genetics, Oregon National Primate Research Center, Portland, Oregon, USA
- Michael J Montague
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Hendrikje Westphal
  - Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
  - Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  - Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Peter F Stadler
  - Bioinformatics Group, Institute of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, Austria
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, NM, USA
- Stefanie Bley
  - Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
  - Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Julie E Horvath
  - Department of Biological and Biomedical Sciences, North Carolina Central University, North Carolina, Durham, USA
  - Research and Collections Section, North Carolina Museum of Natural Sciences, North Carolina, Raleigh, USA
  - Department of Biological Sciences, North Carolina State University, North Carolina, Raleigh, USA
  - Department of Evolutionary Anthropology, Duke University, North Carolina, Durham, USA
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Lauren J N Brent
  - Centre for Research in Animal Behaviour, University of Exeter, Exeter, UK
- Michael L Platt
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Marketing Department, the Wharton School of Business, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Psychology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA
- Angelina Ruiz-Lambides
  - Cayo Santiago Field Station, Caribbean Primate Research Center, University of Puerto Rico, Punta Santiago, Puerto Rico
- Jenny Tung
  - Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  - Department of Evolutionary Anthropology, Duke University, North Carolina, Durham, USA
  - Department of Biology, Duke University, Durham, North Carolina, USA
  - Duke University Population Research Institute, Durham, North Carolina, USA
- Katja Nowick
  - Human Biology and Primate Evolution, Institut für Zoologie, Freie Universität Berlin, Berlin, Germany
  - Bioinformatics Solution Center, Freie Universität Berlin, Berlin, Germany
- Harald Ringbauer
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Anja Widdig
  - Behavioral Ecology Research Group, Faculty of Life Sciences, Institute of Biology, Leipzig University, Leipzig, Germany
  - Department of Primate Behavior and Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  - German Centre for Integrative Biodiversity Research (iDiv), Halle-Jena-Leipzig, Germany
3
Bascón-Cardozo K, Bours A, Manthey G, Durieux G, Dutheil JY, Pruisscher P, Odenthal-Hesse L, Liedvogel M. Fine-Scale Map Reveals Highly Variable Recombination Rates Associated with Genomic Features in the Eurasian Blackcap. Genome Biol Evol 2024; 16:evad233. [PMID: 38198800 PMCID: PMC10781513 DOI: 10.1093/gbe/evad233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/12/2023] [Indexed: 01/12/2024] Open
Abstract
Recombination is responsible for breaking up haplotypes, influencing genetic variability, and the efficacy of selection. Bird genomes lack the protein PR domain-containing protein 9, a key determinant of recombination dynamics in most metazoans. Historical recombination maps in birds show an apparent stasis in positioning recombination events. This highly conserved recombination pattern over long timescales may constrain the evolution of recombination in birds. At the same time, extensive variation in recombination rate is observed across the genome and between different species of birds. Here, we characterize the fine-scale historical recombination map of an iconic migratory songbird, the Eurasian blackcap (Sylvia atricapilla), using a linkage disequilibrium-based approach that accounts for population demography. Our results reveal variable recombination rates among and within chromosomes, which associate positively with nucleotide diversity and GC content and negatively with chromosome size. Recombination rates increased significantly at regulatory regions but not necessarily at gene bodies. CpG islands are associated strongly with recombination rates, though their specific position and local DNA methylation patterns likely influence this relationship. The association with retrotransposons varied according to specific family and location. Our results also provide evidence of heterogeneous intrachromosomal conservation of recombination maps between the blackcap and its closest sister taxon, the garden warbler. These findings highlight the considerable variability of recombination rates at different scales and the role of specific genomic features in shaping this variation. This study opens the possibility of further investigating the impact of recombination on specific population-genomic features.
Affiliation(s)
- Karen Bascón-Cardozo
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Andrea Bours
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Georg Manthey
  - Institute of Avian Research “Vogelwarte Helgoland”, Wilhelmshaven 26386, Germany
- Gillian Durieux
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Julien Y Dutheil
  - Department for Theoretical Biology, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Peter Pruisscher
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
  - Department of Zoology, Stockholm University, Stockholm SE-106 91, Sweden
- Linda Odenthal-Hesse
  - Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
- Miriam Liedvogel
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, Plön 24306, Germany
  - Institute of Avian Research “Vogelwarte Helgoland”, Wilhelmshaven 26386, Germany
  - Department of Biology and Environmental Sciences, Carl von Ossietzky University of Oldenburg, Oldenburg 26129, Germany
4
Dinh BL, Tang E, Taparra K, Nakatsuka N, Chen F, Chiang CWK. Recombination map tailored to Native Hawaiians may improve robustness of genomic scans for positive selection. Hum Genet 2024; 143:85-99. [PMID: 38157018 PMCID: PMC10794367 DOI: 10.1007/s00439-023-02625-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 11/25/2023] [Indexed: 01/03/2024]
Abstract
Recombination events establish the patterns of haplotypic structure in a population and estimates of recombination rates are used in several downstream population and statistical genetic analyses. Using suboptimal maps from distantly related populations may reduce the efficacy of genomic analyses, particularly for underrepresented populations such as the Native Hawaiians. To overcome this challenge, we constructed recombination maps using genome-wide array data from two study samples of Native Hawaiians: one reflecting the current admixed state of Native Hawaiians (NH map) and one based on individuals of enriched Polynesian ancestries (PNS map) with the potential to be used for less admixed Polynesian populations such as the Samoans. We found the recombination landscape to be less correlated with those from other continental populations (e.g. Spearman's rho = 0.79 between PNS and CEU (Utah residents with Northern and Western European ancestry) compared to 0.92 between YRI (Yoruba in Ibadan, Nigeria) and CEU at 50 kb resolution), likely driven by the unique demographic history of the Native Hawaiians. PNS also shared the fewest recombination hotspots with other populations (e.g. 8% of hotspots shared between PNS and CEU compared to 27% of hotspots shared between YRI and CEU). We found that downstream analyses in the Native Hawaiian population, such as local ancestry inference, imputation, and IBD segment and relatedness detections, would achieve similar efficacy when using the NH map compared to an omnibus map. However, for genome scans of adaptive loci using integrated haplotype scores, we found several loci with apparent genome-wide significant signals (|Z-score|> 4) in Native Hawaiians that would not have been significant when analyzed using NH-specific maps. 
Population-specific recombination maps may therefore improve the robustness of haplotype-based statistics and help us better characterize the evolutionary history that may underlie Native Hawaiian-specific health conditions that persist today.
Affiliation(s)
- Bryan L Dinh
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Echo Tang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Kekoa Taparra
  - Department of Radiation Oncology, Stanford University, Palo Alto, CA, USA
- Fei Chen
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Charleston W K Chiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
5
Spence JP, Zeng T, Mostafavi H, Pritchard JK. Scaling the discrete-time Wright-Fisher model to biobank-scale datasets. Genetics 2023; 225:iyad168. [PMID: 37724741 PMCID: PMC10627256 DOI: 10.1093/genetics/iyad168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 06/01/2023] [Accepted: 09/08/2023] [Indexed: 09/21/2023] Open
Abstract
The discrete-time Wright-Fisher (DTWF) model and its diffusion limit are central to population genetics. These models can describe the forward-in-time evolution of allele frequencies in a population resulting from genetic drift, mutation, and selection. Computing likelihoods under the diffusion process is feasible, but the diffusion approximation breaks down for large samples or in the presence of strong selection. Existing methods for computing likelihoods under the DTWF model do not scale to current exome sequencing sample sizes in the hundreds of thousands. Here, we present a scalable algorithm that approximates the DTWF model with provably bounded error. Our approach relies on two key observations about the DTWF model. The first is that transition probabilities under the model are approximately sparse. The second is that transition distributions for similar starting allele frequencies are extremely close as distributions. Together, these observations enable approximate matrix-vector multiplication in linear (as opposed to the usual quadratic) time. We prove similar properties for Hypergeometric distributions, enabling fast computation of likelihoods for subsamples of the population. We show theoretically and in practice that this approximation is highly accurate and can scale to population sizes in the tens of millions, paving the way for rigorous biobank-scale inference. Finally, we use our results to estimate the impact of larger samples on estimating selection coefficients for loss-of-function variants. We find that increasing sample sizes beyond existing large exome sequencing cohorts will provide essentially no additional information except for genes with the most extreme fitness effects.
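For readers who want a concrete picture of the model this abstract refers to, the following is a minimal Python sketch of one naive DTWF generation. It is not the authors' algorithm: it is the plain quadratic-time update that the abstract says their method approximates, written here for a haploid population with genic selection and no mutation. The function names (`binom_pmf`, `dtwf_step`, `mass_within`), the selection parameterization, and the window width are illustrative assumptions, not taken from the paper.

```python
import math


def binom_pmf(n, k, p):
    """Binomial(n, p) probability of exactly k successes, computed in log space."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def dtwf_step(freqs, s=0.0):
    """Advance the allele-count distribution by one generation.

    freqs[i] is the probability that i of the N = len(freqs) - 1 individuals
    carry the allele. Assumed model: haploid, genic selection with
    coefficient s, no mutation. Cost is O(N^2) per generation.
    """
    n = len(freqs) - 1
    new = [0.0] * (n + 1)
    for i, weight in enumerate(freqs):
        if weight == 0.0:
            continue
        x = i / n
        p = x * (1.0 + s) / (1.0 + s * x)  # expected frequency after selection
        for j in range(n + 1):
            new[j] += weight * binom_pmf(n, j, p)
    return new


def mass_within(n, p, width=6.0):
    """Probability mass of Binomial(n, p) within `width` standard deviations of its mean."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    lo = max(0, math.floor(mean - width * sd))
    hi = min(n, math.ceil(mean + width * sd))
    return sum(binom_pmf(n, k, p) for k in range(lo, hi + 1))
```

Calling `dtwf_step` on a point mass (for example, 10 copies in a population of 100) gives the next-generation count distribution, and `mass_within(1000, 0.3)` illustrates the "approximately sparse" observation: nearly all of each transition row sits in a narrow window around its mean, so most entries of the transition matrix contribute almost nothing.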
Affiliation(s)
- Jeffrey P Spence
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Tony Zeng
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Jonathan K Pritchard
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Biology, Stanford University, Stanford, CA 94305, USA
6
Bours A, Pruisscher P, Bascón-Cardozo K, Odenthal-Hesse L, Liedvogel M. The blackcap (Sylvia atricapilla) genome reveals a recent accumulation of LTR retrotransposons. Sci Rep 2023; 13:16471. [PMID: 37777595 PMCID: PMC10542752 DOI: 10.1038/s41598-023-43090-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/14/2022] [Accepted: 09/19/2023] [Indexed: 10/02/2023] Open
Abstract
Transposable elements (TEs) are mobile genetic elements that can move around the genome, and as such are a source of genomic variability. Based on their characteristics we can annotate TEs within the host genome and classify them into specific TE types and families. The increasing number of available high-quality genome references in recent years provides an excellent resource that will enhance the understanding of the role of recently active TEs on genetic variation and phenotypic evolution. Here we showcase the use of a high-quality TE annotation to understand the distinct effect of recent and ancient TE insertions on the evolution of genomic variation, within our study species the Eurasian blackcap (Sylvia atricapilla). We investigate how these distinct TE categories are distributed along the genome and evaluate how their coverage across the genome is correlated with four genomic features: recombination rate, gene coverage, CpG island coverage and GC content. We found within the recent TE insertions an accumulation of LTRs previously not seen in birds. While the coverage of recent TE insertions was negatively correlated with both GC content and recombination rate, the correlation with recombination rate disappeared and turned positive for GC content when considering ancient TE insertions.
Affiliation(s)
- Andrea Bours
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany
- Peter Pruisscher
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany
  - Department of Evolutionary Biology, Evolutionary Biology Centre (EBC), Uppsala University, Uppsala, Sweden
- Karen Bascón-Cardozo
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany
- Linda Odenthal-Hesse
  - Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany
- Miriam Liedvogel
  - MPRG Behavioural Genomics, Max Planck Institute for Evolutionary Biology, 24306, Plön, Germany
  - Institute of Avian Research "Vogelwarte Helgoland", 26386, Wilhelmshaven, Germany
7
Spence JP, Zeng T, Mostafavi H, Pritchard JK. Scaling the Discrete-time Wright Fisher model to biobank-scale datasets. bioRxiv 2023:2023.05.19.541517. [PMID: 37293115 PMCID: PMC10245735 DOI: 10.1101/2023.05.19.541517] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
The Discrete-Time Wright Fisher (DTWF) model and its large population diffusion limit are central to population genetics. These models describe the forward-in-time evolution of the frequency of an allele in a population and can include the fundamental forces of genetic drift, mutation, and selection. Computing likelihoods under the diffusion process is feasible, but the diffusion approximation breaks down for large sample sizes or in the presence of strong selection. Unfortunately, existing methods for computing likelihoods under the DTWF model do not scale to current exome sequencing sample sizes in the hundreds of thousands. Here we present an algorithm that approximates the DTWF model with provably bounded error and runs in time linear in the size of the population. Our approach relies on two key observations about Binomial distributions. The first is that Binomial distributions are approximately sparse. The second is that Binomial distributions with similar success probabilities are extremely close as distributions, allowing us to approximate the DTWF Markov transition matrix as a very low rank matrix. Together, these observations enable matrix-vector multiplication in linear (as opposed to the usual quadratic) time. We prove similar properties for Hypergeometric distributions, enabling fast computation of likelihoods for subsamples of the population. We show theoretically and in practice that this approximation is highly accurate and can scale to population sizes in the billions, paving the way for rigorous biobank-scale population genetic inference. Finally, we use our results to estimate how increasing sample sizes will improve the estimation of selection coefficients acting on loss-of-function variants. We find that increasing sample sizes beyond existing large exome sequencing cohorts will provide essentially no additional information except for genes with the most extreme fitness effects.
Affiliation(s)
- Tony Zeng
  - Department of Genetics, Stanford University
- Jonathan K. Pritchard
  - Department of Genetics, Stanford University
  - Department of Biology, Stanford University
8
Palahí I Torres A, Höök L, Näsvall K, Shipilina D, Wiklund C, Vila R, Pruisscher P, Backström N. The fine-scale recombination rate variation and associations with genomic features in a butterfly. Genome Res 2023; 33:810-823. [PMID: 37308293 PMCID: PMC10317125 DOI: 10.1101/gr.277414.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Accepted: 05/03/2023] [Indexed: 06/14/2023]
Abstract
Recombination is a key molecular mechanism that has profound implications on both micro- and macroevolutionary processes. However, the determinants of recombination rate variation in holocentric organisms are poorly understood, in particular in Lepidoptera (moths and butterflies). The wood white butterfly (Leptidea sinapis) shows considerable intraspecific variation in chromosome numbers and is a suitable system for studying regional recombination rate variation and its potential molecular underpinnings. Here, we developed a large whole-genome resequencing data set from a population of wood whites to obtain high-resolution recombination maps using linkage disequilibrium information. The analyses revealed that larger chromosomes had a bimodal recombination landscape, potentially caused by interference between simultaneous chiasmata. The recombination rate was significantly lower in subtelomeric regions, with exceptions associated with segregating chromosome rearrangements, showing that fissions and fusions can have considerable effects on the recombination landscape. There was no association between the inferred recombination rate and base composition, supporting a limited influence of GC-biased gene conversion in butterflies. We found significant but variable associations between the recombination rate and the density of different classes of transposable elements, most notably a significant enrichment of short interspersed nucleotide elements in genomic regions with higher recombination rate. Finally, the analyses unveiled significant enrichment of genes involved in farnesyltranstransferase activity in recombination coldspots, potentially indicating that expression of transferases can inhibit formation of chiasmata during meiotic division. Our results provide novel information about recombination rate variation in holocentric organisms and have particular implications for forthcoming research in population genetics, molecular/genome evolution, and speciation.
Affiliation(s)
- Aleix Palahí I Torres
  - Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, SE-752 36 Uppsala, Sweden
- Lars Höök
  - Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, SE-752 36 Uppsala, Sweden
- Karin Näsvall
  - Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, SE-752 36 Uppsala, Sweden
- Daria Shipilina
  - Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, SE-752 36 Uppsala, Sweden
- Christer Wiklund
  - Department of Zoology: Division of Ecology, Stockholm University, SE-106 91 Stockholm, Sweden
- Roger Vila
  - Butterfly Diversity and Evolution Lab, Institut de Biologia Evolutiva (CSIC-UPF), 08003 Barcelona, Spain
- Peter Pruisscher
  - Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, SE-752 36 Uppsala, Sweden
- Niclas Backström
  - Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, SE-752 36 Uppsala, Sweden
9
Krishnan S, DeMaere MZ, Beck D, Ostrowski M, Seymour JR, Darling AE. Rhometa: Population recombination rate estimation from metagenomic read datasets. PLoS Genet 2023; 19:e1010683. [PMID: 36972309 PMCID: PMC10079220 DOI: 10.1371/journal.pgen.1010683] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2022] [Revised: 04/06/2023] [Accepted: 02/27/2023] [Indexed: 03/29/2023] Open
Abstract
Prokaryotic evolution is influenced by the exchange of genetic information between species through a process referred to as recombination. The rate of recombination is a useful measure for the adaptive capacity of a prokaryotic population. We introduce Rhometa (https://github.com/sid-krish/Rhometa), a new software package to determine recombination rates from shotgun sequencing reads of metagenomes. It extends the composite likelihood approach for population recombination rate estimation and enables the analysis of modern short-read datasets. We evaluated Rhometa over a broad range of sequencing depths and complexities, using simulated and real experimental short-read data aligned to external reference genomes. Rhometa offers a comprehensive solution for determining population recombination rates from contemporary metagenomic read datasets. Rhometa extends the capabilities of conventional sequence-based composite likelihood population recombination rate estimators to include modern aligned metagenomic read datasets with diverse sequencing depths, thereby enabling the effective application of these techniques and their high accuracy rates to the field of metagenomics. Using simulated datasets, we show that our method performs well, with its accuracy improving with increasing numbers of genomes. Rhometa was validated on a real S. pneumoniae transformation experiment, where we show that it obtains plausible estimates of the rate of recombination. Finally, the program was also run on ocean surface water metagenomic datasets, through which we demonstrate that the program works on uncultured metagenomic datasets.
Affiliation(s)
- Sidaswar Krishnan
  - Climate Change Cluster, Faculty of Science, University of Technology Sydney, Sydney, NSW, Australia
- Matthew Z. DeMaere
  - Australian Institute for Microbiology & Infection, University of Technology Sydney, Sydney, NSW, Australia
- Dominik Beck
  - Centre for Health Technologies and the School of Biomedical Engineering, University of Technology Sydney, Sydney, NSW, Australia
- Martin Ostrowski
  - Climate Change Cluster, Faculty of Science, University of Technology Sydney, Sydney, NSW, Australia
- Justin R. Seymour
  - Climate Change Cluster, Faculty of Science, University of Technology Sydney, Sydney, NSW, Australia
- Aaron E. Darling
  - Australian Institute for Microbiology & Infection, University of Technology Sydney, Sydney, NSW, Australia
  - Illumina Australia Pty Ltd, Ultimo, NSW, Australia
10
Samuk K, Noor MAF. Gene flow biases population genetic inference of recombination rate. G3 Genes|Genomes|Genetics 2022; 12:6698695. [PMID: 36103705 PMCID: PMC9635666 DOI: 10.1093/g3journal/jkac236] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/27/2021] [Accepted: 08/30/2022] [Indexed: 11/21/2022]
Abstract
Accurate estimates of the rate of recombination are key to understanding a host of evolutionary processes as well as the evolution of the recombination rate itself. Model-based population genetic methods that infer recombination rates from patterns of linkage disequilibrium in the genome have become a popular method to estimate rates of recombination. However, these linkage disequilibrium-based methods make a variety of simplifying assumptions about the populations of interest that are often not met in natural populations. One such assumption is the absence of gene flow from other populations. Here, we use forward-time population genetic simulations of isolation-with-migration scenarios to explore how gene flow affects the accuracy of linkage disequilibrium-based estimators of recombination rate. We find that moderate levels of gene flow can result in either the overestimation or underestimation of recombination rates by up to 20–50% depending on the timing of divergence. We also find that these biases can affect the detection of interpopulation differences in recombination rate, causing both false positives and false negatives depending on the scenario. We discuss future possibilities for mitigating these biases and recommend that investigators exercise caution and confirm that their study populations meet assumptions before deploying these methods.
Affiliation(s)
- Kieran Samuk
- Department of Biology, Duke University, Durham, NC 27708, USA
- Department of Evolution, Ecology, and Organismal Biology, The University of California, Riverside, Riverside, CA 92521, USA

11
Setter D, Ebdon S, Jackson B, Lohse K. Estimating the rates of crossover and gene conversion from individual genomes. Genetics 2022; 222:6623412. [PMID: 35771626 PMCID: PMC9434185 DOI: 10.1093/genetics/iyac100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 06/01/2022] [Indexed: 11/14/2022]
Abstract
Recombination can occur either as a result of crossover or gene conversion events. Population genetic methods for inferring the rate of recombination from patterns of linkage disequilibrium generally assume a simple model of recombination that only involves crossover events and ignore gene conversion. However, distinguishing the 2 processes is not only necessary for a complete description of recombination, but also essential for understanding the evolutionary consequences of inversions and other genomic partitions in which crossover (but not gene conversion) is reduced. We present heRho, a simple composite likelihood scheme for coestimating the rate of crossover and gene conversion from individual diploid genomes. The method is based on analytic results for the distance-dependent probability of heterozygous and homozygous states at 2 loci. We apply heRho to simulations and data from the house mouse Mus musculus castaneus, a well-studied model. Our analyses show (1) that the rates of crossover and gene conversion can be accurately coestimated at the level of individual chromosomes and (2) that previous estimates of the population scaled rate of recombination ρ = 4Nₑr under a pure crossover model are likely biased.
Affiliation(s)
- Derek Setter
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK
- Sam Ebdon
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK
- Ben Jackson
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK
- Konrad Lohse
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, EH9 3FL, UK

12
Chen DS, Clark AG, Wolfner MF. Octopaminergic/tyraminergic Tdc2 neurons regulate biased sperm usage in female Drosophila melanogaster. Genetics 2022; 221:6613932. [PMID: 35736370 DOI: 10.1093/genetics/iyac097] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/25/2021] [Accepted: 06/04/2022] [Indexed: 11/14/2022]
Abstract
In polyandrous internally fertilizing species, a multiply-mated female can use stored sperm from different males in a biased manner to fertilize her eggs. The female's ability to assess sperm quality and compatibility is essential for her reproductive success, and represents an important aspect of postcopulatory sexual selection. In Drosophila melanogaster, previous studies demonstrated that the female nervous system plays an active role in influencing progeny paternity proportion, and suggested a role for octopaminergic/tyraminergic Tdc2 neurons in this process. Here, we report that inhibiting Tdc2 neuronal activity causes females to produce a higher-than-normal proportion of first-male progeny. This difference is not due to differences in sperm storage or release, but instead is attributable to the suppression of second-male sperm usage bias that normally occurs in control females. We further show that a subset of Tdc2 neurons innervating the female reproductive tract is largely responsible for the progeny proportion phenotype that is observed when Tdc2 neurons are inhibited globally. On the contrary, overactivation of Tdc2 neurons does not further affect sperm storage and release or progeny proportion. These results suggest that octopaminergic/tyraminergic signaling allows a multiply-mated female to bias sperm usage, and identify a new role for the female nervous system in postcopulatory sexual selection.
Affiliation(s)
- Dawn S Chen
- Department of Molecular Biology and Genetics, Cornell University, Ithaca NY 14853, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca NY 14853, USA
- Mariana F Wolfner
- Department of Molecular Biology and Genetics, Cornell University, Ithaca NY 14853, USA

13
Biddanda A, Steinrücken M, Novembre J. Properties of Two-Locus Genealogies and Linkage Disequilibrium in Temporally Structured Samples. Genetics 2022; 221:6549526. [PMID: 35294015 PMCID: PMC9245597 DOI: 10.1093/genetics/iyac038] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/10/2022] [Accepted: 02/06/2022] [Indexed: 11/13/2022]
Abstract
Archaeogenetics has been revolutionary, revealing insights into demographic history and recent positive selection. However, most studies to date have ignored the non-random association of genetic variants at different loci (i.e., linkage disequilibrium, LD). This may be in part because basic properties of LD in samples from different times are still not well understood. Here, we derive several results for summary statistics of haplotypic variation under a model with time-stratified sampling: 1) The correlation between the number of pairwise differences observed between time-staggered samples (πΔt) in models with and without strict population continuity; 2) The product of the LD coefficient, D, between ancient and modern samples, which is a measure of haplotypic similarity between modern and ancient samples; and 3) The expected switch rate in the Li and Stephens haplotype copying model. The latter has implications for genotype imputation and phasing in ancient samples with modern reference panels. Overall, these results provide a characterization of how haplotype patterns are affected by sample age, recombination rates, and population sizes. We expect these results will help guide the interpretation and analysis of haplotype data from ancient and modern samples.
Affiliation(s)
- Arjun Biddanda
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Matthias Steinrücken
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- John Novembre
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA

14
Friedlander E, Steinrücken M. A numerical framework for genetic hitchhiking in populations of variable size. Genetics 2022; 220:6526396. [PMID: 35143667 PMCID: PMC8893261 DOI: 10.1093/genetics/iyac012] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/31/2021] [Accepted: 12/27/2021] [Indexed: 11/13/2022]
Abstract
Natural selection on beneficial or deleterious alleles results in an increase or decrease, respectively, of their frequency within the population. Due to chromosomal linkage, the dynamics of the selected site affect the genetic variation at nearby neutral loci in a process commonly referred to as genetic hitchhiking. Changes in population size, however, can yield patterns in genomic data that mimic the effects of selection. Accurately modeling these dynamics is thus crucial to understanding how selection and past population size changes impact observed patterns of genetic variation. Here, we model the evolution of haplotype frequencies with the Wright-Fisher diffusion to study the impact of selection on linked neutral variation. Explicit solutions are not known for the dynamics of this diffusion when selection and recombination act simultaneously. Thus, we present a method for numerically evaluating the Wright-Fisher diffusion dynamics of 2 linked loci separated by a certain recombination distance when selection is acting. We can account for arbitrary population size histories explicitly using this approach. A key step in the method is to express the moments of the associated transition density, or sampling probabilities, as solutions to ordinary differential equations. Numerically solving these differential equations relies on a novel accurate and numerically efficient technique to estimate higher order moments from lower order moments. We demonstrate how this numerical framework can be used to quantify the reduction and recovery of genetic diversity around a selected locus over time and elucidate distortions in the site-frequency-spectra of neutral variation linked to loci under selection in various demographic settings. The method can be readily extended to more general modes of selection and applied in likelihood frameworks to detect loci under selection and infer the strength of the selective pressure.
Affiliation(s)
- Eric Friedlander
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA; Department of Mathematics, Saint Norbert College, Green Bay, WI 54115, USA
- Matthias Steinrücken
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA; Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA; Corresponding author: Department of Ecology & Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL 60637, USA.

15
Fine human genetic map based on UK10K data set. Hum Genet 2022; 141:273-281. [DOI: 10.1007/s00439-021-02415-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2021] [Accepted: 12/03/2021] [Indexed: 11/04/2022]

16
Neupane S, Xu S. Adaptive Divergence of Meiotic Recombination Rate in Ecological Speciation. Genome Biol Evol 2021; 12:1869-1881. [PMID: 32857858 PMCID: PMC7594247 DOI: 10.1093/gbe/evaa182] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 08/24/2020] [Indexed: 02/06/2023]
Abstract
Theories predict that directional selection during adaptation to a novel habitat results in elevated meiotic recombination rate. Yet the lack of population-level recombination rate data leaves this hypothesis untested in natural populations. Here, we examine the population-level recombination rate variation in two incipient ecological species, the microcrustacean Daphnia pulex (an ephemeral-pond species) and Daphnia pulicaria (a permanent-lake species). The divergence of D. pulicaria from D. pulex involved habitat shifts from pond to lake habitats as well as strong local adaptation due to directional selection. Using a novel single-sperm genotyping approach, we estimated the male-specific recombination rate of two linkage groups in multiple populations of each species in common garden experiments and identified a significantly elevated recombination rate in D. pulicaria. Most importantly, population genetic analyses show that the divergence in recombination rate between these two species is most likely due to divergent selection in distinct ecological habitats rather than neutral evolution.
Affiliation(s)
- Sen Xu
- Department of Biology, University of Texas at Arlington

17
van Eeden G, Uren C, Möller M, Henn BM. Inferring recombination patterns in African populations. Hum Mol Genet 2021; 30:R11-R16. [PMID: 33445180 DOI: 10.1093/hmg/ddab020] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/08/2020] [Revised: 01/04/2021] [Accepted: 01/06/2021] [Indexed: 11/14/2022]
Abstract
Although several high-resolution recombination maps exist for European-descent populations, the recombination landscape of African populations remains relatively understudied. Given that there is high genetic divergence among groups in Africa, it is possible that recombination hotspots also diverge significantly. Both limitations and opportunities exist for developing recombination maps for these populations. In this review, we discuss various recombination inference methods, and the strengths and weaknesses of these methods in analyzing recombination in African-descent populations. Furthermore, we provide a decision tree and recommendations for which inference method to use in various research contexts. Establishing an appropriate methodology for recombination rate inference in a particular study will improve the accuracy of various downstream analyses including but not limited to local ancestry inference, haplotype phasing, fine-mapping of GWAS loci and genome assemblies.
Affiliation(s)
- Gerald van Eeden
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7505, South Africa
- Caitlin Uren
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7505, South Africa; Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch 7602, South Africa
- Marlo Möller
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7505, South Africa; Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch 7602, South Africa
- Brenna M Henn
- Department of Anthropology, Center for Population Biology and the Genome Center, University of California Davis, Davis, CA 95616, USA

18
Hassan S, Surakka I, Taskinen MR, Salomaa V, Palotie A, Wessman M, Tukiainen T, Pirinen M, Palta P, Ripatti S. High-resolution population-specific recombination rates and their effect on phasing and genotype imputation. Eur J Hum Genet 2020; 29:615-624. [PMID: 33249422 PMCID: PMC8114909 DOI: 10.1038/s41431-020-00768-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/27/2020] [Revised: 10/01/2020] [Accepted: 10/20/2020] [Indexed: 11/24/2022]
Abstract
Previous research has shown that using population-specific reference panels has a significant effect on downstream population genomic analyses like haplotype phasing, genotype imputation, and association, especially in the context of population isolates. Here, we developed a high-resolution recombination rate mapping at 10 and 50 kb scale using high-coverage (20–30×) whole-genome sequenced data of 55 family trios from Finland and compared it to recombination rates of non-Finnish Europeans (NFE). We tested the downstream effects of the population-specific recombination rates in statistical phasing and genotype imputation in Finns as compared to the same analyses performed by using the NFE-based recombination rates. We found that Finnish recombination rates have a moderately high correlation (Spearman’s ρ = 0.67–0.79) with NFE, although on average (across all autosomal chromosomes), Finnish rates (2.268 ± 0.4209 cM/Mb) are 12–14% lower than NFE (2.641 ± 0.5032 cM/Mb). Finnish recombination map was found to have no significant effect in haplotype phasing accuracy (switch error rates ~2%) and average imputation concordance rates (97–98% for common, 92–96% for low frequency and 78–90% for rare variants). Our results suggest that haplotype phasing and genotype imputation mostly depend on population-specific contexts like appropriate reference panels and their sample size, but not on population-specific recombination maps. Even though recombination rate estimates had some differences between the Finnish and NFE populations, haplotyping and imputation had not been noticeably affected by the recombination map used. Therefore, the currently available HapMap recombination maps seem robust for population-specific phasing and imputation pipelines, even in the context of relatively isolated populations like Finland.
Affiliation(s)
- Shabbeer Hassan
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
- Ida Surakka
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
- Marja-Riitta Taskinen
- Clinical and molecular metabolism, Research program unit, University of Helsinki, Helsinki, Finland
- Veikko Salomaa
- Finnish Institute for Health and Welfare, Helsinki, Finland
- Aarno Palotie
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland; Massachusetts General Hospital & Harvard Medical School, Boston, MA, USA; Broad Institute of the Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA
- Maija Wessman
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
- Taru Tukiainen
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland
- Matti Pirinen
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland; Department of Public Health, Faculty of Medicine, Clinicum, University of Helsinki, Helsinki, Finland; Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Priit Palta
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland; Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, Estonia
- Samuli Ripatti
- Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Helsinki, Finland; Broad Institute of the Massachusetts Institute of Technology and Harvard University, Cambridge, MA, USA; Department of Public Health, Faculty of Medicine, Clinicum, University of Helsinki, Helsinki, Finland

19
Ragsdale AP, Gravel S. Unbiased Estimation of Linkage Disequilibrium from Unphased Data. Mol Biol Evol 2020; 37:923-932. [PMID: 31697386 DOI: 10.1093/molbev/msz265] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/13/2022]
Abstract
Linkage disequilibrium (LD) is used to infer evolutionary history, to identify genomic regions under selection, and to dissect the relationship between genotype and phenotype. In each case, we require accurate estimates of LD statistics from sequencing data. Unphased data present a challenge because multilocus haplotypes cannot be inferred exactly. Widely used estimators for the common statistics r² and D² exhibit large and variable upward biases that complicate interpretation and comparison across cohorts. Here, we show how to find unbiased estimators for a wide range of two-locus statistics, including D², for both single and multiple randomly mating populations. These unbiased statistics are particularly well suited to estimate effective population sizes from unlinked loci in small populations. We develop a simple inference pipeline and use it to refine estimates of recent effective population sizes of the threatened Channel Island Fox populations.
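For readers unfamiliar with the statistics named in this abstract, the following is a minimal sketch of the textbook plug-in definitions of D, D² and r² for two biallelic loci, computed from phased haplotypes. It is not the unbiased estimator for unphased data that the abstract describes; the function name `ld_stats`, the 0/1 allele coding and the toy sample are assumptions made here for illustration only.

```python
from collections import Counter


def ld_stats(haplotypes):
    """Return (D, D^2, r^2) for two biallelic loci.

    `haplotypes` is a list of (a, b) tuples, one per phased haplotype,
    with alleles coded 0/1 (a hypothetical encoding for this sketch).
    These are naive plug-in estimates, not bias-corrected ones.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = counts[(1, 1)] / n                 # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                      # D = p_AB - p_A * p_B
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, d * d, r2


# Toy sample: 10 haplotypes, mostly carrying matching alleles at both loci.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
d, d2, r2 = ld_stats(sample)
print(d, d2, r2)
```

Because these plug-in values are computed from sample frequencies, they inherit the finite-sample bias the abstract refers to; the sketch only fixes notation.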
Affiliation(s)
- Aaron P Ragsdale
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada

20
Xue C, Rustagi N, Liu X, Raveendran M, Harris RA, Venkata MG, Rogers J, Yu F. Reduced meiotic recombination in rhesus macaques and the origin of the human recombination landscape. PLoS One 2020; 15:e0236285. [PMID: 32841250 PMCID: PMC7447010 DOI: 10.1371/journal.pone.0236285] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/29/2020] [Accepted: 07/01/2020] [Indexed: 11/18/2022]
Abstract
Characterizing meiotic recombination rates across the genomes of nonhuman primates is important for understanding the genetics of primate populations, performing genetic analyses of phenotypic variation and reconstructing the evolution of human recombination. Rhesus macaques (Macaca mulatta) are the most widely used nonhuman primates in biomedical research. We constructed a high-resolution genetic map of the rhesus genome based on whole genome sequence data from Indian-origin rhesus macaques. The genetic markers used were approximately 18 million SNPs, with marker density 6.93 per kb across the autosomes. We report that the genome-wide recombination rate in rhesus macaques is significantly lower than rates observed in apes or humans, while the distribution of recombination across the macaque genome is more uniform. These observations provide new comparative information regarding the evolution of recombination in primates.
Affiliation(s)
- Cheng Xue
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- * E-mail: (FY); (JR); (CX)
- Navin Rustagi
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Xiaoming Liu
- USF Genomics & College of Public Health, University of South Florida, Tampa, Florida, United States of America
- Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- R. Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- * E-mail: (FY); (JR); (CX)
- Fuli Yu
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- * E-mail: (FY); (JR); (CX)

21
Sun Y, Lu Z, Zhu X, Ma H. Genomic basis of homoploid hybrid speciation within chestnut trees. Nat Commun 2020; 11:3375. [PMID: 32632155 PMCID: PMC7338469 DOI: 10.1038/s41467-020-17111-w] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/26/2019] [Accepted: 06/09/2020] [Indexed: 12/30/2022]
Abstract
Hybridization can drive speciation. We examine the hypothesis that Castanea henryi var. omeiensis is an evolutionary lineage that originated from hybridization between two near-sympatric diploid taxa, C. henryi var. henryi and C. mollissima. We produce a high-quality genome assembly for mollissima and characterize evolutionary relationships among related chestnut taxa. Our results show that C. henryi var. omeiensis has a mosaic genome but has accumulated divergence in all 12 chromosomes. We observe positive correlation between admixture proportions and recombination rates across the genome. Candidate barrier genomic regions, which isolate var. henryi and mollissima, are re-assorted in the hybrid lineage. We further find that the putative barrier segments concentrate in genomic regions with less recombination, suggesting that interaction between natural selection and recombination shapes the evolution of hybrid genomes during hybrid speciation. This study highlights that reassortment of parental barriers is an important mechanism in generating biodiversity. Chinese chestnut is widely cultivated for nut production and harbors value as a genetic resource for restoration of American and European chestnut trees destroyed by chestnut blight. Here, the authors reveal the genomic basis of homoploid hybrid speciation within Castanea spp. and find the nonrandom distribution of reproductive barrier loci based on a high-quality reference genome.
Affiliation(s)
- Yongshuai Sun
- CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, 666303, Yunnan, China; Center for Plant Ecology, Core Botanical Gardens, Chinese Academy of Sciences, Mengla, 666303, Yunnan, China
- Zhiqiang Lu
- CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, 666303, Yunnan, China; Center for Plant Ecology, Core Botanical Gardens, Chinese Academy of Sciences, Mengla, 666303, Yunnan, China
- Xingfu Zhu
- CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, 666303, Yunnan, China
- Hui Ma
- CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, 666303, Yunnan, China

22
Zhou Y, Browning BL, Browning SR. Population-Specific Recombination Maps from Segments of Identity by Descent. Am J Hum Genet 2020; 107:137-148. [PMID: 32533945 PMCID: PMC7332656 DOI: 10.1016/j.ajhg.2020.05.016] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 01/28/2020] [Accepted: 05/20/2020] [Indexed: 12/26/2022]
Abstract
Recombination rates vary significantly across the genome, and estimates of recombination rates are needed for downstream analyses such as haplotype phasing and genotype imputation. Existing methods for recombination rate estimation are limited by insufficient amounts of informative genetic data or by high computational cost. We present a method and software, called IBDrecomb, for using segments of identity by descent to infer recombination rates. IBDrecomb can be applied to sequenced population cohorts to obtain high-resolution, population-specific recombination maps. In simulated admixed data, IBDrecomb obtains higher accuracy than admixture-based estimation of recombination rates. When applied to 2,500 simulated individuals, IBDrecomb obtains similar accuracy to a linkage-disequilibrium (LD)-based method applied to 96 individuals (the largest number for which computation is tractable). Compared to LD-based maps, our IBD-based maps have the advantage of estimating recombination rates in the recent past rather than the distant past. We used IBDrecomb to generate new recombination maps for European Americans and for African Americans from TOPMed sequence data from the Framingham Heart Study (1,626 unrelated individuals) and the Jackson Heart Study (2,046 unrelated individuals), and we compare them to LD-based, admixture-based, and family-based maps.
Affiliation(s)
- Ying Zhou
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Brian L Browning
- Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Sharon R Browning
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA

23
From molecules to populations: appreciating and estimating recombination rate variation. Nat Rev Genet 2020; 21:476-492. [DOI: 10.1038/s41576-020-0240-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Accepted: 04/15/2020] [Indexed: 02/07/2023]

24
Barroso GV, Puzović N, Dutheil JY. Inference of recombination maps from a single pair of genomes and its application to ancient samples. PLoS Genet 2019; 15:e1008449. [PMID: 31725722 PMCID: PMC6879166 DOI: 10.1371/journal.pgen.1008449] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/17/2019] [Revised: 11/26/2019] [Accepted: 09/30/2019] [Indexed: 12/11/2022]
Abstract
Understanding the causes and consequences of recombination landscape evolution is a fundamental goal in genetics that requires recombination maps from across the tree of life. Such maps can be obtained from population genomic datasets, but require large sample sizes. Alternative methods are therefore necessary to research organisms where such datasets cannot be generated easily, such as non-model or ancient species. Here we extend the sequentially Markovian coalescent model to jointly infer demography and the spatial variation in recombination rate. Using extensive simulations and sequence data from humans, fruit-flies and a fungal pathogen, we demonstrate that iSMC accurately infers recombination maps under a wide range of scenarios and, remarkably, even from a single pair of unphased genomes. We exploit this possibility and reconstruct the recombination maps of ancient hominins. We report that the ancient and modern maps are correlated in a manner that reflects the established phylogeny of Neanderthals, Denisovans, and modern human populations.
Affiliation(s)
- Gustavo V. Barroso
- Max Planck Institute for Evolutionary Biology, Department of Evolutionary Genetics, August-Thienemann-Straße, Plön, Germany
- Nataša Puzović
- Max Planck Institute for Evolutionary Biology, Department of Evolutionary Genetics, August-Thienemann-Straße, Plön, Germany
- Julien Y. Dutheil
- Max Planck Institute for Evolutionary Biology, Department of Evolutionary Genetics, August-Thienemann-Straße, Plön, Germany

25
Spence JP, Song YS. Inference and analysis of population-specific fine-scale recombination maps across 26 diverse human populations. SCIENCE ADVANCES 2019; 5:eaaw9206. [PMID: 31681842 PMCID: PMC6810367 DOI: 10.1126/sciadv.aaw9206] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 02/05/2019] [Accepted: 09/13/2019] [Indexed: 05/28/2023]
Abstract
Fine-scale rates of meiotic recombination vary by orders of magnitude across the genome and differ between species and even populations. Studying cross-population differences has been stymied by the confounding effects of demographic history. To address this problem, we developed a demography-aware method to infer fine-scale recombination rates and applied it to 26 diverse human populations, inferring population-specific recombination maps. These maps recapitulate many aspects of the history of these populations including signatures of the trans-Atlantic slave trade and the Iberian colonization of the Americas. We also investigated modulators of the local recombination rate, finding further evidence that Polycomb group proteins and the trimethylation of H3K27 elevate recombination rates. Further differences in the recombination landscape across the genome and between populations are driven by variation in the gene that encodes the DNA binding protein PRDM9, and we quantify the weak effect of meiotic drive acting to remove its binding sites.
Affiliation(s)
- Jeffrey P. Spence
- Graduate Group in Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Yun S. Song
- Computer Science Division and Department of Statistics, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
|
26
|
Ragsdale AP, Gravel S. Models of archaic admixture and recent history from two-locus statistics. PLoS Genet 2019; 15:e1008204. [PMID: 31181058 PMCID: PMC6586359 DOI: 10.1371/journal.pgen.1008204] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 01/08/2019] [Revised: 06/20/2019] [Accepted: 05/17/2019] [Indexed: 11/18/2022] Open
Abstract
We learn about population history and underlying evolutionary biology through patterns of genetic polymorphism. Many approaches to reconstruct evolutionary histories focus on a limited number of informative statistics describing distributions of allele frequencies or patterns of linkage disequilibrium. We show that many commonly used statistics are part of a broad family of two-locus moments whose expectation can be computed jointly and rapidly under a wide range of scenarios, including complex multi-population demographies with continuous migration and admixture events. A full inspection of these statistics reveals that widely used models of human history fail to predict simple patterns of linkage disequilibrium. To jointly capture the information contained in classical and novel statistics, we implemented a tractable likelihood-based inference framework for demographic history. Using this approach, we show that human evolutionary models that include archaic admixture in Africa, Asia, and Europe provide a much better description of patterns of genetic diversity across the human genome. We estimate that an unidentified, deeply diverged population admixed with modern humans within Africa both before and after the split of African and Eurasian populations, contributing 4–8% genetic ancestry to individuals in world-wide populations.
Affiliation(s)
- Aaron P Ragsdale
- Department of Human Genetics, McGill University, Montreal, QC, Canada
- Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada
|
27
|
Hermann P, Heissl A, Tiemann-Boege I, Futschik A. LDJump: Estimating variable recombination rates from population genetic data. Mol Ecol Resour 2019; 19:623-638. [PMID: 30666785 PMCID: PMC6519033 DOI: 10.1111/1755-0998.12994] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/22/2018] [Revised: 12/13/2018] [Accepted: 01/11/2019] [Indexed: 11/27/2022]
Abstract
As recombination plays an important role in evolution, its estimation and the identification of hotspot positions is of considerable interest. We propose a novel approach for estimating population recombination rates based on genotyping or sequence data that involves a sequential multiscale change point estimator. Our method also permits demography to be taken into account. It uses several summary statistics within a regression model fitted on suitable scenarios. Our proposed method is accurate, computationally fast, and provides a parsimonious solution by ensuring a type I error control against too many changes in the recombination rate. An application to human genome data suggests a good congruence between our estimated and experimentally identified hotspots. Our method is implemented in the R‐package LDJump, which is freely available at https://github.com/PhHermann/LDJump.
Affiliation(s)
- Philipp Hermann
- Department of Applied Statistics, Johannes Kepler University Linz, Linz, Austria
- Angelika Heissl
- Institute of Biophysics, Johannes Kepler University Linz, Linz, Austria
- Andreas Futschik
- Department of Applied Statistics, Johannes Kepler University Linz, Linz, Austria
|
28
|
Chan J, Perrone V, Spence JP, Jenkins PA, Mathieson S, Song YS. A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks. Adv Neural Inf Process Syst 2018; 31:8594-8605. [PMID: 33244210 PMCID: PMC7687905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art.
|
29
|
Spence JP, Steinrücken M, Terhorst J, Song YS. Inference of population history using coalescent HMMs: review and outlook. Curr Opin Genet Dev 2018; 53:70-76. [PMID: 30056275 PMCID: PMC6296859 DOI: 10.1016/j.gde.2018.07.002] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 05/16/2018] [Revised: 07/08/2018] [Accepted: 07/09/2018] [Indexed: 01/02/2023]
Abstract
Studying how diverse human populations are related is of historical and anthropological interest, in addition to providing a realistic null model for testing for signatures of natural selection or disease associations. Furthermore, understanding the demographic histories of other species is playing an increasingly important role in conservation genetics. A number of statistical methods have been developed to infer population demographic histories using whole-genome sequence data, with recent advances focusing on allowing for more flexible modeling choices, scaling to larger data sets, and increasing statistical power. Here we review coalescent hidden Markov models, a powerful class of population genetic inference methods that can utilize linkage disequilibrium information effectively. We highlight recent advances, give advice for practitioners, point out potential pitfalls, and present possible future research directions.
Affiliation(s)
- Jeffrey P Spence
- Computational Biology Graduate Group, University of California, Berkeley, United States
- Yun S Song
- Computer Science Division and Department of Statistics, University of California, Berkeley, United States; Chan Zuckerberg Biohub, San Francisco, United States
|
30
|
Sainudiin R, Véber A. Full likelihood inference from the site frequency spectrum based on the optimal tree resolution. Theor Popul Biol 2018; 124:1-15. [DOI: 10.1016/j.tpb.2018.07.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/26/2017] [Revised: 06/11/2018] [Accepted: 07/09/2018] [Indexed: 10/28/2022]
|
31
|
Schumer M, Xu C, Powell DL, Durvasula A, Skov L, Holland C, Blazier JC, Sankararaman S, Andolfatto P, Rosenthal GG, Przeworski M. Natural selection interacts with recombination to shape the evolution of hybrid genomes. Science 2018; 360:656-660. [PMID: 29674434 PMCID: PMC6069607 DOI: 10.1126/science.aar3684] [Citation(s) in RCA: 219] [Impact Index Per Article: 36.5] [Received: 10/31/2017] [Accepted: 03/23/2018] [Indexed: 12/29/2022]
Abstract
To investigate the consequences of hybridization between species, we studied three replicate hybrid populations that formed naturally between two swordtail fish species, estimating their fine-scale genetic map and inferring ancestry along the genomes of 690 individuals. In all three populations, ancestry from the "minor" parental species is more common in regions of high recombination and where there is linkage to fewer putative targets of selection. The same patterns are apparent in a reanalysis of human and archaic admixture. These results support models in which ancestry from the minor parental species is more likely to persist when rapidly uncoupled from alleles that are deleterious in hybrids. Our analyses further indicate that selection on swordtail hybrids stems predominantly from deleterious combinations of epistatically interacting alleles.
Affiliation(s)
- Molly Schumer
- Howard Hughes Medical Institute (HHMI), Boston, MA, USA
- Harvard Society of Fellows, Harvard University, Cambridge, MA, USA
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Centro de Investigaciones Científicas de las Huastecas "Aguazarca," Calnali, Hidalgo, Mexico
- Chenling Xu
- Center for Computational Biology, University of California at Berkeley, Berkeley, CA, USA
- Daniel L Powell
- Centro de Investigaciones Científicas de las Huastecas "Aguazarca," Calnali, Hidalgo, Mexico
- Department of Biology, Texas A&M University, College Station, TX, USA
- Arun Durvasula
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Laurits Skov
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Chris Holland
- Centro de Investigaciones Científicas de las Huastecas "Aguazarca," Calnali, Hidalgo, Mexico
- Department of Biology, Texas A&M University, College Station, TX, USA
- John C Blazier
- Department of Biology, Texas A&M University, College Station, TX, USA
- Texas A&M Institute for Genome Sciences and Society, College Station, TX, USA
- Sriram Sankararaman
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Peter Andolfatto
- Department of Ecology and Evolutionary Biology and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Gil G Rosenthal
- Centro de Investigaciones Científicas de las Huastecas "Aguazarca," Calnali, Hidalgo, Mexico
- Department of Biology, Texas A&M University, College Station, TX, USA
- Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
|
32
|
Dapper AL, Payseur BA. Effects of Demographic History on the Detection of Recombination Hotspots from Linkage Disequilibrium. Mol Biol Evol 2018; 35:335-353. [PMID: 29045724 PMCID: PMC5850621 DOI: 10.1093/molbev/msx272] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 12/28/2022] Open
Abstract
In some species, meiotic recombination is concentrated in small genomic regions. These "recombination hotspots" leave signatures in fine-scale patterns of linkage disequilibrium, raising the prospect that the genomic landscape of hotspots can be characterized from sequence variation. This approach has led to the inference that hotspots evolve rapidly in some species, but are conserved in others. Historic demographic events, such as population bottlenecks, are known to affect patterns of linkage disequilibrium across the genome, violating population genetic assumptions of this approach. Although such events are prevalent, demographic history is generally ignored when making inferences about the evolution of recombination hotspots. To determine the effect of demography on the detection of recombination hotspots, we use the coalescent to simulate haplotypes with a known recombination landscape. We measure the ability of popular linkage disequilibrium-based programs to detect hotspots across a range of demographic histories, including population bottlenecks, hidden population structure, population expansions, and population contractions. We find that demographic events have the potential to greatly reduce the power and increase the false positive rate of hotspot discovery. Neither the power nor the false positive rate of hotspot detection can be predicted without also knowing the demographic history of the sample. Our results suggest that ignoring demographic history likely overestimates the power to detect hotspots and therefore underestimates the degree of hotspot sharing between species. We suggest strategies for incorporating demographic history into population genetic inferences about recombination hotspots.
Affiliation(s)
- Amy L Dapper
- Laboratory of Genetics, University of Wisconsin, Madison, WI
- Bret A Payseur
- Laboratory of Genetics, University of Wisconsin, Madison, WI
|
33
|
Ragsdale AP, Gutenkunst RN. Inferring Demographic History Using Two-Locus Statistics. Genetics 2017; 206:1037-1048. [PMID: 28413158 PMCID: PMC5499162 DOI: 10.1534/genetics.117.201251] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/14/2017] [Accepted: 04/07/2017] [Indexed: 11/18/2022] Open
Abstract
Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference.
Affiliation(s)
- Aaron P Ragsdale
- Program in Applied Mathematics, University of Arizona, Tucson, Arizona 85721
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona 85721
|
34
|
A coalescent dual process for a Wright-Fisher diffusion with recombination and its application to haplotype partitioning. Theor Popul Biol 2016; 112:126-138. [PMID: 27594345 DOI: 10.1016/j.tpb.2016.08.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/14/2016] [Revised: 08/19/2016] [Accepted: 08/25/2016] [Indexed: 11/24/2022]
Abstract
Duality plays an important role in population genetics. It can relate results from forwards-in-time models of allele frequency evolution with those of backwards-in-time genealogical models; a well known example is the duality between the Wright-Fisher diffusion for genetic drift and its genealogical counterpart, the coalescent. There have been a number of articles extending this relationship to include other evolutionary processes such as mutation and selection, but little has been explored for models also incorporating crossover recombination. Here, we derive from first principles a new genealogical process which is dual to a Wright-Fisher diffusion model of drift, mutation, and recombination. The process is reminiscent of the ancestral recombination graph, a widely-used multilocus genealogical model, but here ancestral lineages are typed and transition rates are regarded as being conditioned on an observed configuration at the leaves of the genealogy. Our approach is based on expressing a putative duality relationship between two models via their infinitesimal generators, and then seeking an appropriate test function to ensure the validity of the duality equation. This approach is quite general, and we use it to find dualities for several important variants, including both a discrete L-locus model of a gene and a continuous model in which mutation and recombination events are scattered along the gene according to continuous distributions. As an application of our results, we derive a series expansion for the transition function of the diffusion. Finally, we study in further detail the case in which mutation is absent. Then the dual process describes the dispersal of ancestral genetic material across the ancestors of a sample. The stationary distribution of this process is of particular interest; we show how duality relates this distribution to haplotype fixation probabilities. We develop an efficient method for computing such probabilities in multilocus models.
|