1
Wilder AP, Steiner CC, Hendricks S, Haller BC, Kim C, Korody ML, Ryder OA. Genetic load and viability of a future restored northern white rhino population. Evol Appl 2024; 17:e13683. [PMID: 38617823 PMCID: PMC11009427 DOI: 10.1111/eva.13683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 03/04/2024] [Accepted: 03/06/2024] [Indexed: 04/16/2024]
Abstract
As biodiversity loss outpaces recovery, conservationists are increasingly turning to novel tools for preventing extinction, including cloning and in vitro gametogenesis of biobanked cells. However, restoration of populations can be hindered by low genetic diversity and deleterious genetic load. The persistence of the northern white rhino (Ceratotherium simum cottoni) now depends on the cryopreserved cells of 12 individuals. These banked genomes have higher genetic diversity than southern white rhinos (C. s. simum), a sister subspecies that successfully recovered from a severe bottleneck, but the potential impact of genetic load is unknown. We estimated how demographic history has shaped genome-wide genetic load in nine northern and 13 southern white rhinos. The bottleneck left southern white rhinos with more fixed and homozygous deleterious alleles and longer runs of homozygosity, whereas northern white rhinos retained more deleterious alleles masked in heterozygosity. To gauge the impact of genetic load on the fitness of a northern white rhino population restored from biobanked cells, we simulated recovery using fitness of southern white rhinos as a benchmark for a viable population. Unlike traditional restoration, cell-derived founders can be reintroduced in subsequent generations to boost lost genetic diversity and relieve inbreeding. In simulations with repeated reintroduction of founders into a restored population, the fitness cost of genetic load remained lower than that borne by southern white rhinos. Without reintroductions, rapid growth of the restored population (>20-30% per generation) would be needed to maintain comparable fitness. Our results suggest that inbreeding depression from genetic load is not necessarily a barrier to recovery of the northern white rhino and demonstrate how restoration from biobanked cells relieves some constraints of conventional restoration from a limited founder pool. 
Established conservation methods that protect healthy populations will remain paramount, but emerging technologies hold promise to bolster these tools to combat the extinction crisis.
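The runs of homozygosity mentioned in this abstract can be illustrated with a minimal sketch. It assumes genotypes coded as alternate-allele counts (0, 1, 2) at sorted positions, and it treats any heterozygous site as ending a run. The function name and the length threshold are hypothetical, and this is not the pipeline used in the study; practical callers usually tolerate a few heterozygous calls and use sliding windows.

```python
def runs_of_homozygosity(positions, genotypes, min_length_bp):
    """Return (start, end) intervals of consecutive homozygous sites.

    positions: sorted base-pair coordinates of the genotyped sites.
    genotypes: alternate-allele counts per site (0 or 2 = homozygous, 1 = heterozygous).
    min_length_bp: hypothetical minimum span for a run to be reported.
    """
    runs = []
    start = None  # first position of the current run
    prev = None   # last homozygous position seen in the current run
    for pos, gt in zip(positions, genotypes):
        if gt != 1:
            if start is None:
                start = pos
            prev = pos
        else:
            # A heterozygous site closes the current run (simplifying assumption).
            if start is not None and prev - start >= min_length_bp:
                runs.append((start, prev))
            start = None
    if start is not None and prev - start >= min_length_bp:
        runs.append((start, prev))
    return runs


# Toy data: the first run is too short, the second spans 200 bp.
example = runs_of_homozygosity(
    [100, 200, 300, 400, 500, 600], [0, 0, 1, 2, 2, 2], min_length_bp=150
)
```

Summing the lengths of the returned intervals and dividing by the genome length gives a simple inbreeding-style summary, again under the simplifications stated above.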
Affiliation(s)
- Aryn P. Wilder
  - Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, California, USA
- Cynthia C. Steiner
  - Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, California, USA
- Sarah Hendricks
  - Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, California, USA
  - Institute for Interdisciplinary Data Sciences, University of Idaho, Moscow, Idaho, USA
- Chang Kim
  - University of California, Santa Cruz Genomics Institute, Santa Cruz, California, USA
  - Department of Neurological Surgery, University of California, San Francisco, California, USA
- Marisa L. Korody
  - Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, California, USA
- Oliver A. Ryder
  - Conservation Genetics, San Diego Zoo Wildlife Alliance, Escondido, California, USA
2
Kent TV, Schrider DR, Matute DR. Demographic history and the efficacy of selection in the globally invasive mosquito Aedes aegypti. bioRxiv 2024:2024.03.07.584008. [PMID: 38559089 PMCID: PMC10979846 DOI: 10.1101/2024.03.07.584008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Aedes aegypti is the main vector species of yellow fever, dengue, Zika and chikungunya. The species is originally from Africa but has experienced a spectacular expansion in its geographic range to a large swath of the world, the demographic effects of which have remained largely understudied. In this report, we examine whole-genome sequences from 6 countries in Africa, North America, and South America to investigate the demographic history of the spread of Ae. aegypti into the Americas and its impact on genomic diversity. In the Americas, we observe patterns of strong population structure consistent with relatively low (but probably non-zero) levels of gene flow but occasional long-range dispersal and/or recolonization events. We also find evidence that the colonization of the Americas has resulted in introduction bottlenecks. However, while each sampling location shows evidence of a past population contraction and subsequent recovery, our results suggest that the bottlenecks in America have led to a reduction in genetic diversity of only ~35% relative to African populations, and the American samples have retained high levels of genetic diversity (expected heterozygosity of ~0.02 at synonymous sites) and have experienced only a minor reduction in the efficacy of selection. These results evoke the image of an invasive species that has expanded its range with remarkable genetic resilience in the face of strong eradication pressure.
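To make the two diversity quantities in this abstract concrete, here is a minimal sketch of per-site expected heterozygosity and of a relative reduction between two populations. The function names and the toy numbers are illustrative assumptions, not values or code from the preprint, and the small-sample correction factor n/(n-1) is deliberately omitted.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one site: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_expected_heterozygosity(sites):
    """Average expected heterozygosity over a list of per-site allele-frequency lists."""
    return sum(expected_heterozygosity(freqs) for freqs in sites) / len(sites)


def relative_reduction(source_diversity, derived_diversity):
    """Fractional loss of diversity in a derived population relative to its source."""
    return 1.0 - derived_diversity / source_diversity


# Toy example with hypothetical diversities: 0.04 in the source, 0.026 in the derived population.
loss = relative_reduction(0.04, 0.026)  # roughly a 35% reduction
```

With these toy inputs the derived population retains about 65% of the source diversity, which is the kind of comparison the abstract summarizes as a ~35% reduction.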
Affiliation(s)
- Tyler V. Kent
  - Department of Biology, College of Arts and Sciences, University of North Carolina, Chapel Hill, NC, USA
  - Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Daniel R. Schrider
  - Department of Genetics, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Daniel R. Matute
  - Department of Biology, College of Arts and Sciences, University of North Carolina, Chapel Hill, NC, USA
3
vonHoldt BM, Stahler DR, Brzeski KE, Musiani M, Peterson R, Phillips M, Stephenson J, Laudon K, Meredith E, Vucetich JA, Leonard JA, Wayne RK. Demographic history shapes North American gray wolf genomic diversity and informs species' conservation. Mol Ecol 2024; 33:e17231. [PMID: 38054561 DOI: 10.1111/mec.17231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 11/19/2023] [Accepted: 11/21/2023] [Indexed: 12/07/2023]
Abstract
Effective population size estimates are critical information needed for evolutionary predictions and conservation decisions. This is particularly true for species with social factors that restrict access to breeding or experience repeated fluctuations in population size across generations. We investigated the genomic estimates of effective population size along with diversity, subdivision, and inbreeding from 162,109 minimally filtered and 81,595 statistically neutral and unlinked SNPs genotyped in 437 grey wolf samples from North America collected between 1986 and 2021. We found genetic structure across North America, represented by three distinct demographic histories of western, central, and eastern regions of the continent. Further, grey wolves in the northern Rocky Mountains have lower genomic diversity than wolves of the western Great Lakes and have declined over time. Effective population size estimates revealed the historical signatures of continental efforts of predator extermination, despite a quarter century of recovery efforts. We are the first to provide molecular estimates of effective population size across distinct grey wolf populations in North America, which ranged between Ne ~ 275 and 3050 since the early 1980s. We provide data that inform managers regarding the status and importance of effective population size estimates for grey wolf conservation, which are on average 5.2-9.3% of census estimates for this species. We show that while grey wolves fall above minimum effective population sizes needed to avoid extinction due to inbreeding depression in the short term, they are below sizes predicted to be necessary to avoid long-term risk of extinction.
Affiliation(s)
- Bridgett M vonHoldt
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, USA
- Daniel R Stahler
  - Yellowstone Center for Resources, Yellowstone National Park, Wyoming, USA
- Kristin E Brzeski
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, Michigan, USA
- Marco Musiani
  - Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, Bologna, Italy
- Rolf Peterson
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, Michigan, USA
- Kent Laudon
  - California Department of Fish and Wildlife, Northern Region, Redding, California, USA
- Erin Meredith
  - California Department of Fish and Wildlife, Wildlife Forensic Laboratory, Sacramento, California, USA
- John A Vucetich
  - College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, Michigan, USA
- Jennifer A Leonard
  - Conservation and Evolutionary Genetics Group, Estación Biológica de Doñana (EBD-CSIC), Seville, Spain
- Robert K Wayne
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, USA
4
Teixeira H, Le Corre M, Michon L, Nicoll MAC, Jaeger A, Nikolic N, Pinet P, Couzi FX, Humeau L. Past volcanic activity predisposes an endemic threatened seabird to negative anthropogenic impacts. Sci Rep 2024; 14:1960. [PMID: 38263429 PMCID: PMC10805739 DOI: 10.1038/s41598-024-52556-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 01/19/2024] [Indexed: 01/25/2024]
Abstract
Humans are regularly cited as the main driver of current biodiversity extinction, but the impact of historic volcanic activity is often overlooked. Pre-human evidence of wildlife abundance and diversity is essential for disentangling anthropogenic impacts from natural events. Réunion Island, with its intense and well-documented volcanic activity, endemic biodiversity, long history of isolation and recent human colonization, provides an opportunity to disentangle these processes. We track past demographic changes of a critically endangered seabird, the Mascarene petrel Pseudobulweria aterrima, using genome-wide SNPs. Coalescent modeling suggested that a large ancestral population underwent a substantial population decline in two distinct phases, ca. 125,000 and 37,000 years ago, coinciding with periods of major eruptions of Piton des Neiges. Subsequently, the ancestral population was fragmented into the two known colonies, ca. 1500 years ago, following eruptions of Piton de la Fournaise. In the last century, both colonies declined significantly due to anthropogenic activities, and although the species was initially considered extinct, it was rediscovered in the 1970s. Our findings suggest that the current conservation status of wildlife on volcanic islands should be firstly assessed as a legacy of historic volcanic activity, and thereafter by the increasing anthropogenic impacts, which may ultimately drive species towards extinction.
Affiliation(s)
- Helena Teixeira
  - UMR ENTROPIE (Université de La Réunion, IRD, CNRS, IFREMER, Université de Nouvelle-Calédonie), 15 Avenue René Cassin, CS 92003, 97744, Saint Denis Cedex 9, Ile de La Réunion, France
- Matthieu Le Corre
  - UMR ENTROPIE (Université de La Réunion, IRD, CNRS, IFREMER, Université de Nouvelle-Calédonie), 15 Avenue René Cassin, CS 92003, 97744, Saint Denis Cedex 9, Ile de La Réunion, France
- Laurent Michon
  - Université de La Réunion, Laboratoire Géosciences Réunion, 97744, Saint Denis, France
  - Université Paris Cité, Institut de physique du globe de Paris, CNRS, 75005, Paris, France
- Malcolm A C Nicoll
  - Institute of Zoology, Zoological Society of London, Regent's Park, London, NW1 4RY, UK
- Audrey Jaeger
  - UMR ENTROPIE (Université de La Réunion, IRD, CNRS, IFREMER, Université de Nouvelle-Calédonie), 15 Avenue René Cassin, CS 92003, 97744, Saint Denis Cedex 9, Ile de La Réunion, France
- Patrick Pinet
  - Parc National de La Réunion, Life+ Pétrels, 258 Rue de la République, 97431, Plaine des Palmistes, Réunion Island, France
- François-Xavier Couzi
  - Société d'Etudes Ornithologiques de La Réunion (SEOR), 13 ruelle des Orchidées, 97440, Saint André, Réunion Island, France
- Laurence Humeau
  - UMR PVBMT (Université de La Réunion, CIRAD), 15 Avenue René Cassin, CS 92003, 97744, Saint Denis Cedex 9, Ile de La Réunion, France
5
Lauterbur ME, Cavassim MIA, Gladstein AL, Gower G, Pope NS, Tsambos G, Adrion J, Belsare S, Biddanda A, Caudill V, Cury J, Echevarria I, Haller BC, Hasan AR, Huang X, Iasi LNM, Noskova E, Obsteter J, Pavinato VAC, Pearson A, Peede D, Perez MF, Rodrigues MF, Smith CCR, Spence JP, Teterina A, Tittes S, Unneberg P, Vazquez JM, Waples RK, Wohns AW, Wong Y, Baumdicker F, Cartwright RA, Gorjanc G, Gutenkunst RN, Kelleher J, Kern AD, Ragsdale AP, Ralph PL, Schrider DR, Gronau I. Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. eLife 2023; 12:RP84874. [PMID: 37342968 DOI: 10.7554/elife.84874] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/23/2023]
Abstract
Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.
Affiliation(s)
- M Elise Lauterbur
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
- Maria Izabel A Cavassim
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, United States
- Graham Gower
  - Section for Molecular Ecology and Evolution, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Nathaniel S Pope
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Georgia Tsambos
  - School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Jeffrey Adrion
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
  - Ancestry DNA, San Francisco, United States
- Saurabh Belsare
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Victoria Caudill
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Jean Cury
  - Universite Paris-Saclay, CNRS, INRIA, Laboratoire Interdisciplinaire des Sciences du Numerique, Orsay, France
- Benjamin C Haller
  - Department of Computational Biology, Cornell University, Ithaca, United States
- Ahmed R Hasan
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
  - Department of Biology, University of Toronto Mississauga, Mississauga, Canada
- Xin Huang
  - Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
  - Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
- Ekaterina Noskova
  - Computer Technologies Laboratory, ITMO University, St Petersburg, Russian Federation
- Jana Obsteter
  - Agricultural Institute of Slovenia, Department of Animal Science, Ljubljana, Slovenia
- Alice Pearson
  - Department of Genetics, University of Cambridge, Cambridge, United Kingdom
  - Department of Zoology, University of Cambridge, Cambridge, United Kingdom
- David Peede
  - Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, United States
  - Center for Computational Molecular Biology, Brown University, Providence, United States
- Manolo F Perez
  - Department of Genetics and Evolution, Federal University of Sao Carlos, Sao Carlos, Brazil
- Murillo F Rodrigues
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Chris C R Smith
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Jeffrey P Spence
  - Department of Genetics, Stanford University School of Medicine, Stanford, United States
- Anastasia Teterina
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Silas Tittes
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Per Unneberg
  - Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Juan Manuel Vazquez
  - Department of Integrative Biology, University of California, Berkeley, Berkeley, United States
- Ryan K Waples
  - Department of Biostatistics, University of Washington, Seattle, United States
- Yan Wong
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Franz Baumdicker
  - Cluster of Excellence - Controlling Microbes to Fight Infections, Eberhard Karls Universität Tübingen, Tübingen, Germany
- Reed A Cartwright
  - School of Life Sciences and The Biodesign Institute, Arizona State University, Tempe, United States
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, United Kingdom
- Ryan N Gutenkunst
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, United States
- Jerome Kelleher
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Andrew D Kern
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Aaron P Ragsdale
  - Department of Integrative Biology, University of Wisconsin-Madison, Madison, United States
- Peter L Ralph
  - Institute of Ecology and Evolution, University of Oregon, Eugene, United States
  - Department of Mathematics, University of Oregon, Eugene, United States
- Daniel R Schrider
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Ilan Gronau
  - Efi Arazi School of Computer Science, Reichman University, Herzliya, Israel
6
Guarracino A, Buonaiuto S, de Lima LG, Potapova T, Rhie A, Koren S, Rubinstein B, Fischer C, Gerton JL, Phillippy AM, Colonna V, Garrison E. Recombination between heterologous human acrocentric chromosomes. Nature 2023; 617:335-343. [PMID: 37165241 PMCID: PMC10172130 DOI: 10.1038/s41586-023-05976-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 08/15/2022] [Accepted: 03/17/2023] [Indexed: 05/12/2023]
Abstract
The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications [1,2]. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium's CHM13 assembly (T2T-CHM13), provided a model of their homology [3], it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain pseudo-homologous regions (PHRs) indicative of recombination between non-homologous sequences. Utilizing an all-to-all comparison of the human pangenome from the Human Pangenome Reference Consortium [4] (HPRC), we find that contigs from all of the SAACs form a community. A variation graph [5] constructed from centromere-spanning acrocentric contigs indicates the presence of regions in which most contigs appear nearly identical between heterologous acrocentric chromosomes in T2T-CHM13. Except on chromosome 15, we observe faster decay of linkage disequilibrium in the pseudo-homologous regions than in the corresponding short and long arms, indicating higher rates of recombination [6,7]. The pseudo-homologous regions include sequences that have previously been shown to lie at the breakpoint of Robertsonian translocations [8], and their arrangement is compatible with crossover in inverted duplications on chromosomes 13, 14 and 21. The ubiquity of signals of recombination between heterologous acrocentric chromosomes seen in the HPRC draft pangenome suggests that these shared sequences form the basis for recurrent Robertsonian translocations, providing sequence and population-based confirmation of hypotheses first developed from cytogenetic studies 50 years ago [9].
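The linkage disequilibrium decay referred to in this abstract is usually summarized with the pairwise statistic r². The sketch below computes r² between two biallelic sites from phased haplotypes coded as 0/1. The function name and toy data are illustrative assumptions and do not reproduce the authors' analysis.

```python
def r_squared(haplotypes, i, j):
    """Pairwise LD (r^2) between biallelic sites i and j.

    haplotypes: list of equal-length sequences of 0/1 alleles, one per phased haplotype.
    Returns None when either site is monomorphic, since r^2 is then undefined.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n           # frequency of allele 1 at site i
    p_b = sum(h[j] for h in haplotypes) / n           # frequency of allele 1 at site j
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                              # coefficient of disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


# Perfectly associated sites give r^2 = 1; independent sites give r^2 = 0.
linked = r_squared([(0, 0), (1, 1), (0, 0), (1, 1)], 0, 1)
unlinked = r_squared([(0, 0), (0, 1), (1, 0), (1, 1)], 0, 1)
```

Averaging r² over site pairs binned by physical distance gives a decay curve; a curve that falls faster with distance is what the abstract interprets as a higher recombination rate.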
Affiliation(s)
- Andrea Guarracino
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Genomics Research Centre, Human Technopole, Milan, Italy
- Silvia Buonaiuto
  - Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Tamara Potapova
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Christian Fischer
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Vincenza Colonna
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
  - Institute of Genetics and Biophysics, National Research Council, Naples, Italy
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
7
Johri P, Pfeifer SP, Jensen JD. Developing an evolutionary baseline model for humans: jointly inferring purifying selection with population history. bioRxiv 2023:2023.04.11.536488. [PMID: 37090533 PMCID: PMC10120674 DOI: 10.1101/2023.04.11.536488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Building evolutionarily appropriate baseline models for natural populations is not only important for answering fundamental questions in population genetics - including quantifying the relative contributions of adaptive vs. non-adaptive processes - but it is also essential for identifying candidate loci experiencing relatively rare and episodic forms of selection (e.g., positive or balancing selection). Here, a baseline model was developed for a human population of West African ancestry, the Yoruba, comprising processes constantly operating on the genome (i.e., purifying and background selection, population size changes, recombination rate heterogeneity, and gene conversion). Specifically, to perform joint inference of selective effects with demography, an approximate Bayesian approach was employed that utilizes the decay of background selection effects around functional elements, taking into account genomic architecture. This approach inferred a recent 6-fold population growth together with a distribution of fitness effects that is skewed towards effectively neutral mutations. Importantly, these results further suggest that, while strong and/or frequent recurrent positive selection is inconsistent with observed data, weak to moderate positive selection is consistent but unidentifiable if rare.
8
Turba R, Richmond JQ, Fitz-Gibbon S, Morselli M, Fisher RN, Swift CC, Ruiz-Campos G, Backlin AR, Dellith C, Jacobs DK. Genetic structure and historic demography of endangered unarmoured threespine stickleback at southern latitudes signals a potential new management approach. Mol Ecol 2022; 31:6515-6530. [PMID: 36205603 PMCID: PMC10092051 DOI: 10.1111/mec.16722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 09/05/2022] [Accepted: 09/29/2022] [Indexed: 01/13/2023]
Abstract
Habitat loss, flood control infrastructure, and drought have left most of southern California and northern Baja California's native freshwater fish near extinction, including the endangered unarmoured threespine stickleback (Gasterosteus aculeatus williamsoni). This subspecies, an unusual morph lacking the typical lateral bony plates of the G. aculeatus complex, occurs at arid southern latitudes in the eastern Pacific Ocean and survives in only three inland locations. Managers have lacked molecular data to answer basic questions about the ancestry and genetic distinctiveness of unarmoured populations. These data could be used to prioritize conservation efforts. We sampled G. aculeatus from 36 localities and used microsatellites and whole genome data to place unarmoured populations within the broader evolutionary context of G. aculeatus across southern California/northern Baja California. We identified three genetic groups with none consisting solely of unarmoured populations. Unlike G. aculeatus at northern latitudes, where Pleistocene glaciation has produced similar historical demographic profiles across populations, we found markedly different demographics depending on sampling location, with inland unarmoured populations showing steeper population declines and lower heterozygosity compared to low armoured populations in coastal lagoons. One exception involved the only high elevation population in the region, where the demography and alleles of unarmoured fish were similar to low armoured populations near the coast, exposing one of several cases of artificial translocation. Our results suggest that the current "management-by-phenotype" approach, based on lateral plates, is incidentally protecting the most imperilled populations; however, redirecting efforts toward evolutionary units, regardless of phenotype, may more effectively preserve adaptive potential.
Affiliation(s)
- Rachel Turba
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Sorel Fitz-Gibbon
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
- Marco Morselli
  - Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, California, USA
- Camm C Swift
  - Emeritus, Section of Fishes, Natural History Museum of Los Angeles County, Los Angeles, California, USA
- Gorgonio Ruiz-Campos
  - Ichthyological Collection, Facultad de Ciencias, Universidad Autónoma de Baja California, Ensenada, Baja California, Mexico
- Adam R Backlin
  - U.S. Geological Survey, Western Ecological Research Center, San Diego Field Station-Santa Ana Office, Santa Ana, California, USA
- Chris Dellith
  - U.S. Fish and Wildlife Service, Ventura, California, USA
- David K Jacobs
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, USA
9
The genomic origins of the world's first farmers. Cell 2022; 185:1842-1859.e18. [PMID: 35561686 PMCID: PMC9166250 DOI: 10.1016/j.cell.2022.04.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 12/13/2021] [Revised: 03/04/2022] [Accepted: 04/06/2022] [Indexed: 11/24/2022]
Abstract
The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.
10
Wall JD, Robinson JA, Cox LA. High-Resolution Estimates of Crossover and Noncrossover Recombination from a Captive Baboon Colony. Genome Biol Evol 2022; 14:evac040. [PMID: 35325119 PMCID: PMC9048888 DOI: 10.1093/gbe/evac040] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Accepted: 03/02/2022] [Indexed: 11/17/2022]
Abstract
Homologous recombination has been extensively studied in humans and a handful of model organisms. Much less is known about recombination in other species, including nonhuman primates. Here, we present a study of crossovers (COs) and noncrossover (NCO) recombination in olive baboons (Papio anubis) from two pedigrees containing a total of 20 paternal and 17 maternal meioses, and compare these results to linkage disequilibrium (LD) based recombination estimates from 36 unrelated olive baboons. We demonstrate how COs, combined with LD-based recombination estimates, can be used to identify genome assembly errors. We also quantify sex-specific differences in recombination rates, including elevated male CO and reduced female CO rates near telomeres. Finally, we add to the increasing body of evidence suggesting that while most NCO recombination tracts in mammals are short (e.g., <500 bp), there is a non-negligible fraction of longer (e.g., >1 kb) NCO tracts. For NCO tracts shorter than 10 kb, we fit a mixture of two (truncated) geometric distributions model to the NCO tract length distribution and estimate that >99% of all NCO tracts are very short (mean 24 bp), but the remaining tracts can be quite long (mean 4.3 kb). A single geometric distribution model for NCO tract lengths is incompatible with the data, suggesting that LD-based methods for estimating NCO recombination rates that make this assumption may need to be modified.
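The two-component model of NCO tract lengths described in this abstract can be sketched as a mixture of truncated geometric distributions. The parameterization here is an assumption made for illustration: tract lengths start at 1 bp, each component's success probability is one over its untruncated mean, and both components are truncated at 10 kb. The abstract does not state whether the reported means refer to truncated or untruncated components, and the function names are hypothetical.

```python
def truncated_geometric_pmf(k, mean, max_len):
    """P(L = k) for a geometric distribution on 1..max_len, renormalized after truncation.

    mean is the mean of the untruncated geometric, so the success probability is 1 / mean.
    """
    if k < 1 or k > max_len:
        return 0.0
    p = 1.0 / mean
    norm = 1.0 - (1.0 - p) ** max_len  # probability mass kept after truncation
    return p * (1.0 - p) ** (k - 1) / norm


def mixture_pmf(k, w_short, mean_short, mean_long, max_len):
    """Mixture of a short-tract and a long-tract truncated geometric component."""
    return (w_short * truncated_geometric_pmf(k, mean_short, max_len)
            + (1.0 - w_short) * truncated_geometric_pmf(k, mean_long, max_len))


# Values loosely following the abstract: >99% short tracts (mean 24 bp), the rest long (mean 4.3 kb).
MAX_LEN = 10_000
frac_over_1kb = sum(mixture_pmf(k, 0.99, 24, 4300, MAX_LEN) for k in range(1001, MAX_LEN + 1))
```

Under a single geometric with a mean of a few tens of base pairs, tracts longer than 1 kb would be essentially absent, so a small long-tract component is what allows a model like this to place non-negligible mass there.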
Affiliation(s)
- Jeffrey D. Wall
- Institute for Human Genetics, University of California San Francisco, USA
- Laura A. Cox
- Center for Precision Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, USA
|
11
|
DeGiorgio M, Szpiech ZA. A spatially aware likelihood test to detect sweeps from haplotype distributions. PLoS Genet 2022; 18:e1010134. [PMID: 35404934 PMCID: PMC9022890 DOI: 10.1371/journal.pgen.1010134] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/18/2021] [Revised: 04/21/2022] [Accepted: 03/04/2022] [Indexed: 01/13/2023] Open
Abstract
The inference of positive selection in genomes is a problem of great interest in evolutionary genomics. By identifying putative regions of the genome that contain adaptive mutations, we are able to learn about the biology of organisms and their evolutionary history. Here we introduce a composite likelihood method that identifies recently completed or ongoing positive selection by searching for extreme distortions in the spatial distribution of the haplotype frequency spectrum along the genome relative to the genome-wide expectation taken as neutrality. Furthermore, the method simultaneously infers two parameters of the sweep: the number of sweeping haplotypes and the “width” of the sweep, which is related to the strength and timing of selection. We demonstrate that this method outperforms the leading haplotype-based selection statistics, though strong signals in low-recombination regions merit extra scrutiny. As a positive control, we apply it to two well-studied human populations from the 1000 Genomes Project and examine haplotype frequency spectrum patterns at the LCT and MHC loci. We also apply it to a data set of brown rats sampled in NYC and identify genes related to olfactory perception. To facilitate use of this method, we have implemented it in user-friendly open source software.
Identifying regions of the genome that contain adaptive variation is of fundamental interest in evolutionary biology, providing insight into an organism’s history and biology. When positive selection is recent or ongoing, we expect to find genomic patterns such as high frequency haplotypes and low genetic diversity in the vicinity of the adaptive locus. Here we develop a statistic to identify these regions based on distortions of the haplotype frequency spectrum from a background distribution. We evaluate the performance of this statistic under numerous realistic settings of interest to empiricists and demonstrate its superior performance relative to other haplotype-based selection statistics. We also apply this statistic to real population-genetic data. As a positive control, we explore two well-studied loci, LCT and MHC, in a European and an African human population that show strong evidence for selection. We also apply this statistic to the genomes of an urban brown rat population, where we uncover evidence for adaptation in olfactory perception genes. We release user-friendly software implementing this statistic.
Affiliation(s)
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, Florida, United States of America
- Zachary A. Szpiech
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, Pennsylvania, United States of America
|
12
|
Stabilizing selection on Atlantic cod supergenes through a millennium of extensive exploitation. Proc Natl Acad Sci U S A 2022; 119:2114904119. [PMID: 35165196 PMCID: PMC8872764 DOI: 10.1073/pnas.2114904119] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 01/04/2022] [Indexed: 12/21/2022] Open
Abstract
Ecological disruption due to human impacts is evident worldwide, and a key to mitigation lies in characterizing the underlying mechanisms of species and ecosystem stability. Here we show that three extensive “supergenes” are maintained in Atlantic cod by stabilizing selection, tying these genes to the persistence of a keystone species distributed across the northern Atlantic Ocean. Removal of this species has caused severe ecosystem reshuffling in several areas of its range. Genomic inference of historic stock sizes further shows that cod has been under pressure in the North Sea system since the Viking period, in line with zooarchaeological records. Expansion of fisheries in Northern Europe through the past millennium is well documented and supports the inferred long-term declines.
Life on Earth has been characterized by recurring cycles of ecological stasis and disruption, relating biological eras to geological and climatic transitions through the history of our planet. Due to the increasing degree of ecological abruption caused by human influences many advocate that we now have entered the geological era of the Anthropocene, or “the age of man.” Considering the ongoing mass extinction and ecosystem reshuffling observed worldwide, a better understanding of the drivers of ecological stasis will be a requisite for identifying routes of intervention and mitigation. Ecosystem stability may rely on one or a few keystone species, and the loss of such species could potentially have detrimental effects. The Atlantic cod (Gadus morhua) has historically been highly abundant and is considered a keystone species in ecosystems of the northern Atlantic Ocean. Collapses of cod stocks have been observed on both sides of the Atlantic and reported to have detrimental effects that include vast ecosystem reshuffling. By whole-genome resequencing we demonstrate that stabilizing selection maintains three extensive “supergenes” in Atlantic cod, linking these genes to species persistence and ecological stasis. Genomic inference of historic effective population sizes shows continued declines for cod in the North Sea–Skagerrak–Kattegat system through the past millennia, consistent with an early onset of the marine Anthropocene through industrialization and commercialization of fisheries throughout the medieval period.
|
13
|
Liu HL, Harris AJ, Wang ZF, Chen HF, Li ZA, Wei X. The genome of the Paleogene relic tree Bretschneidera sinensis: insights into trade-offs in gene family evolution, demographic history, and adaptive SNPs. DNA Res 2022; 29:6523039. [PMID: 35137004 PMCID: PMC8825261 DOI: 10.1093/dnares/dsac003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/10/2021] [Indexed: 11/13/2022] Open
Abstract
Among relic species, genomic information may provide the key to inferring their long-term survival. Therefore, in this study, we investigated the genome of the Paleogene relic tree species, Bretschneidera sinensis, which is a rare endemic species within southeastern Asia. Specifically, we assembled a high-quality genome for B. sinensis using PacBio high-fidelity and high-throughput chromosome conformation capture reads and annotated it with long and short RNA sequencing reads. Using the genome, we then detected a trade-off between active and passive disease defences among the gene families. Gene families involved in salicylic acid and MAPK signalling pathways expanded as active defence mechanisms against disease, but families involved in terpene synthase activity as passive defences contracted. When inferring the long evolutionary history of B. sinensis, we detected population declines corresponding to historical climate change around the Eocene–Oligocene transition and to climatic fluctuations in the Quaternary. Additionally, based on this genome, we identified 388 single nucleotide polymorphisms (SNPs) that were likely under selection, and showed diverse functions in growth and stress responses. Among them, we further found 41 climate-associated SNPs. The genome of B. sinensis and the SNP dataset will be important resources for understanding extinction/diversification processes using comparative genomics in different lineages.
Affiliation(s)
- Hai-Lin Liu
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; University of Chinese Academy of Sciences, Beijing, 100049, China; Environmental Horticulture Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, 510640, China; Key Laboratory of Ornamental Plant Germplasm Innovation and Utilization, Guangzhou, 510640, China
- A J Harris
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Zheng-Feng Wang
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Center of Plant Ecology, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Hong-Feng Chen
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Zhi-An Li
- Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China; Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China; Center of Plant Ecology, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou, 510650, China; Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Xiao Wei
- Guangxi Institute of Botany, Chinese Academy of Sciences, Guilin, 541006, China
|
14
|
Ortega-Del Vecchyo D, Lohmueller KE, Novembre J. Haplotype-based inference of the distribution of fitness effects. Genetics 2022; 220:6501446. [PMID: 35100400 PMCID: PMC8982047 DOI: 10.1093/genetics/iyac002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/13/2021] [Accepted: 12/18/2021] [Indexed: 11/13/2022] Open
Abstract
Recent genome sequencing studies with large sample sizes in humans have discovered a vast quantity of low-frequency variants, providing an important source of information to analyze how selection is acting on human genetic variation. In order to estimate the strength of natural selection acting on low-frequency variants, we have developed a likelihood-based method that uses the lengths of pairwise identity-by-state between haplotypes carrying low-frequency variants. We show that in some non-equilibrium populations (such as those that have had recent population expansions) it is possible to distinguish between positive or negative selection acting on a set of variants. With our new framework, one can infer a fixed selection intensity acting on a set of variants at a particular frequency, or a distribution of selection coefficients for standing variants and new mutations. We show an application of our method to the UK10K phased haplotype dataset of individuals.
Affiliation(s)
- Diego Ortega-Del Vecchyo
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Querétaro, 76230, México
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- Kirk E Lohmueller
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, 90095, United States of America
- John Novembre
- Department of Human Genetics, University of Chicago, Chicago, Illinois, 60637, United States of America
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, 60637, United States of America
|
15
|
Lu CW, Yao CT, Hung CM. Domestication obscures genomic estimates of population history. Mol Ecol 2021; 31:752-766. [PMID: 34779057 DOI: 10.1111/mec.16277] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/19/2021] [Revised: 11/05/2021] [Accepted: 11/08/2021] [Indexed: 11/28/2022]
Abstract
Domesticated species are valuable models to examine phenotypic evolution, and knowledge on domestication history is critical for understanding the trajectories of evolutionary changes. Sequentially Markov Coalescent models are often used to infer domestication history. However, domestication practices may obscure the signal left by population history, affecting demographic inference. Here we assembled the genomes of a recently domesticated species (the society finch) and its parent species (the white-rumped munia) to examine its domestication history. We applied genomic analyses to two society finch breeds and white-rumped munias to test whether domestication of the former resulted from inbreeding or hybridization. The society finch showed longer and more runs of homozygosity and lower genomic heterozygosity than the white-rumped munia, supporting an inbreeding origin in the former. Blocks of white-rumped munia and other ancestry in society finch genomes showed similar genetic distance between the two taxa, inconsistent with the hybridization origin hypothesis. We then applied two Sequentially Markov Coalescent models (psmc and smc++) to infer the demographic histories of both. Surprisingly, the two models did not reveal a recent population bottleneck, but instead the psmc model showed a specious, dramatic population increase in the society finch. Subsequently, we used simulated genomes based on an array of demographic scenarios to demonstrate that recent inbreeding, not hybridization, caused the distorted psmc population trajectory. Such analyses could have misled our understanding of the domestication process. Our findings stress caution when interpreting the histories of recently domesticated species inferred by psmc, arguing that these histories require multiple analyses to validate.
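Runs of homozygosity, the inbreeding signal this abstract relies on, can be illustrated with a toy scan. The sketch below is a simplified illustration and not the method used in the paper: it assumes a per-site heterozygosity flag for one individual, measures run length in sites rather than base pairs, and tolerates no heterozygous calls inside a run. The function name and threshold are hypothetical.

```python
def runs_of_homozygosity(is_het, min_sites):
    """Return (start, end) index pairs of homozygous runs.

    is_het    -- sequence of 0/1 flags, 1 meaning the site is heterozygous
    min_sites -- minimum number of consecutive homozygous sites to report
    Intervals are half-open: sites start .. end-1 are homozygous.
    """
    runs = []
    start = None
    for i, het in enumerate(is_het):
        if not het:
            if start is None:
                start = i          # a new homozygous run begins
        else:
            if start is not None and i - start >= min_sites:
                runs.append((start, i))
            start = None           # a heterozygous site ends the run
    # Close a run that extends to the end of the sequence.
    if start is not None and len(is_het) - start >= min_sites:
        runs.append((start, len(is_het)))
    return runs

calls = [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0]
print(runs_of_homozygosity(calls, min_sites=3))  # [(0, 3), (4, 9)]
```

Practical tools work on physical coordinates and usually allow a few heterozygous or missing calls per window, so this should be read only as a picture of the quantity being compared between the two taxa.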
Affiliation(s)
- Chia-Wei Lu
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Cheng-Te Yao
- Division of Zoology, Endemic Species Research Institute, Nantou, Taiwan
- Chih-Ming Hung
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
|
16
|
Nadachowska‐Brzyska K, Konczal M, Babik W. Navigating the temporal continuum of effective population size. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13740] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022]
Affiliation(s)
- Wieslaw Babik
- Jagiellonian University in Kraków, Faculty of Biology, Institute of Environmental Sciences, Kraków, Poland
|
17
|
Patil AB, Vijay N. Repetitive genomic regions and the inference of demographic history. Heredity (Edinb) 2021; 127:151-166. [PMID: 34002046 PMCID: PMC8322061 DOI: 10.1038/s41437-021-00443-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/20/2021] [Revised: 04/16/2021] [Accepted: 04/17/2021] [Indexed: 02/03/2023] Open
Abstract
Inference of demographic histories using whole-genome datasets has provided insights into diversification, adaptation, hybridization, and plant-pathogen interactions, and stimulated debate on the impact of anthropogenic interventions and past climate on species demography. However, the impact of repetitive genomic regions on these inferences has mostly been ignored by masking of repeats. We use the Populus trichocarpa genome (Pop_tri_v3) to show that masking of repeat regions leads to lower estimates of effective population size (Ne) in the distant past in contrast to an increase in Ne estimates in recent times. However, in human datasets, masking of repeats resulted in lower estimates of Ne at all time points. We demonstrate that repeats affect demographic inferences using diverse methods like PSMC, MSMC, SMC++, and the Stairway plot. Our genomic analysis revealed that the biases in Ne estimates were dependent on the repeat class type and its abundance in each atomic interval. Notably, we observed a weak, yet consistently significant negative correlation between the repeat abundance of an atomic interval and the Ne estimates for that interval, which potentially reflects the recombination rate variation within the genome. The rationale for the masking of repeats has been that variants identified within these regions are erroneous. We find that polymorphisms in some repeat classes occur in callable regions and reflect reliable coalescence histories (e.g., LTR Gypsy, LTR Copia). The current demography inference methods do not handle repeats explicitly, and hence the effect of individual repeat classes needs careful consideration in comparative analysis. Deciphering the repeat demographic histories might provide a clear understanding of the processes involved in repeat accumulation.
Affiliation(s)
- Ajinkya Bharatraj Patil
- Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Nagarjun Vijay
- Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
|
18
|
Garcia JA, Lohmueller KE. Negative linkage disequilibrium between amino acid changing variants reveals interference among deleterious mutations in the human genome. PLoS Genet 2021; 17:e1009676. [PMID: 34319975 PMCID: PMC8351996 DOI: 10.1371/journal.pgen.1009676] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/02/2020] [Revised: 08/09/2021] [Accepted: 06/22/2021] [Indexed: 11/18/2022] Open
Abstract
Evolutionary forces like Hill-Robertson interference and negative epistasis can lead to deleterious mutations being found on distinct haplotypes. However, the extent to which these forces depend on the selection and dominance coefficients of deleterious mutations and shape genome-wide patterns of linkage disequilibrium (LD) in natural populations with complex demographic histories has not been tested. In this study, we first used forward-in-time simulations to predict how negative selection impacts LD. Under models where deleterious mutations have additive effects on fitness, deleterious variants less than 10 kb apart tend to be carried on different haplotypes relative to pairs of synonymous SNPs. In contrast, for recessive mutations, there is no consistent ordering of how selection coefficients affect LD decay, due to the complex interplay of different evolutionary effects. We then examined empirical data of modern humans from the 1000 Genomes Project. LD between derived alleles at nonsynonymous SNPs is lower compared to pairs of derived synonymous variants, suggesting that nonsynonymous derived alleles tend to occur on different haplotypes more than synonymous variants. This result holds when controlling for potential confounding factors by matching SNPs for frequency in the sample (allele count), physical distance, magnitude of background selection, and genetic distance between pairs of variants. Lastly, we introduce a new statistic HR(j) which allows us to detect interference using unphased genotypes. Application of this approach to high-coverage human genome sequences confirms our finding that nonsynonymous derived alleles tend to be located on different haplotypes more often than are synonymous derived alleles. Our findings suggest that interference may play a pervasive role in shaping patterns of LD between deleterious variants in the human genome, and consequently influences genome-wide patterns of LD.
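The sign of linkage disequilibrium discussed in this abstract can be made concrete with the standard two-locus coefficient D = p_AB - p_A p_B and its normalized square r². The sketch below is an illustration on made-up phased haplotypes, not the paper's pipeline or its HR(j) statistic; the function name and the toy data are assumptions.

```python
def ld_stats(haplotypes):
    """Compute (D, r2) for two biallelic sites from phased haplotypes.

    haplotypes -- list of (a, b) pairs, where 1 marks the derived allele
                  at the first and second site respectively
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n
    p_b = sum(h[1] for h in haplotypes) / n
    # Frequency of haplotypes carrying both derived alleles.
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2

# Derived alleles on different haplotypes (repulsion): D is negative.
repulsion = [(1, 0), (0, 1), (0, 0), (0, 0)]
# Derived alleles always together (coupling): D is positive.
coupling = [(1, 1), (1, 1), (0, 0), (0, 0)]

print(ld_stats(repulsion))  # (-0.0625, 0.111...)
print(ld_stats(coupling))   # (0.25, 1.0)
```

A negative D, as in the repulsion example, is the pattern the abstract describes for nonsynonymous derived alleles that tend to sit on different haplotypes.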
Affiliation(s)
- Jesse A. Garcia
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Kirk E. Lohmueller
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
|
19
|
DeWitt WS, Harris KD, Ragsdale AP, Harris K. Nonparametric coalescent inference of mutation spectrum history and demography. Proc Natl Acad Sci U S A 2021; 118:e2013798118. [PMID: 34016747 PMCID: PMC8166128 DOI: 10.1073/pnas.2013798118] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 11/18/2022] Open
Abstract
As populations boom and bust, the accumulation of genetic diversity is modulated, encoding histories of living populations in present-day variation. Many methods exist to decode these histories, and all must make strong model assumptions. It is typical to assume that mutations accumulate uniformly across the genome at a constant rate that does not vary between closely related populations. However, recent work shows that mutational processes in human and great ape populations vary across genomic regions and evolve over time. This perturbs the mutation spectrum (relative mutation rates in different local nucleotide contexts). Here, we develop theoretical tools in the framework of Kingman's coalescent to accommodate mutation spectrum dynamics. We present mutation spectrum history inference (mushi), a method to perform nonparametric inference of demographic and mutation spectrum histories from allele frequency data. We use mushi to reconstruct trajectories of effective population size and mutation spectrum divergence between human populations, identify mutation signatures and their dynamics in different human populations, and calibrate the timing of a previously reported mutational pulse in the ancestors of Europeans. We show that mutation spectrum histories can be placed in a well-studied theoretical setting and rigorously inferred from genomic variation data, like other features of evolutionary history.
Affiliation(s)
- William S DeWitt
- Department of Genome Sciences, University of Washington, Seattle, WA 98195
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109
- Kameron Decker Harris
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA 98195
- Department of Biology, University of Washington, Seattle, WA 98195
- Aaron P Ragsdale
- National Laboratory of Genomics for Biodiversity, Unit of Advanced Genomics, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Mexico 36821
- Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, WA 98195
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98109
|
20
|
Harris AM, DeGiorgio M. A Likelihood Approach for Uncovering Selective Sweep Signatures from Haplotype Data. Mol Biol Evol 2021; 37:3023-3046. [PMID: 32392293 PMCID: PMC7530616 DOI: 10.1093/molbev/msaa115] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/11/2022] Open
Abstract
Selective sweeps are frequent and varied signatures in the genomes of natural populations, and detecting them is consequently important in understanding mechanisms of adaptation by natural selection. Following a selective sweep, haplotypic diversity surrounding the site under selection decreases, and this deviation from the background pattern of variation can be applied to identify sweeps. Multiple methods exist to locate selective sweeps in the genome from haplotype data, but none leverages the power of a model-based approach to make their inference. Here, we propose a likelihood ratio test statistic T to probe whole-genome polymorphism data sets for selective sweep signatures. Our framework uses a simple but powerful model of haplotype frequency spectrum distortion to find sweeps and additionally make an inference on the number of presently sweeping haplotypes in a population. We found that the T statistic is suitable for detecting both hard and soft sweeps across a variety of demographic models, selection strengths, and ages of the beneficial allele. Accordingly, we applied the T statistic to variant calls from European and sub-Saharan African human populations, yielding primarily literature-supported candidates, including LCT, RSPH3, and ZNF211 in CEU, SYT1, RGS18, and NNT in YRI, and HLA genes in both populations. We also searched for sweep signatures in Drosophila melanogaster, finding expected candidates at Ace, Uhg1, and Pimet. Finally, we provide open-source software to compute the T statistic and the inferred number of presently sweeping haplotypes from whole-genome data.
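The haplotype frequency spectrum that this abstract builds on is simply the sorted list of distinct-haplotype frequencies in a genomic window. The sketch below computes it for two made-up windows; it does not implement the T statistic or its likelihood model, and the function names and example haplotypes are assumptions for illustration.

```python
from collections import Counter

def haplotype_frequency_spectrum(haplotypes):
    """Return frequencies of distinct haplotypes, most common first."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return sorted((c / n for c in counts.values()), reverse=True)

def haplotype_homozygosity(spectrum):
    """Sum of squared haplotype frequencies (chance two draws match)."""
    return sum(p * p for p in spectrum)

# Toy windows: each string is one sampled haplotype over four sites.
neutral_like = ["0101", "0011", "1100", "1010", "0110", "1001"]
sweep_like = ["1111", "1111", "1111", "1111", "1111", "0101"]

for name, window in (("neutral", neutral_like), ("sweep", sweep_like)):
    hfs = haplotype_frequency_spectrum(window)
    print(name, [round(p, 3) for p in hfs],
          round(haplotype_homozygosity(hfs), 3))
```

In the sweep-like window one haplotype dominates the spectrum, which is the kind of distortion relative to the background that a model-based test would look for.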
Affiliation(s)
- Alexandre M Harris
- Department of Biology, Pennsylvania State University, University Park, PA; Molecular, Cellular, and Integrative Biosciences, Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
- Department of Computer and Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL
|
21
|
Sjödin P, McKenna J, Jakobsson M. Estimating divergence times from DNA sequences. Genetics 2021; 217:iyab008. [PMID: 33769498 PMCID: PMC8049563 DOI: 10.1093/genetics/iyab008] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/15/2020] [Accepted: 12/11/2020] [Indexed: 11/23/2022] Open
Abstract
The patterns of genetic variation within and among individuals and populations can be used to make inferences about the evolutionary forces that generated those patterns. Numerous population genetic approaches have been developed in order to infer evolutionary history. Here, we present the "Two-Two (TT)" and the "Two-Two-outgroup (TTo)" methods; two closely related approaches for estimating divergence time based in coalescent theory. They rely on sequence data from two haploid genomes (or a single diploid individual) from each of two populations. Under a simple population-divergence model, we derive the probabilities of the possible sample configurations. These probabilities form a set of equations that can be solved to obtain estimates of the model parameters, including population split times, directly from the sequence data. This transparent and computationally efficient approach to infer population divergence time makes it possible to estimate time scaled in generations (assuming a mutation rate), and not as a compound parameter of genetic drift. Using simulations under a range of demographic scenarios, we show that the method is relatively robust to migration and that the TTo method can alleviate biases that can appear from drastic ancestral population size changes. We illustrate the utility of the approaches with some examples, including estimating split times for pairs of human populations as well as providing further evidence for the complex relationship among Neandertals and Denisovans and their ancestors.
Affiliation(s)
- Per Sjödin
- Human Evolution, Department of Organismal Biology, Uppsala University, Norbyvägen 18 A, Uppsala 752 36, Sweden
- James McKenna
- Human Evolution, Department of Organismal Biology, Uppsala University, Norbyvägen 18 A, Uppsala 752 36, Sweden
- Mattias Jakobsson
- Human Evolution, Department of Organismal Biology, Uppsala University, Norbyvägen 18 A, Uppsala 752 36, Sweden
- Science for Life Laboratory, Uppsala University, Norbyvägen 18 A, Uppsala 752 36, Sweden
|
22
|
Moodley Y, Westbury MV, Russo IRM, Gopalakrishnan S, Rakotoarivelo A, Olsen RA, Prost S, Tunstall T, Ryder OA, Dalén L, Bruford MW. Interspecific Gene Flow and the Evolution of Specialization in Black and White Rhinoceros. Mol Biol Evol 2021; 37:3105-3117. [PMID: 32585004 DOI: 10.1093/molbev/msaa148] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022] Open
Abstract
Africa's black (Diceros bicornis) and white (Ceratotherium simum) rhinoceros are closely related sister-taxa that evolved highly divergent obligate browsing and grazing feeding strategies. Although their precursor species Diceros praecox and Ceratotherium mauritanicum appear in the fossil record ∼5.2 Ma, by 4 Ma both were still mixed feeders, and were even spatiotemporally sympatric at several Pliocene sites in what is today Africa's Rift Valley. Here, we ask whether or not D. praecox and C. mauritanicum were reproductively isolated when they came into Pliocene secondary contact. We sequenced and de novo assembled the first annotated black rhinoceros reference genome and compared it with available genomes of other black and white rhinoceros. We show that ancestral gene flow between D. praecox and C. mauritanicum ceased sometime between 3.3 and 4.1 Ma, despite conventional methods for the detection of gene flow from whole genome data returning false positive signatures of recent interspecific migration due to incomplete lineage sorting. We propose that ongoing Pliocene genetic exchange, for up to 2 My after initial divergence, could have potentially hindered the development of obligate feeding strategies until both species were fully reproductively isolated, but that the more severe and shifting paleoclimate of the early Pleistocene was likely the ultimate driver of ecological specialization in African rhinoceros.
Collapse
Affiliation(s)
- Yoshan Moodley
- Department of Zoology, University of Venda, Thohoyandou, Republic of South Africa
- Michael V Westbury
- Section for Evolutionary Genomics, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Isa-Rita M Russo
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Shyam Gopalakrishnan
- Section for Evolutionary Genomics, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Andrinajoro Rakotoarivelo
- Department of Zoology, University of Venda, Thohoyandou, Republic of South Africa
- Natiora Ahy Madagasikara, Ampahibe, Antananarivo, Madagascar
- Remi-Andre Olsen
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
- Stefan Prost
- LOEWE-Centre for Translational Biodiversity Genomics, Senckenberg Museum, Frankfurt, Germany
- South African National Biodiversity Institute, National Zoological Gardens, Pretoria, Republic of South Africa
- Tate Tunstall
- San Diego Zoo Institute for Conservation Research, San Diego Zoo Global, Escondido, CA
- Oliver A Ryder
- San Diego Zoo Institute for Conservation Research, San Diego Zoo Global, Escondido, CA
- Love Dalén
- Centre for Palaeogenetics, Stockholm, Sweden
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Michael W Bruford
- School of Biosciences, Cardiff University, Cardiff, United Kingdom
- Sustainable Places Research Institute, Cardiff University, Cardiff, United Kingdom
23
Arredondo A, Mourato B, Nguyen K, Boitard S, Rodríguez W, Noûs C, Mazet O, Chikhi L. Inferring number of populations and changes in connectivity under the n-island model. Heredity (Edinb) 2021; 126:896-912. [PMID: 33846579 PMCID: PMC8178352 DOI: 10.1038/s41437-021-00426-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/30/2020] [Revised: 03/11/2021] [Accepted: 03/12/2021] [Indexed: 12/11/2022]
Abstract
Inferring the demographic history of species is one of the greatest challenges in population genetics. This history is often represented as a history of size changes, ignoring population structure. Alternatively, when structure is assumed, it is defined a priori as a population tree and not inferred. Here we propose a framework based on the IICR (Inverse Instantaneous Coalescence Rate). The IICR can be estimated for a single diploid individual using the PSMC method of Li and Durbin (2011). For an isolated panmictic population, the IICR matches the population size history, and this is how the PSMC outputs are generally interpreted. However, it is increasingly acknowledged that the IICR is a function of the demographic model and sampling scheme with limited connection to population size changes. Our method fits observed IICR curves of diploid individuals with IICR curves obtained under piecewise stationary symmetrical island models. In our models we assume a fixed number of time periods during which gene flow is constant, but gene flow is allowed to change between time periods. We infer the number of islands, their sizes, the periods at which connectivity changes and the corresponding rates of connectivity. Validation with simulated data showed that the method can accurately recover most of the scenario parameters. Our application to a set of five human PSMCs yielded demographic histories that are in agreement with previous studies using similar methods and with recent research suggesting ancient human structure. They are in contrast with the view of human evolution consisting of one ancestral population branching into three large continental and panmictic populations with varying degrees of connectivity and no population structure within each continent.
Affiliation(s)
- Armando Arredondo
- Université de Toulouse, Institut National des Sciences Appliquées, Institut de Mathématiques de Toulouse, Toulouse, France
- Institut de Mathématiques de Toulouse; UMR5219. Université de Toulouse, Toulouse, France
- Beatriz Mourato
- Institut de Mathématiques de Toulouse; UMR5219. Université de Toulouse, Toulouse, France
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Khoa Nguyen
- Université de Toulouse, Institut National des Sciences Appliquées, Institut de Mathématiques de Toulouse, Toulouse, France
- Simon Boitard
- CBGP, Université de Montpellier, CIRAD, INRAE, Institut Agro, IRD, Montpellier, France
- Willy Rodríguez
- Institut de Mathématiques de Toulouse; UMR5219. Université de Toulouse, Toulouse, France
- ENAC - Ecole Nationale de l'Aviation Civile, Université de Toulouse, Toulouse, France
- Olivier Mazet
- Université de Toulouse, Institut National des Sciences Appliquées, Institut de Mathématiques de Toulouse, Toulouse, France
- Institut de Mathématiques de Toulouse; UMR5219. Université de Toulouse, Toulouse, France
- Lounès Chikhi
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- Laboratoire Évolution & Diversité Biologique (EDB UMR 5174), CNRS, IRD, UPS, Université de Toulouse Midi-Pyrénées, Toulouse, France
24
Wang Z, Wang J, Kourakos M, Hoang N, Lee HH, Mathieson I, Mathieson S. Automatic inference of demographic parameters using generative adversarial networks. Mol Ecol Resour 2021; 21:2689-2705. [PMID: 33745225 PMCID: PMC8596911 DOI: 10.1111/1755-0998.13386] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 07/29/2020] [Accepted: 03/05/2021] [Indexed: 12/12/2022]
Abstract
Population genetics relies heavily on simulated data for validation, inference and intuition. In particular, since the evolutionary ‘ground truth’ for real data is always limited, simulated data are crucial for training supervised machine learning methods. Simulation software can accurately model evolutionary processes but requires many hand‐selected input parameters. As a result, simulated data often fail to mirror the properties of real genetic data, which limits the scope of methods that rely on it. Here, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method, pg‐gan, is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation‐with‐migration model. We then apply our method to human data from the 1000 Genomes Project and show that we can accurately recapitulate the features of real data.
Affiliation(s)
- Zhanpeng Wang
- Department of Computer Science, Haverford College, Haverford, PA, USA
- Jiaping Wang
- Department of Computer Science, Haverford College, Haverford, PA, USA
- Michael Kourakos
- Department of Computer Science, Swarthmore College, Swarthmore, PA, USA
- Nhung Hoang
- Department of Computer Science, Swarthmore College, Swarthmore, PA, USA
- Hyong Hark Lee
- Department of Computer Science, Swarthmore College, Swarthmore, PA, USA
- Iain Mathieson
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Sara Mathieson
- Department of Computer Science, Haverford College, Haverford, PA, USA
25
Johri P, Riall K, Becher H, Excoffier L, Charlesworth B, Jensen JD. The Impact of Purifying and Background Selection on the Inference of Population History: Problems and Prospects. Mol Biol Evol 2021; 38:2986-3003. [PMID: 33591322 PMCID: PMC8233493 DOI: 10.1093/molbev/msab050] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 12/17/2022]
Abstract
Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Kellen Riall
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Hannes Becher
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
26
Johri P, Riall K, Becher H, Excoffier L, Charlesworth B, Jensen JD. The impact of purifying and background selection on the inference of population history: problems and prospects. bioRxiv 2021. [PMID: 33501439 PMCID: PMC7836109 DOI: 10.1101/2020.04.28.066365] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
Abstract
Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects (DFE) and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the DFE as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
Affiliation(s)
- Parul Johri
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Kellen Riall
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
- Hannes Becher
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
- Laurent Excoffier
- Institute of Ecology and Evolution, University of Berne, Berne 3012, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Brian Charlesworth
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, EH9 3FL, United Kingdom
- Jeffrey D Jensen
- School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
27
Morin PA, Archer FI, Avila CD, Balacco JR, Bukhman YV, Chow W, Fedrigo O, Formenti G, Fronczek JA, Fungtammasan A, Gulland FMD, Haase B, Heide-Jorgensen MP, Houck ML, Howe K, Misuraca AC, Mountcastle J, Musser W, Paez S, Pelan S, Phillippy A, Rhie A, Robinson J, Rojas-Bracho L, Rowles TK, Ryder OA, Smith CR, Stevenson S, Taylor BL, Teilmann J, Torrance J, Wells RS, Westgate AJ, Jarvis ED. Reference genome and demographic history of the most endangered marine mammal, the vaquita. Mol Ecol Resour 2020; 21:1008-1020. [PMID: 33089966 PMCID: PMC8247363 DOI: 10.1111/1755-0998.13284] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 06/01/2020] [Revised: 09/08/2020] [Accepted: 10/08/2020] [Indexed: 12/12/2022]
Abstract
The vaquita is the most critically endangered marine mammal, with fewer than 19 remaining in the wild. First described in 1958, the vaquita has been in rapid decline for more than 20 years resulting from inadvertent deaths due to the increasing use of large-mesh gillnets. To understand the evolutionary and demographic history of the vaquita, we used combined long-read sequencing and long-range scaffolding methods with long- and short-read RNA sequencing to generate a near error-free annotated reference genome assembly from cell lines derived from a female individual. The genome assembly consists of 99.92% of the assembled sequence contained in 21 nearly gapless chromosome-length autosome scaffolds and the X-chromosome scaffold, with a scaffold N50 of 115 Mb. Genome-wide heterozygosity is the lowest (0.01%) of any mammalian species analysed to date, but heterozygosity is evenly distributed across the chromosomes, consistent with long-term small population size at genetic equilibrium, rather than low diversity resulting from a recent population bottleneck or inbreeding. Historical demography of the vaquita indicates long-term population stability at less than 5,000 (Ne) for over 200,000 years. Together, these analyses indicate that the vaquita genome has had ample opportunity to purge highly deleterious alleles and potentially maintain diversity necessary for population health.
Affiliation(s)
- Phillip A Morin
- Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA, USA
- Frederick I Archer
- Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA, USA
- Catherine D Avila
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
- Jennifer R Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Yury V Bukhman
- Regenerative Biology, Morgridge Institute for Research, Madison, WI, USA
- Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Julie A Fronczek
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
- Bettina Haase
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Marlys L Houck
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
- Ann C Misuraca
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
- Sadye Paez
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
- Adam Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA
- Jacqueline Robinson
- Institute for Human Genetics, University of California, San Francisco, CA, USA
- Teri K Rowles
- Office of Protected Resources, National Marine Fisheries Service, NOAA, Silver Spring, MD, USA
- Oliver A Ryder
- San Diego Zoo Institute for Conservation Research, Escondido, CA, USA
- Barbara L Taylor
- Southwest Fisheries Science Center, National Marine Fisheries Service, NOAA, La Jolla, CA, USA
- Jonas Teilmann
- Marine Mammal Research, Department of Bioscience, Aarhus University, Roskilde, Denmark
- Randall S Wells
- Chicago Zoological Society's Sarasota Dolphin Research Program, c/o Mote Marine Laboratory, Sarasota, FL, USA
- Erich D Jarvis
- Laboratory of Neurogenetics of Language, The Rockefeller University, New York, NY, USA
- Howard Hughes Medical Institute, Chevy Chase, MD, USA
28
Adrion JR, Cole CB, Dukler N, Galloway JG, Gladstein AL, Gower G, Kyriazis CC, Ragsdale AP, Tsambos G, Baumdicker F, Carlson J, Cartwright RA, Durvasula A, Gronau I, Kim BY, McKenzie P, Messer PW, Noskova E, Ortega-Del Vecchyo D, Racimo F, Struck TJ, Gravel S, Gutenkunst RN, Lohmueller KE, Ralph PL, Schrider DR, Siepel A, Kelleher J, Kern AD. A community-maintained standard library of population genetic models. eLife 2020; 9:e54967. [PMID: 32573438 PMCID: PMC7438115 DOI: 10.7554/elife.54967] [Citation(s) in RCA: 77] [Impact Index Per Article: 19.3] [Received: 01/07/2020] [Accepted: 06/15/2020] [Indexed: 12/18/2022]
Abstract
The explosion in population genomic data demands ever more complex modes of analysis, and increasingly, these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking to use simulations for power analyses of upcoming studies or sanity checks on existing genomic data. Population genetics, as a field, also lacks standard benchmarks by which new tools for inference might be measured. Here, we describe a new resource, stdpopsim, that attempts to rectify this situation. Stdpopsim is a community-driven open source project, which provides easy access to a growing catalog of published simulation models from a range of organisms and supports multiple simulation engine backends. This resource is available as a well-documented python library with a simple command-line interface. We share some examples demonstrating how stdpopsim can be used to systematically compare demographic inference methods, and we encourage a broader community of developers to contribute to this growing resource.
Affiliation(s)
- Jeffrey R Adrion
- Department of Biology and Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Christopher B Cole
- Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, United Kingdom
- Noah Dukler
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
- Jared G Galloway
- Department of Biology and Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Ariella L Gladstein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Graham Gower
- Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Christopher C Kyriazis
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, United States
- Georgia Tsambos
- Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Franz Baumdicker
- Department of Mathematical Stochastics, University of Freiburg, Freiburg, Germany
- Jedidiah Carlson
- Department of Genome Sciences, University of Washington, Seattle, United States
- Reed A Cartwright
- The Biodesign Institute and The School of Life Sciences, Arizona State University, Tempe, United States
- Arun Durvasula
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Ilan Gronau
- The Efi Arazi School of Computer Science, Herzliya Interdisciplinary Center, Herzliya, Israel
- Bernard Y Kim
- Department of Biology, Stanford University, Stanford, United States
- Patrick McKenzie
- Department of Ecology, Evolution, and Environmental Biology, Columbia University, New York, United States
- Philipp W Messer
- Department of Computational Biology, Cornell University, Ithaca, United States
- Ekaterina Noskova
- Computer Technologies Laboratory, ITMO University, Saint Petersburg, Russian Federation
- Diego Ortega-Del Vecchyo
- International Laboratory for Human Genome Research, National Autonomous University of Mexico, Juriquilla, Mexico
- Fernando Racimo
- Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Travis J Struck
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, United States
- Simon Gravel
- Department of Human Genetics, McGill University, Montreal, Canada
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, United States
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, United States
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, United States
- Peter L Ralph
- Department of Biology and Institute of Ecology and Evolution, University of Oregon, Eugene, United States
- Department of Mathematics, University of Oregon, Eugene, United States
- Daniel R Schrider
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, United States
- Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
- Jerome Kelleher
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
- Andrew D Kern
- Department of Biology and Institute of Ecology and Evolution, University of Oregon, Eugene, United States
29
Humble E, Dobrynin P, Senn H, Chuven J, Scott AF, Mohr DW, Dudchenko O, Omer AD, Colaric Z, Lieberman Aiden E, Al Dhaheri SS, Wildt D, Oliaji S, Tamazian G, Pukazhenthi B, Ogden R, Koepfli KP. Chromosomal-level genome assembly of the scimitar-horned oryx: Insights into diversity and demography of a species extinct in the wild. Mol Ecol Resour 2020; 20:1668-1681. [PMID: 32365406 DOI: 10.1111/1755-0998.13181] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/10/2019] [Revised: 04/09/2020] [Accepted: 04/24/2020] [Indexed: 01/04/2023]
Abstract
Captive populations provide a valuable insurance against extinctions in the wild. However, they are also vulnerable to the negative impacts of inbreeding, selection and drift. Genetic information is therefore considered a critical aspect of conservation management. Recent developments in sequencing technologies have the potential to improve the outcomes of management programmes; however, the transfer of these approaches to applied conservation has been slow. The scimitar-horned oryx (Oryx dammah) is a North African antelope that has been extinct in the wild since the early 1980s and is the focus of a large-scale and long-term reintroduction project. To enable the selection of suitable founder individuals, facilitate post-release monitoring and improve captive breeding management, comprehensive genomic resources are required. Here, we used 10X Chromium sequencing together with Hi-C contact mapping to develop a chromosomal-level genome assembly for the species. The resulting assembly contained 29 chromosomes with a scaffold N50 of 100.4 Mb, and displayed strong chromosomal synteny with the cattle genome. Using resequencing data from six additional individuals, we demonstrated relatively high genetic diversity in the scimitar-horned oryx compared to other mammals, despite it having experienced a strong founding event in captivity. Additionally, the level of diversity across populations varied according to management strategy. Finally, we uncovered a dynamic demographic history that coincided with periods of climate variation during the Pleistocene. Overall, our study provides a clear example of how genomic data can uncover valuable insights into captive populations and contributes important resources to guide future management decisions of an endangered species.
Affiliation(s)
- Emily Humble
- Royal (Dick) School of Veterinary Studies and the Roslin Institute, University of Edinburgh, Edinburgh, UK
- Pavel Dobrynin
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Front Royal, VA, USA
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
- Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, St. Petersburg, Russia
- Helen Senn
- RZSS WildGenes Laboratory, Conservation Department, Royal Zoological Society of Scotland, Edinburgh, UK
- Justin Chuven
- Terrestrial & Marine Biodiversity Sector, Environment Agency, Abu Dhabi, United Arab Emirates
- Alan F Scott
- Genetic Resources Core Facility, McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- David W Mohr
- Genetic Resources Core Facility, McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Olga Dudchenko
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA
- Center for Theoretical and Biological Physics, Rice University, Houston, TX, USA
- Arina D Omer
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA
- Zane Colaric
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA
- Erez Lieberman Aiden
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Department of Computer Science, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA
- Center for Theoretical and Biological Physics, Rice University, Houston, TX, USA
- Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China
- David Wildt
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Front Royal, VA, USA
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
- Shireen Oliaji
- Royal (Dick) School of Veterinary Studies and the Roslin Institute, University of Edinburgh, Edinburgh, UK
- Gaik Tamazian
- Computer Technologies Laboratory, ITMO University, St. Petersburg, Russia
- Budhan Pukazhenthi
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Front Royal, VA, USA
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
- Rob Ogden
- Royal (Dick) School of Veterinary Studies and the Roslin Institute, University of Edinburgh, Edinburgh, UK
- Klaus-Peter Koepfli
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Front Royal, VA, USA
- Smithsonian Conservation Biology Institute, Center for Species Survival, National Zoological Park, Washington, DC, USA
30
Jay F, Boitard S, Austerlitz F. An ABC Method for Whole-Genome Sequence Data: Inferring Paleolithic and Neolithic Human Expansions. Mol Biol Evol 2020; 36:1565-1579. [PMID: 30785202 DOI: 10.1093/molbev/msz038] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/24/2022]
Abstract
Species generally undergo a complex demographic history consisting, in particular, of multiple changes in population size. Genome-wide sequencing data are potentially highly informative for reconstructing this demographic history. A crucial point is to extract the relevant information from these very large data sets. Here, we design an approach for inferring past demographic events from a moderate number of fully sequenced genomes. Our new approach uses Approximate Bayesian Computation, a simulation-based statistical framework that allows 1) identifying the best demographic scenario among several competing scenarios and 2) estimating the best-fitting parameters under the chosen scenario. Approximate Bayesian Computation relies on the computation of summary statistics. Using a cross-validation approach, we show that statistics such as the lengths of haplotypes shared between individuals, or the decay of linkage disequilibrium with distance, can be combined with classical statistics (e.g., heterozygosity and Tajima's D) to accurately infer complex demographic scenarios including bottlenecks and expansion periods. We also demonstrate the importance of simultaneously estimating the genotyping error rate. Applying our method to genome-wide human-sequence databases, we finally show that a model consisting of a bottleneck followed by a Paleolithic and a Neolithic expansion is the most relevant for Eurasian populations.
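Note: the general idea behind Approximate Bayesian Computation can be illustrated with a minimal rejection sampler. The sketch below is a toy, not the authors' method: it infers the mean of a normal distribution rather than a demographic scenario, uses the sample mean as its only summary statistic, and all function names, the uniform prior, and the tolerance are assumptions chosen for illustration.

```python
import random
import statistics


def simulate(mean, n, rng):
    """Toy simulator (assumption): n draws from Normal(mean, 1)."""
    return [rng.gauss(mean, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic used to compare simulated and observed data."""
    return statistics.fmean(data)


def abc_rejection(observed, prior_low, prior_high, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies within `tolerance`
    of the observed summary; the kept draws approximate the posterior."""
    obs_stat = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)  # draw from a flat prior
        sim_stat = summary(simulate(theta, len(observed), rng))
        if abs(sim_stat - obs_stat) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is reproducible
observed = simulate(2.0, 50, rng)  # stand-in for real data, true mean 2.0
posterior = abc_rejection(observed, -5.0, 5.0, 5000, 0.1, rng)
posterior_mean = statistics.fmean(posterior)
print(len(posterior), round(posterior_mean, 2))
```

In a population-genetic setting the simulator would be a coalescent or forward simulation and the summary would be a vector of statistics such as those named in the abstract; comparing acceptance counts across competing simulators is one simple way to rank scenarios.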
Affiliation(s)
- Flora Jay
- Laboratoire EcoAnthropologie et Ethnobiologie, CNRS/MNHN/Université Paris Diderot, Paris, France
- Laboratoire de Recherche en Informatique, CNRS/Université Paris-Sud/Université Paris-Saclay, Orsay, France
- Simon Boitard
- GenPhySE, Université de Toulouse, INRA, INPT, INP-ENVT, Castanet Tolosan, France
- Frédéric Austerlitz
- Laboratoire EcoAnthropologie et Ethnobiologie, CNRS/MNHN/Université Paris Diderot, Paris, France
31
Mather N, Traves SM, Ho SYW. A practical introduction to sequentially Markovian coalescent methods for estimating demographic history from genomic data. Ecol Evol 2020; 10:579-589. [PMID: 31988743 PMCID: PMC6972798 DOI: 10.1002/ece3.5888] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 05/28/2019] [Revised: 10/11/2019] [Accepted: 11/12/2019] [Indexed: 12/31/2022]
Abstract
A common goal of population genomics and molecular ecology is to reconstruct the demographic history of a species of interest. A pair of powerful tools based on the sequentially Markovian coalescent have been developed to infer past population sizes using genome sequences. These methods are most useful when sequences are available for only a limited number of genomes and when the aim is to study ancient demographic events. The results of these analyses can be difficult to interpret accurately, because doing so requires some understanding of their theoretical basis and of their sensitivity to confounding factors. In this practical review, we explain some of the key concepts underpinning the pairwise and multiple sequentially Markovian coalescent methods (PSMC and MSMC, respectively). We relate these concepts to the use and interpretation of these methods, and we explain how the choice of different parameter values by the user can affect the accuracy and precision of the inferences. Based on our survey of 100 PSMC studies and 30 MSMC studies, we describe how the two methods are used in practice. Readers of this article will become familiar with the principles, practice, and interpretation of the sequentially Markovian coalescent for inferring demographic history.
Affiliation(s)
- Niklas Mather
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
- Samuel M. Traves
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
- Simon Y. W. Ho
  - School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
32
Mattingsdal M, Jorde PE, Knutsen H, Jentoft S, Stenseth NC, Sodeland M, Robalo JI, Hansen MM, André C, Blanco Gonzalez E. Demographic history has shaped the strongly differentiated corkwing wrasse populations in Northern Europe. Mol Ecol 2019; 29:160-171. [DOI: 10.1111/mec.15310] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/01/2019] [Revised: 11/06/2019] [Accepted: 11/13/2019] [Indexed: 12/11/2022]
Affiliation(s)
- Morten Mattingsdal
  - Department of Natural Sciences, Centre for Coastal Research, University of Agder, Kristiansand, Norway
- Halvor Knutsen
  - Department of Natural Sciences, Centre for Coastal Research, University of Agder, Kristiansand, Norway
  - Institute of Marine Research, Flødevigen, Norway
- Sissel Jentoft
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Nils Christian Stenseth
  - Department of Natural Sciences, Centre for Coastal Research, University of Agder, Kristiansand, Norway
  - Department of Biosciences, Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
- Marte Sodeland
  - Department of Natural Sciences, Centre for Coastal Research, University of Agder, Kristiansand, Norway
- Joana I. Robalo
  - Marine and Environmental Sciences Centre, ISPA Instituto Universitário de Ciências Psicológicas, Sociais e da Vida, Lisboa, Portugal
- Carl André
  - Department of Marine Sciences‐Tjärnö, Göteborg University, Strömstad, Sweden
- Enrique Blanco Gonzalez
  - Department of Natural Sciences, Centre for Coastal Research, University of Agder, Kristiansand, Norway
  - Norwegian College of Fishery Science, UiT The Arctic University of Norway, Tromsø, Norway
33
Beichman AC, Koepfli KP, Li G, Murphy W, Dobrynin P, Kliver S, Tinker MT, Murray MJ, Johnson J, Lindblad-Toh K, Karlsson EK, Lohmueller KE, Wayne RK. Aquatic Adaptation and Depleted Diversity: A Deep Dive into the Genomes of the Sea Otter and Giant Otter. Mol Biol Evol 2019; 36:2631-2655. [PMID: 31212313 PMCID: PMC7967881 DOI: 10.1093/molbev/msz101] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/22/2022]
Abstract
Despite its recent invasion into the marine realm, the sea otter (Enhydra lutris) has evolved a suite of adaptations for life in cold coastal waters, including limb modifications and dense insulating fur. This uniquely dense coat led to the near-extinction of sea otters during the 18th-20th century fur trade and an extreme population bottleneck. We used the de novo genome of the southern sea otter (E. l. nereis) to reconstruct its evolutionary history, identify genes influencing aquatic adaptation, and detect signals of population bottlenecks. We compared the genome of the southern sea otter with the tropical freshwater-living giant otter (Pteronura brasiliensis) to assess common and divergent genomic trends between otter species, and with the closely related northern sea otter (E. l. kenyoni) to uncover population-level trends. We found signals of positive selection in genes related to aquatic adaptations, particularly limb development and polygenic selection on genes related to hair follicle development. We found extensive pseudogenization of olfactory receptor genes in both the sea otter and giant otter lineages, consistent with patterns of sensory gene loss in other aquatic mammals. At the population level, the southern sea otter and the northern sea otter showed extremely low genomic diversity, signals of recent inbreeding, and demographic histories marked by population declines. These declines may predate the fur trade and appear to have resulted in an increase in putatively deleterious variants that could impact the future recovery of the sea otter.
Affiliation(s)
- Annabel C Beichman
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
- Klaus-Peter Koepfli
  - Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC
  - Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
- Gang Li
  - College of Life Science, Shaanxi Normal University, Xi’an, Shaanxi, China
- William Murphy
  - Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX
- Pasha Dobrynin
  - Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Washington, DC
  - Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
- Sergei Kliver
  - Institute of Molecular and Cellular Biology, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
- Martin T Tinker
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, CA
- Jeremy Johnson
  - Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
- Kerstin Lindblad-Toh
  - Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Elinor K Karlsson
  - Vertebrate Genome Biology, Broad Institute of MIT and Harvard, Cambridge, MA
  - Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
  - Interdepartmental Program in Bioinformatics, University of California, Los Angeles, CA
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
- Robert K Wayne
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA
34
Stam R, Silva-Arias GA, Tellier A. Subsets of NLR genes show differential signatures of adaptation during colonization of new habitats. The New Phytologist 2019; 224:367-379. [PMID: 31230368 DOI: 10.1111/nph.16017] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 05/03/2019] [Accepted: 06/14/2019] [Indexed: 06/09/2023]
Abstract
Nucleotide binding site, leucine-rich repeat receptors (NLRs) are canonical resistance (R) genes in plants, fungi and animals, functioning as central (helper) and peripheral (sensor) genes in a signalling network. We investigate NLR evolution during the colonization of novel habitats in a model tomato species, Solanum chilense. We used R-gene enrichment sequencing to obtain polymorphism data at NLRs of 140 plants sampled across 14 populations covering the whole species range. We inferred the past demographic history of habitat colonization by resequencing whole genomes from three S. chilense plants from three key populations and performing approximate Bayesian computation using data from the 14 populations. Using these parameters, we simulated the genetic differentiation statistics distribution expected under neutral NLR evolution and identified small subsets of outlier NLRs exhibiting signatures of selection across populations. NLRs under selection between habitats are more often helper genes, whereas those showing signatures of adaptation in single populations are more often sensor-NLRs. Thus, centrality in the NLR network does not constrain NLR evolvability, and new mutations in central genes in the network are key for R-gene adaptation during colonization of different habitats.
Affiliation(s)
- Remco Stam
  - Phytopathology, Technical University Munich, 85354, Freising, Germany
  - Population Genetics, Technical University Munich, 85354, Freising, Germany
- Aurelien Tellier
  - Population Genetics, Technical University Munich, 85354, Freising, Germany
35
Warmuth VM, Ellegren H. Genotype‐free estimation of allele frequencies reduces bias and improves demographic inference from RADSeq data. Mol Ecol Resour 2019; 19:586-596. [DOI: 10.1111/1755-0998.12990] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 06/29/2018] [Revised: 12/19/2018] [Accepted: 12/20/2018] [Indexed: 02/07/2023]
Affiliation(s)
- Vera M. Warmuth
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig‐Maximilians‐Universität München, Martinsried, Germany
- Hans Ellegren
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
36
Abstract
Identifying genomic locations of natural selection from sequence data is an ongoing challenge in population genetics. Current methods utilizing information combined from several summary statistics typically assume no correlation of summary statistics regardless of the genomic location from which they are calculated. However, due to linkage disequilibrium, summary statistics calculated at nearby genomic positions are highly correlated. We introduce an approach termed Trendsetter that accounts for the similarity of statistics calculated from adjacent genomic regions through trend filtering, while reducing the effects of multicollinearity through regularization. Our penalized regression framework has high power to detect sweeps, is capable of classifying sweep regions as either hard or soft, and can be applied to other selection scenarios as well. We find that Trendsetter is robust to both extensive missing data and strong background selection, and has comparable power to similar current approaches. Moreover, the model learned by Trendsetter can be viewed as a set of curves modeling the spatial distribution of summary statistics in the genome. Application to human genomic data revealed positively selected regions previously discovered such as LCT in Europeans and EDAR in East Asians. We also identified a number of novel candidates and show that populations with greater relatedness share more sweep signals.
Affiliation(s)
- Mehreen R Mughal
  - Bioinformatics and Genomics at the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA
- Michael DeGiorgio
  - Departments of Biology and Statistics, Pennsylvania State University, University Park, PA
  - Institute for CyberScience, Pennsylvania State University, University Park, PA
37
Spence JP, Steinrücken M, Terhorst J, Song YS. Inference of population history using coalescent HMMs: review and outlook. Curr Opin Genet Dev 2018; 53:70-76. [PMID: 30056275 PMCID: PMC6296859 DOI: 10.1016/j.gde.2018.07.002] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 05/16/2018] [Revised: 07/08/2018] [Accepted: 07/09/2018] [Indexed: 01/02/2023]
Abstract
Studying how diverse human populations are related is of historical and anthropological interest, in addition to providing a realistic null model for testing for signatures of natural selection or disease associations. Furthermore, understanding the demographic histories of other species is playing an increasingly important role in conservation genetics. A number of statistical methods have been developed to infer population demographic histories using whole-genome sequence data, with recent advances focusing on allowing for more flexible modeling choices, scaling to larger data sets, and increasing statistical power. Here we review coalescent hidden Markov models, a powerful class of population genetic inference methods that can utilize linkage disequilibrium information effectively. We highlight recent advances, give advice for practitioners, point out potential pitfalls, and present possible future research directions.
Affiliation(s)
- Jeffrey P Spence
  - Computational Biology Graduate Group, University of California, Berkeley, United States
- Yun S Song
  - Computer Science Division and Department of Statistics, University of California, Berkeley, United States
  - Chan Zuckerberg Biohub, San Francisco, United States
38
Henn BM, Steele TE, Weaver TD. Clarifying distinct models of modern human origins in Africa. Curr Opin Genet Dev 2018; 53:148-156. [PMID: 30423527 DOI: 10.1016/j.gde.2018.10.003] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 09/25/2018] [Revised: 10/09/2018] [Accepted: 10/15/2018] [Indexed: 11/29/2022]
Abstract
Accumulating genomic, fossil and archaeological data from Africa have led to a renewed interest in models of modern human origins. However, such discussions are often discipline-specific, with limited integration of evidence across the different fields. Further, geneticists typically require explicit specification of parameters to test competing demographic models, but these have been poorly outlined for some scenarios. Here, we describe four possible models for the origins of Homo sapiens in Africa based on published literature from paleoanthropology and human genetics. We briefly outline expectations for data patterns under each model, with a special focus on genetic data. Additionally, we present schematics for each model, doing our best to qualitatively describe demographic histories for which genetic parameters can be specifically attached. Finally, it is our hope that this perspective provides context for discussions of human origins in other manuscripts presented in this special issue.
Affiliation(s)
- Brenna M Henn
  - Department of Anthropology, University of California, Davis, CA, 95616, United States
  - UC Davis Genome Center, University of California, Davis, CA, 95616, United States
- Teresa E Steele
  - Department of Anthropology, University of California, Davis, CA, 95616, United States
- Timothy D Weaver
  - Department of Anthropology, University of California, Davis, CA, 95616, United States
39
Beichman AC, Huerta-Sanchez E, Lohmueller KE. Using Genomic Data to Infer Historic Population Dynamics of Nonmodel Organisms. Annual Review of Ecology, Evolution, and Systematics 2018. [DOI: 10.1146/annurev-ecolsys-110617-062431] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Indexed: 11/09/2022]
Abstract
Genome sequence data are now being routinely obtained from many nonmodel organisms. These data contain a wealth of information about the demographic history of the populations from which they originate. Many sophisticated statistical inference procedures have been developed to infer the demographic history of populations from this type of genomic data. In this review, we discuss the different statistical methods available for inference of demography, providing an overview of the underlying theory and logic behind each approach. We also discuss the types of data required and the pros and cons of each method. We then discuss how these methods have been applied to a variety of nonmodel organisms. We conclude by presenting some recommendations for researchers looking to use genomic data to infer demographic history.
Affiliation(s)
- Annabel C. Beichman
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095, USA
- Emilia Huerta-Sanchez
  - Department of Molecular and Cell Biology, University of California, Merced, California 95343, USA
  - Current affiliation: Department of Ecology and Evolutionary Biology, Brown University, Providence, Rhode Island 02912, USA
- Kirk E. Lohmueller
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095, USA
  - Interdepartmental Program in Bioinformatics and Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
40
Luikart G, Kardos M, Hand BK, Rajora OP, Aitken SN, Hohenlohe PA. Population Genomics: Advancing Understanding of Nature. Population Genomics 2018. [DOI: 10.1007/13836_2018_60] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 02/07/2023]