1
Modica A, Lalagüe H, Muratorio S, Scotti I. Rolling down that mountain: microgeographical adaptive divergence during a fast population expansion along a steep environmental gradient in European beech. Heredity (Edinb) 2024; 133:99-112. [PMID: 38890557 PMCID: PMC11286953 DOI: 10.1038/s41437-024-00696-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 05/23/2024] [Accepted: 05/23/2024] [Indexed: 06/20/2024]
Abstract
Forest tree populations harbour high genetic diversity thanks to large effective population sizes and strong gene flow, allowing them to diversify through adaptation to local environmental pressures within dispersal distance. Many tree populations also experienced historical demographic fluctuations, including spatial population contractions or expansions at various temporal scales, which may constrain their ability to adapt to environmental variation. Our aim is to investigate how recent contraction and expansion events interfere with local adaptation, by studying patterns of adaptive divergence between closely related stands that experience contrasting environmental conditions and that have or have not recently expanded. To investigate genome-wide signatures of local adaptation while accounting for demography, we analysed divergence in a European beech population by testing pairwise differentiation among four tree stands at ~35k single nucleotide polymorphisms (SNPs) from ~9k genomic regions. We applied three divergence outlier search methods resting on different assumptions and targeting either single SNPs or contiguous genomic regions, while accounting for the effect of population size variations on genetic divergence. We found 27 signatures of selection in 19 target regions. Putatively adaptive divergence involved all stand pairs. We retrieved signals both when comparing old-growth stands and recently colonised areas and when comparing stands within the old-growth area. Therefore, adaptive divergence processes have taken place both over short time spans, under strong environmental contrasts, and over short ecological gradients, in populations that have been stable in the long term. This suggests that standing genetic variation supports local, microgeographic divergence processes, which can maintain genetic diversity at the landscape level.
Affiliation(s)
- Andrea Modica
- INRAE, URFM, 228, Route de l'Aérodrome, 84914, Avignon, France
- Hadrien Lalagüe
- INRAE, EcoFoG, Campus agronomique, 97310, Kourou, French Guiana
- Sylvie Muratorio
- INRAE, EcoBioP, 173, Route de Saint-Jean-de-Luz RD 918, 64310, Saint-Pée-sur-Nivelle, France
- Ivan Scotti
- INRAE, URFM, 228, Route de l'Aérodrome, 84914, Avignon, France
2
Li Z, Li C, Zhang R, Duan M, Tian H, Yi H, Xu L, Wang F, Shi Z, Wang X, Wang J, Su A, Wang S, Sun X, Zhao Y, Wang S, Zhang Y, Wang Y, Song W, Zhao J. Genomic analysis of a new heterotic maize group reveals key loci for pedigree breeding. Front Plant Sci 2023; 14:1213675. [PMID: 37636101 PMCID: PMC10451083 DOI: 10.3389/fpls.2023.1213675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 07/21/2023] [Indexed: 08/29/2023]
Abstract
Genome-wide analyses of maize populations have clarified the genetic basis of crop domestication and improvement. However, limited information is available on how breeding improvement reshaped the genome in the process of the formation of heterotic groups. In this study, we identified a new heterotic group (X group) based on an examination of 512 Chinese maize inbred lines. The X group was clearly distinct from the other non-H&L groups, implying that X × HIL is a new heterotic pattern. We selected the core inbred lines for an analysis of yield-related traits. Almost all yield-related traits were better in the X lines than in the parental lines, indicating that genetic improvement of the X group during breeding primarily targeted yield-related traits. We generated whole-genome sequences of these lines with an average coverage of 17.35× to explore genome changes further. We analyzed the identity-by-descent (IBD) segments transferred from the two parents to the X lines and identified 29 and 28 IBD conserved regions (ICRs) from the parents PH4CV and PH6WC, respectively, accounting for 28.8% and 12.8% of the genome. We also identified 103, 89, and 131 selective sweeps (SSWs) using methods that involved the π, Tajima's D, and CLR values, respectively. Notably, 96.13% of the ICRs co-localized with SSWs, indicating that SSW signals are concentrated in ICRs. We identified 171 annotated genes associated with yield-related traits in maize that lie in both ICRs and SSWs. To identify the genetic factors associated with yield improvement, we conducted QTL mapping for 240 lines from a DH population (PH4CV × PH6WC, which are the parents of X1132X) for ten key yield-related traits and identified a total of 55 QTLs. Furthermore, we detected three QTL clusters located in both ICRs and SSWs. Based on the genetic evidence, we finally identified three key genes contributing to yield improvement in breeding the X group. These findings reveal key loci and genes targeted during pedigree breeding and provide new insights for future genomic breeding.
Affiliation(s)
- Yuandong Wang
- Beijing Key Laboratory of Maize DNA Fingerprinting and Molecular Breeding, Maize Research Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Wei Song
- Beijing Key Laboratory of Maize DNA Fingerprinting and Molecular Breeding, Maize Research Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Jiuran Zhao
- Beijing Key Laboratory of Maize DNA Fingerprinting and Molecular Breeding, Maize Research Institute, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
3
Barroso GV, Lohmueller KE. Inferring the mode and strength of ongoing selection. Genome Res 2023; 33:632-643. [PMID: 37055196 PMCID: PMC10234300 DOI: 10.1101/gr.276386.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 03/29/2023] [Indexed: 04/15/2023]
Abstract
Genome sequence data are no longer scarce. The UK Biobank alone comprises 200,000 individual genomes, with more on the way, leading the field of human genetics toward sequencing entire populations. Within the next decades, other model organisms will follow suit, especially domesticated species such as crops and livestock. Having sequences from most individuals in a population will present new challenges for using these data to improve health and agriculture in the pursuit of a sustainable future. Existing population genetic methods are designed to model hundreds of randomly sampled sequences but are not optimized for extracting the information contained in the larger and richer data sets that are beginning to emerge, with thousands of closely related individuals. Here we develop a new method called trio-based inference of dominance and selection (TIDES) that uses data from tens of thousands of family trios to make inferences about natural selection acting in a single generation. TIDES further improves on the state of the art by making no assumptions regarding demography, linkage, or dominance. We discuss how our method paves the way for studying natural selection from new angles.
Affiliation(s)
- Gustavo V Barroso
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095-1606, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
- Kirk E Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095-1606, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
4
Roux C, Vekemans X, Pannell J. Inferring the Demographic History and Inheritance Mode of Tetraploid Species Using ABC. Methods Mol Biol 2023; 2545:325-348. [PMID: 36720821 DOI: 10.1007/978-1-0716-2561-3_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
Genomic patterns of diversity and divergence are impacted by certain life history traits, reproductive systems, and demographic history. The latter is characterized by fluctuations in population sizes over time, as well as by temporal patterns of introgression. For a given organism, identifying a demographic history that deviates from the standard neutral model allows a better understanding of its evolution but also helps to reduce the risk of false positives when screening for molecular targets of natural selection. Tetraploid organisms and beyond have demographic histories that are complicated by the mode of polyploidization, the mode of inheritance, and different scenarios of gene flow between sub-genomes and diploid parental species. Here we provide guidelines for experimenters wishing to address these issues through a flexible statistical framework: approximate Bayesian computation (ABC). The emphasis is on the general philosophy of the approach to encourage future users to exploit the enormous flexibility of ABC beyond the limitations imposed by generalist data analysis pipelines.
Affiliation(s)
- Camille Roux
- Univ. Lille, CNRS, UMR 8198 - Evo-Eco-Paleo, Lille, France
- John Pannell
- Department of Ecology and Evolution, Biophore Building, University of Lausanne, Lausanne, Switzerland
5
Imwattana K, Aguero B, Duffy A, Shaw AJ. Demographic history and gene flow in the peatmosses Sphagnum recurvum and Sphagnum flexuosum (Bryophyta: Sphagnaceae). Ecol Evol 2022; 12:e9489. [PMID: 36407896 PMCID: PMC9667404 DOI: 10.1002/ece3.9489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 09/28/2022] [Accepted: 10/19/2022] [Indexed: 11/18/2022]
Abstract
Population size changes and gene flow are processes that can have significant impacts on evolution. The aim of this study was to investigate the relationship of geography to patterns of gene flow and population size changes in a pair of closely related Sphagnum (peatmoss) species: S. recurvum and S. flexuosum. Both species occur in eastern North America, and S. flexuosum also occurs in Europe. Genetic data from restriction-site-associated DNA sequencing (RAD-seq) were used in this study. Analyses of gene flow were accomplished using coalescent simulations of site frequency spectra (SFSs). Signatures of gene flow were confirmed by f4 statistics. For S. flexuosum, genetic diversity of plants in glaciated areas appeared to be lower than that in unglaciated areas, suggesting that glaciation can have an impact on effective population sizes. There is asymmetric gene flow from eastern North America to Europe, suggesting that Europe might have been colonized by plants from eastern North America after the last glacial maximum. The rate of gene flow between S. flexuosum and S. recurvum is lower than that between geographically disjunct S. flexuosum populations. The rate of gene flow between species is higher among sympatric plants of the two species than between currently allopatric S. flexuosum populations. There was also gene flow from S. recurvum to the ancestor S. flexuosum on both continents which occurred through secondary contact. These results illustrate a complex history of interspecific gene flow between S. flexuosum and S. recurvum, which occurred in at least two phases: between ancestral populations after secondary contact and between currently sympatric plants.
Affiliation(s)
- Karn Imwattana
- Department of Biology & L. E. Anderson Bryophyte Herbarium, Duke University, Durham, North Carolina, USA
- Blanka Aguero
- Department of Biology & L. E. Anderson Bryophyte Herbarium, Duke University, Durham, North Carolina, USA
- Aaron Duffy
- Department of Biology & L. E. Anderson Bryophyte Herbarium, Duke University, Durham, North Carolina, USA
- A. Jonathan Shaw
- Department of Biology & L. E. Anderson Bryophyte Herbarium, Duke University, Durham, North Carolina, USA
6
Ebert P, Audano PA, Zhu Q, Rodriguez-Martin B, Porubsky D, Bonder MJ, Sulovari A, Ebler J, Zhou W, Serra Mari R, Yilmaz F, Zhao X, Hsieh P, Lee J, Kumar S, Lin J, Rausch T, Chen Y, Ren J, Santamarina M, Höps W, Ashraf H, Chuang NT, Yang X, Munson KM, Lewis AP, Fairley S, Tallon LJ, Clarke WE, Basile AO, Byrska-Bishop M, Corvelo A, Evani US, Lu TY, Chaisson MJP, Chen J, Li C, Brand H, Wenger AM, Ghareghani M, Harvey WT, Raeder B, Hasenfeld P, Regier AA, Abel HJ, Hall IM, Flicek P, Stegle O, Gerstein MB, Tubio JMC, Mu Z, Li YI, Shi X, Hastie AR, Ye K, Chong Z, Sanders AD, Zody MC, Talkowski ME, Mills RE, Devine SE, Lee C, Korbel JO, Marschall T, Eichler EE. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 2021; 372:eabf7117. [PMID: 33632895 PMCID: PMC8026704 DOI: 10.1126/science.abf7117] [Citation(s) in RCA: 336] [Impact Index Per Article: 112.0] [Received: 11/13/2020] [Accepted: 02/09/2021] [Indexed: 12/14/2022]
Abstract
Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
Affiliation(s)
- Peter Ebert
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany
- Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Qihui Zhu
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Bernardo Rodriguez-Martin
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Marc Jan Bonder
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Arvis Sulovari
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Jana Ebler
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany
- Weichen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Rebecca Serra Mari
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany
- Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Xuefang Zhao
- Center for Genomic Medicine, Massachusetts General Hospital, Department of Neurology, Harvard Medical School, Boston, MA 02114, USA
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- PingHsun Hsieh
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Joyce Lee
- Bionano Genomics, San Diego, CA 92121, USA
- Sushant Kumar
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 and 437, 266 Whitney Avenue, New Haven, CT 06520, USA
- Jiadong Lin
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Tobias Rausch
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Yu Chen
- Department of Genetics and Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Jingwen Ren
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Martin Santamarina
- Genomes and Disease, Centre for Research in Molecular Medicine and Chronic Diseases (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Department of Zoology, Genetics, and Physical Anthropology, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Wolfram Höps
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Hufsah Ashraf
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany
- Nelson T Chuang
- Institute for Genome Sciences, University of Maryland School of Medicine, 670 W Baltimore Street, Baltimore, MD 21201, USA
- Xiaofei Yang
- School of Computer Science and Technology, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Susan Fairley
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Luke J Tallon
- Institute for Genome Sciences, University of Maryland School of Medicine, 670 W Baltimore Street, Baltimore, MD 21201, USA
- Tsung-Yu Lu
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Mark J P Chaisson
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Junjie Chen
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA
- Chong Li
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA
- Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Department of Neurology, Harvard Medical School, Boston, MA 02114, USA
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Aaron M Wenger
- Pacific Biosciences of California, Menlo Park, CA 94025, USA
- Maryam Ghareghani
- Max Planck Institute for Informatics, Saarland Informatics Campus E1.4, 66123 Saarbrücken, Germany
- Saarbrücken Graduate School of Computer Science, Saarland University, Saarland Informatics Campus E1.3, 66123 Saarbrücken, Germany
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Benjamin Raeder
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Patrick Hasenfeld
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Allison A Regier
- Department of Medicine, Washington University, St. Louis, MO 63108, USA
- Haley J Abel
- Department of Medicine, Washington University, St. Louis, MO 63108, USA
- Ira M Hall
- Department of Genetics, Yale School of Medicine, 333 Cedar Street, New Haven, CT 06510, USA
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Oliver Stegle
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Mark B Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 and 437, 266 Whitney Avenue, New Haven, CT 06520, USA
- Jose M C Tubio
- Genomes and Disease, Centre for Research in Molecular Medicine and Chronic Diseases (CIMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Department of Zoology, Genetics, and Physical Anthropology, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
- Zepeng Mu
- Genetics, Genomics, and Systems Biology, University of Chicago, Chicago, IL 60637, USA
- Yang I Li
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Xinghua Shi
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA
- Kai Ye
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, 710049, China
- Department of Human Genetics, University of Michigan, 1241 E. Catherine Street, Ann Arbor, MI 48109, USA
- Zechen Chong
- Department of Genetics and Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Ashley D Sanders
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Michael E Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Department of Neurology, Harvard Medical School, Boston, MA 02114, USA
- Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Ryan E Mills
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Department of Human Genetics, University of Michigan, 1241 E. Catherine Street, Ann Arbor, MI 48109, USA
- Scott E Devine
- Institute for Genome Sciences, University of Maryland School of Medicine, 670 W Baltimore Street, Baltimore, MD 21201, USA
- Charles Lee
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT 06032, USA
- Precision Medicine Center, The First Affiliated Hospital of Xi'an Jiaotong University, 277 West Yanta Road, Xi'an, 710061, Shaanxi, China
- Department of Graduate Studies-Life Sciences, Ewha Womans University, Ewhayeodae-gil, Seodaemun-gu, Seoul 120-750, South Korea
- Jan O Korbel
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tobias Marschall
- Heinrich Heine University, Medical Faculty, Institute for Medical Biometry and Bioinformatics, Moorenstraße 20, 40225 Düsseldorf, Germany
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
7
Blischak PD, Barker MS, Gutenkunst RN. Inferring the Demographic History of Inbred Species from Genome-Wide SNP Frequency Data. Mol Biol Evol 2021; 37:2124-2136. [PMID: 32068861 DOI: 10.1093/molbev/msaa042] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/20/2019] [Revised: 02/04/2020] [Accepted: 02/13/2020] [Indexed: 01/04/2023]
Abstract
Demographic inference using the site frequency spectrum (SFS) is a common way to understand historical events affecting genetic variation. However, most methods for estimating demography from the SFS assume random mating within populations, precluding these types of analyses in inbred populations. To address this issue, we developed a model for the expected SFS that includes inbreeding by parameterizing individual genotypes using beta-binomial distributions. We then take the convolution of these genotype probabilities to calculate the expected frequency of biallelic variants in the population. Using simulations, we evaluated the model's ability to coestimate demography and inbreeding using one- and two-population models across a range of inbreeding levels. We also applied our method to two empirical examples, American pumas (Puma concolor) and domesticated cabbage (Brassica oleracea var. capitata), inferring models both with and without inbreeding to compare parameter estimates and model fit. Our simulations showed that we are able to accurately coestimate demographic parameters and inbreeding even for highly inbred populations (F = 0.9). In contrast, failing to include inbreeding generally resulted in inaccurate parameter estimates in simulated data and led to poor model fit in our empirical analyses. These results show that inbreeding can have a strong effect on demographic inference, a pattern that was especially noticeable for parameters involving changes in population size. Given the importance of these estimates for informing practices in conservation, agriculture, and elsewhere, our method provides an important advancement for accurately estimating the demographic histories of these species.
Affiliation(s)
- Paul D Blischak
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ; Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ
- Michael S Barker
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ
8
Rapaport F, Boisson B, Gregor A, Béziat V, Boisson-Dupuis S, Bustamante J, Jouanguy E, Puel A, Rosain J, Zhang Q, Zhang SY, Gleeson JG, Quintana-Murci L, Casanova JL, Abel L, Patin E. Negative selection on human genes underlying inborn errors depends on disease outcome and both the mode and mechanism of inheritance. Proc Natl Acad Sci U S A 2021; 118:e2001248118. [PMID: 33408250 PMCID: PMC7826345 DOI: 10.1073/pnas.2001248118] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 12/14/2022]
Abstract
Genetic variants underlying life-threatening diseases, being unlikely to be transmitted to the next generation, are gradually and selectively eliminated from the population through negative selection. We study the determinants of this evolutionary process in human genes underlying monogenic diseases by comparing various negative selection scores and an integrative approach, CoNeS, at 366 loci underlying inborn errors of immunity (IEI). We find that genes underlying autosomal dominant (AD) or X-linked IEI have stronger negative selection scores than those underlying autosomal recessive (AR) IEI, whose scores are not different from those of genes not known to be disease causing. Nevertheless, genes underlying AR IEI that are lethal before reproductive maturity with complete penetrance have stronger negative selection scores than other genes underlying AR IEI. We also show that genes underlying AD IEI by loss of function have stronger negative selection scores than genes underlying AD IEI by gain of function, while genes underlying AD IEI by haploinsufficiency are under stronger negative selection than other genes underlying AD IEI. These results are replicated in 1,140 genes underlying inborn errors of neurodevelopment. Finally, we propose a supervised classifier, SCoNeS, which predicts better than state-of-the-art approaches whether a gene is more likely to underlie an AD or AR disease. The clinical outcomes of monogenic inborn errors, together with their mode and mechanisms of inheritance, determine the levels of negative selection at their corresponding loci. Integrating scores of negative selection may facilitate the prioritization of candidate genes and variants in patients suspected to carry an inborn error.
Affiliation(s)
- Franck Rapaport
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Bertrand Boisson
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
- Anne Gregor
- Institute of Human Genetics, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Vivien Béziat
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
- Stéphanie Boisson-Dupuis
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
- Jacinta Bustamante
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
- Center for the Study of Primary Immunodeficiencies, Necker Hospital for Sick Children, Assistance Publique-Hôpitaux de Paris, 75015 Paris, France
| | - Emmanuelle Jouanguy
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
| | - Anne Puel
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
| | - Jérémie Rosain
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
- Center for the Study of Primary Immunodeficiencies, Necker Hospital for Sick Children, Assistance Publique-Hôpitaux de Paris, 75015 Paris, France
| | - Qian Zhang
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
| | - Shen-Ying Zhang
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
| | - Joseph G Gleeson
- Howard Hughes Medical Institute, La Jolla, CA 92093
- Rady Children's Institute of Genomic Medicine, Department of Neurosciences, University of California San Diego, La Jolla, CA 92093
- Laboratory for Pediatric Brain Disease, The Rockefeller University, New York, NY 10065
| | - Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, 75015 Paris, France
- Chair of Human Genomics and Evolution, Collège de France, 75231 Paris, France
| | - Jean-Laurent Casanova
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
- Howard Hughes Medical Institute, New York, NY 10065
| | - Laurent Abel
- St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065
- Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR 1163, Necker Hospital for Sick Children, 75015 Paris, France
- University of Paris, Imagine Institute, 75015 Paris, France
| | - Etienne Patin
- Human Evolutionary Genetics Unit, Institut Pasteur, UMR 2000, CNRS, 75015 Paris, France
| |
|
9
|
Brousseau L, Fine PVA, Dreyer E, Vendramin GG, Scotti I. Genomic and phenotypic divergence unveil microgeographic adaptation in the Amazonian hyperdominant tree Eperua falcata Aubl. (Fabaceae). Mol Ecol 2020; 30:1136-1154. [PMID: 32786115 DOI: 10.1111/mec.15595] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/16/2018] [Revised: 06/19/2020] [Accepted: 07/31/2020] [Indexed: 01/04/2023]
Abstract
Plant populations can undergo very localized adaptation, allowing widely distributed populations to adapt to divergent habitats in spite of recurrent gene flow. Neotropical trees, whose large and undisturbed populations often span a variety of environmental conditions and local habitats, are particularly good models to study this process. Here, we explore patterns of adaptive divergence from large (i.e., regional) to small (i.e., microgeographic) spatial scales in the hyperdominant Amazonian tree Eperua falcata Aubl. (Fabaceae) under a replicated design involving two microhabitats (~300 m apart) in two study sites (~300 km apart). A three-year reciprocal transplant illustrates that, beyond strong maternal effects and phenotypic plasticity, genetically driven divergence in seedling growth and leaf traits was detected both between seedlings originating from different regions, and between seedlings from different microhabitats. In parallel, a complementary genome scan for selection was carried out through whole-genome sequencing of tree population pools. A set of 290 divergence outlier SNPs was detected at the regional scale (between study sites), while 185 SNPs located in the vicinity of 106 protein-coding genes were detected as replicated outliers between microhabitats within regions. Outlier-surrounding genomic regions are involved in a variety of physiological processes, including plant responses to stress (e.g., oxidative stress, hypoxia and metal toxicity) and biotic interactions. Together with evidence of microgeographic divergence in functional traits, the discovery of genomic candidates for microgeographic adaptive divergence represents a promising advance in our understanding of local adaptation, which probably operates across multiple spatial scales and underpins divergence and diversification in Neotropical trees.
Affiliation(s)
- Louise Brousseau
- UMR EcoFoG, AgroParisTech, CIRAD, CNRS, INRAE, Université de Guyane, Université des Antilles, Kourou Cedex, France
- AMAP, Univ. Montpellier, CIRAD, CNRS, INRAE, IRD, Montpellier, France
| | - Paul V A Fine
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Erwin Dreyer
- Université de Lorraine, AgroParisTech, INRAE, Silva, Nancy, France
| | - Giovanni G Vendramin
- Institute of Biosciences and BioResources (IBBR-CNR), National Research Council, Division of Florence, Sesto Fiorentino, Italy
| | - Ivan Scotti
- UR629 Ecologie des Forêts Méditerranéennes (URFM), INRAE, Avignon, France
| |
|
10
|
Abstract
We present selected topics of population genetics and molecular phylogeny. As several excellent review articles have been published and generally focus on European and American scientists, here, we emphasize contributions by Japanese researchers. Our review may also be seen as a belated 50-year celebration of Motoo Kimura's early seminal paper on the molecular clock, published in 1968.
|
11
|
Kioukis A, Michalopoulou VA, Briers L, Pirintsos S, Studholme DJ, Pavlidis P, Sarris PF. Intraspecific diversification of the crop wild relative Brassica cretica Lam. using demographic model selection. BMC Genomics 2020; 21:48. [PMID: 31937246 PMCID: PMC6961386 DOI: 10.1186/s12864-019-6439-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/17/2019] [Accepted: 12/29/2019] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND: Crop wild relatives (CWRs) contain genetic diversity, representing an invaluable resource for crop improvement. Many of their traits have the potential to help crops to adapt to changing conditions that they experience due to climate change. An impressive global effort for the conservation of various CWRs will facilitate their use in crop breeding for food security. The genus Brassica is listed in Annex I of the International Treaty on Plant Genetic Resources for Food and Agriculture. Brassica oleracea (or wild cabbage), a species native to southern and western Europe, has become established as an important human food crop plant because of its large reserves stored over the winter in its leaves. Brassica cretica Lam. (Bc) is a CWR in the brassica group and B. cretica subsp. nivea (Bcn) has been suggested as a separate subspecies. The species Bc has been proposed as a potential gene donor to brassica crops, including broccoli, cabbage, cauliflower, oilseed rape, etc. RESULTS: We sequenced genomes of four Bc individuals, including two Bcn and two Bc. Demographic analysis based on our whole-genome sequence data suggests that populations of Bc are not isolated. Classification of the Bc into distinct subspecies is not supported by the data. Using only the non-coding part of the data (thus, the parts of the genome that have evolved nearly neutrally), we find that gene flow between different Bc populations is recent and that its genomic diversity is high. CONCLUSIONS: Despite predictions on the disruptive effect of gene flow in adaptation, when selection is not strong enough to prevent the loss of locally adapted alleles, studies show that gene flow can promote adaptation, that local adaptations can be maintained despite high gene flow, and that genetic architecture plays a fundamental role in the origin and maintenance of local adaptation with gene flow. Thus, in the genomic era it is important to link the selected demographic models with the underlying processes of genomic variation because, if this variation is largely selectively neutral, we cannot assume that a diverse population of crop wild relatives will necessarily exhibit the wide-ranging adaptive diversity required for further crop improvement.
Affiliation(s)
- Antonios Kioukis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, 70013, Crete, Greece
| | - Vassiliki A Michalopoulou
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, 70013, Crete, Greece
| | - Laura Briers
- Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter, UK
| | - Stergios Pirintsos
- Department of Biology, University of Crete, 714 09, Heraklion, Greece
- Botanical Garden, University of Crete, Gallos Campus, 741 00, Rethymnon, Greece
| | - David J Studholme
- Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter, UK
| | - Pavlos Pavlidis
- Institute of Computer Science, Foundation for Research and Technology-Hellas, Heraklion, 70013, Crete, Greece
| | - Panagiotis F Sarris
- Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology-Hellas, Heraklion, 70013, Crete, Greece
- Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter, UK
- Department of Biology, University of Crete, 714 09, Heraklion, Greece
| |
|
12
|
Perdomo-Sabogal Á, Nowick K. Genetic Variation in Human Gene Regulatory Factors Uncovers Regulatory Roles in Local Adaptation and Disease. Genome Biol Evol 2019; 11:2178-2193. [PMID: 31228201 PMCID: PMC6685493 DOI: 10.1093/gbe/evz131] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Accepted: 06/18/2019] [Indexed: 01/13/2023] Open
Abstract
Differences in gene regulation have been suggested to play essential roles in the evolution of phenotypic changes. Although DNA changes in cis-regulatory elements affect only the regulation of their corresponding gene, variations in gene regulatory factors (trans) can have a broader effect, because the expression of many target genes might be affected. Aiming to better understand how natural selection may have shaped the diversity of gene regulatory factors in humans, we assembled a catalog of all proteins involved in controlling gene expression. We found that at least five DNA-binding transcription factor classes are enriched among genes located in candidate regions for selection, suggesting that they might be relevant for understanding regulatory mechanisms involved in human local adaptation. The class of KRAB-ZNFs, zinc-finger (ZNF) genes with a Krüppel-associated box, stands out in three ways: first, it has the most genes located in candidate regions for positive selection; second, it displays the most nonsynonymous single nucleotide polymorphisms (SNPs) with high genetic differentiation between populations within these regions; third, it has 27 KRAB-ZNF gene clusters with high extended haplotype homozygosity. Our further characterization of nonsynonymous SNPs in ZNF genes located within candidate regions for selection suggests regulatory modifications that might influence the expression of target genes at population level. Our detailed investigation of three candidate regions revealed possible explanations for how SNPs may influence the prevalence of schizophrenia, eye development, and fertility in humans, among other phenotypes. The genetic variation we characterized here may be responsible for subtle to rough regulatory changes that could be important for understanding human adaptation.
Affiliation(s)
- Álvaro Perdomo-Sabogal
- Human Biology Group, Department of Biology, Chemistry and Pharmacy, Institute for Zoology, Freie Universität Berlin, Germany
| | - Katja Nowick
- Human Biology Group, Department of Biology, Chemistry and Pharmacy, Institute for Zoology, Freie Universität Berlin, Germany
| |
|
13
|
Preite V, Sailer C, Syllwasschy L, Bray S, Ahmadi H, Krämer U, Yant L. Convergent evolution in Arabidopsis halleri and Arabidopsis arenosa on calamine metalliferous soils. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180243. [PMID: 31154972 PMCID: PMC6560266 DOI: 10.1098/rstb.2018.0243] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Accepted: 02/25/2019] [Indexed: 01/09/2023] Open
Abstract
It is a plausible hypothesis that parallel adaptation events to the same environmental challenge should result in genetic changes of similar or identical effects, depending on the underlying fitness landscapes. However, systematic testing of this is scarce. Here we examine this hypothesis in two closely related plant species, Arabidopsis halleri and Arabidopsis arenosa, which co-occur at two calamine metalliferous (M) sites harbouring toxic levels of the heavy metals zinc and cadmium. We conduct individual genome resequencing alongside soil elemental analysis for 64 plants from eight populations on M and non-metalliferous (NM) soils, and identify genomic footprints of selection and local adaptation. Selective sweep and environmental association analyses indicate a modest degree of gene as well as functional network convergence, whereby the proximal molecular factors mediating this convergence mostly differ between site pairs and species. Notably, we observe repeated selection on identical single nucleotide polymorphisms in several A. halleri genes at two independently colonized M sites. Our data suggest that species-specific metal handling and other biological features could explain a low degree of convergence between species. The parallel establishment of plant populations on calamine M soils involves convergent evolution, which will probably be more pervasive across sites purposely chosen for maximal similarity in soil composition. This article is part of the theme issue 'Convergent evolution in the genomics era: new insights and directions'.
Affiliation(s)
- Veronica Preite
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr University Bochum, 44801 Bochum, Germany
| | - Christian Sailer
- Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
| | - Lara Syllwasschy
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr University Bochum, 44801 Bochum, Germany
| | - Sian Bray
- Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
| | - Hassan Ahmadi
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr University Bochum, 44801 Bochum, Germany
| | - Ute Krämer
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr University Bochum, 44801 Bochum, Germany
| | - Levi Yant
- Cell and Developmental Biology, John Innes Centre, Norwich NR4 7UH, UK
- School of Life Sciences, University of Nottingham, Nottingham NG7 2RD, UK
| |
|
14
|
Preite V, Sailer C, Syllwasschy L, Bray S, Ahmadi H, Krämer U, Yant L. Convergent evolution in Arabidopsis halleri and Arabidopsis arenosa on calamine metalliferous soils. Philos Trans R Soc Lond B Biol Sci 2019; 374:20180243. [PMID: 31154972 DOI: 10.5061/dryad.jg30j4v] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/28/2023] Open
|
15
|
Preite V, Sailer C, Syllwasschy L, Bray S, Ahmadi H, Krämer U, Yant L. Convergent evolution in Arabidopsis halleri and Arabidopsis arenosa on calamine metalliferous soils. Philos Trans R Soc Lond B Biol Sci 2019. [PMID: 31154972 DOI: 10.1101/459362] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/16/2023] Open
|
16
|
Neininger K, Marschall T, Helms V. SNP and indel frequencies at transcription start sites and at canonical and alternative translation initiation sites in the human genome. PLoS One 2019; 14:e0214816. [PMID: 30978217 PMCID: PMC6461226 DOI: 10.1371/journal.pone.0214816] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/26/2018] [Accepted: 03/20/2019] [Indexed: 11/30/2022] Open
Abstract
Single-nucleotide polymorphisms (SNPs) are the most common form of genetic variation in humans and drive phenotypic variation. Due to evolutionary conservation, SNPs and indels (insertion and deletions) are depleted in functionally important sequence elements. Recently, population-scale sequencing efforts such as the 1000 Genomes Project and the Genome of the Netherlands Project have catalogued large numbers of sequence variants. Here, we present a systematic analysis of the polymorphisms reported by these two projects in different coding and non-coding genomic elements of the human genome (intergenic regions, CpG islands, promoters, 5’ UTRs, coding exons, 3’ UTRs, introns, and intragenic regions). Furthermore, we were especially interested in the distribution of SNPs and indels in direct vicinity to the transcription start site (TSS) and translation start site (CSS). Thereby, we discovered an enrichment of dinucleotides CpG and CpA and an accumulation of SNPs at base position −1 relative to the TSS that involved primarily CpG and CpA dinucleotides. Genes having a CpG dinucleotide at TSS position -1 were enriched in the functional GO terms “Phosphoprotein”, “Alternative splicing”, and “Protein binding”. Focusing on the CSS, we compared SNP patterns in the flanking regions of canonical and alternative AUG and near-cognate start sites where we considered alternative starts previously identified by experimental ribosome profiling. We observed similar conservation patterns of canonical and alternative translation start sites, which underlines the importance of alternative translation mechanisms for cellular function.
Affiliation(s)
- Kerstin Neininger
- Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
- Graduate School of Computer Science, Saarland University, 66123 Saarbrücken, Germany
| | - Tobias Marschall
- Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
- Max Planck Institute for Informatics, 66123 Saarbrücken, Germany
| | - Volkhard Helms
- Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| |
|
17
|
Hoelzel AR, Bruford MW, Fleischer RC. Conservation of adaptive potential and functional diversity. Conserv Genet 2019. [DOI: 10.1007/s10592-019-01151-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 02/06/2023]
|
18
|
Bitarello BD, de Filippo C, Teixeira JC, Schmidt JM, Kleinert P, Meyer D, Andrés AM. Signatures of Long-Term Balancing Selection in Human Genomes. Genome Biol Evol 2018; 10:939-955. [PMID: 29608730 PMCID: PMC5952967 DOI: 10.1093/gbe/evy054] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Accepted: 03/14/2018] [Indexed: 12/15/2022] Open
Abstract
Balancing selection maintains advantageous diversity in populations through various mechanisms. Although extensively explored from a theoretical perspective, an empirical understanding of its prevalence and targets lags behind our knowledge of positive selection. Here, we describe the Non-central Deviation (NCD), a simple yet powerful statistic to detect long-term balancing selection (LTBS) that quantifies how close frequencies are to expectations under LTBS, and provides the basis for a neutrality test. NCD can be applied to a single locus or genomic data, and can be implemented considering only polymorphisms (NCD1) or also considering fixed differences with respect to an outgroup (NCD2) species. Incorporating fixed differences improves power, and NCD2 has higher power to detect LTBS in humans under different frequencies of the balanced allele(s) than other available methods. Applied to genome-wide data from African and European human populations, in both cases using chimpanzee as an outgroup, NCD2 shows that, albeit not prevalent, LTBS affects a sizable portion of the genome: ∼0.6% of analyzed genomic windows and 0.8% of analyzed positions. Significant windows (P < 0.0001) contain 1.6% of SNPs in the genome, which disproportionally fall within exons and change protein sequence, but are not enriched in putatively regulatory sites. These windows overlap ∼8% of the protein-coding genes, and these have larger number of transcripts than expected by chance even after controlling for gene length. Our catalog includes known targets of LTBS but a majority of them (90%) are novel. As expected, immune-related genes are among those with the strongest signatures, although most candidates are involved in other biological functions, suggesting that LTBS potentially influences diverse human phenotypes.
Affiliation(s)
- Bárbara D Bitarello
- Department of Genetics and Evolutionary Biology, University of São Paulo, São Paulo, Brazil
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Cesare de Filippo
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - João C Teixeira
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Unit of Human Evolutionary Genetics, Institut Pasteur, Paris, France
| | - Joshua M Schmidt
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Philip Kleinert
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Computational Molecular Biology Department, Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Diogo Meyer
- Department of Genetics and Evolutionary Biology, University of São Paulo, São Paulo, Brazil
| | - Aida M Andrés
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London, United Kingdom
| |
|
19
|
Nigenda‐Morales SF, Hu Y, Beasley JC, Ruiz‐Piña HA, Valenzuela‐Galván D, Wayne RK. Transcriptomic analysis of skin pigmentation variation in the Virginia opossum (Didelphis virginiana). Mol Ecol 2018; 27:2680-2697. [DOI: 10.1111/mec.14712] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/27/2017] [Revised: 04/05/2018] [Accepted: 04/17/2018] [Indexed: 12/19/2022]
Affiliation(s)
- Sergio F. Nigenda‐Morales
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California
| | - Yibo Hu
- Key Lab of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Chaoyang, Beijing, China
| | - James C. Beasley
- Savannah River Ecology Lab, Warnell School of Forestry and Natural Resources, University of Georgia, Aiken, South Carolina
| | - Hugo A. Ruiz‐Piña
- Centro de Investigaciones Regionales “Dr. Hideyo Noguchi”, Universidad Autónoma de Yucatán, Mérida, Yucatán, Mexico
| | - David Valenzuela‐Galván
- Departamento de Ecología Evolutiva, Centro de Investigación en Biodiversidad y Conservación, Universidad Autónoma del Estado de Morelos, Cuernavaca, Morelos, Mexico
| | - Robert K. Wayne
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California
| |
|
20
|
Ruffley M, Smith ML, Espíndola A, Carstens BC, Sullivan J, Tank DC. Combining allele frequency and tree-based approaches improves phylogeographic inference from natural history collections. Mol Ecol 2018; 27:1012-1024. [PMID: 29334417 PMCID: PMC5878120 DOI: 10.1111/mec.14491] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/05/2017] [Revised: 12/07/2017] [Accepted: 12/08/2017] [Indexed: 01/25/2023]
Abstract
Model selection approaches in phylogeography have allowed researchers to evaluate the support for competing demographic histories, which provides a mode of inference and a measure of uncertainty in understanding climatic and spatial influences on intraspecific diversity. Here, to rank all models in the comparison set and determine what proportion of the total support the top-ranked model garners, we conduct model selection using two analytical approaches: allele frequency-based, implemented in fastsimcoal2, and gene tree-based, implemented in phrapl. We then expand this model selection framework by including an assessment of absolute fit of the models to the data. For this, we utilize DNA isolated from existing natural history collections that span the distribution of red alder (Alnus rubra) in the Pacific Northwest of North America to generate genomic data for the evaluation of 13 demographic scenarios. The quality of DNA recovered from herbarium specimen leaf tissue was assessed for its utility and effectiveness in demographic model selection, specifically in the two approaches mentioned. We present strong support for the use of herbarium tissue in the generation of genomic DNA, albeit with the inclusion of additional quality control checks prior to library preparation and analyses with multiple approaches that incorporate various data. Analyses with allele frequency spectra and gene trees predominantly support A. rubra having experienced an ancient vicariance event with intermittent and frequent gene flow between the disjunct populations. Additionally, the data consistently fit the most frequently selected model, corroborating the model selection techniques. Finally, these results suggest that the A. rubra disjunct populations do not represent separate species.
Affiliation(s)
- Megan Ruffley
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
- Stillinger Herbarium, University of Idaho, Moscow, ID, USA
| | - Megan L Smith
- Department of Evolution, Ecology, & Organismal Biology, The Ohio State University, Columbus, OH, USA
| | - Anahí Espíndola
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
| | - Bryan C Carstens
- Department of Evolution, Ecology, & Organismal Biology, The Ohio State University, Columbus, OH, USA
| | - Jack Sullivan
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
| | - David C Tank
- Department of Biological Sciences, University of Idaho, Moscow, ID, USA
- Institute for Bioinformatics and Evolutionary Studies (IBEST), Biological Sciences, Moscow, ID, USA
- Stillinger Herbarium, University of Idaho, Moscow, ID, USA
|
21
|
Viscardi LH, Paixão-Côrtes VR, Comas D, Salzano FM, Rovaris D, Bau CD, Amorim CEG, Bortolini MC. Searching for ancient balanced polymorphisms shared between Neanderthals and Modern Humans. Genet Mol Biol 2018; 41:67-81. [PMID: 29658973 PMCID: PMC5901502 DOI: 10.1590/1678-4685-gmb-2017-0308] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 10/02/2017] [Accepted: 11/26/2017] [Indexed: 01/06/2023] Open
Abstract
Hominin evolution is characterized by adaptive solutions often rooted in behavioral and cognitive changes. If balancing selection had an important and long-lasting impact on the evolution of these traits, it can be hypothesized that genes associated with them should carry an excess of shared polymorphisms (trans-SNPs) across recent Homo species. In this study, we investigate the role of balancing selection in human evolution by testing available exomes from modern (Homo sapiens) and archaic humans (H. neanderthalensis and Denisovan) for an excess of trans-SNPs in two gene sets: one associated with the immune system (IMMS) and another with the behavioral system (BEHS). We identified a significant excess of trans-SNPs in IMMS (N=547), six of which are located within genes previously associated with schizophrenia. No excess of trans-SNPs was found in BEHS, but five genes in this system harbor potential signals for balancing selection and are associated with psychiatric or neurodevelopmental disorders. Our approach evidenced recent Homo trans-SNPs that have been previously implicated in psychiatric diseases such as schizophrenia, suggesting that a genetic repertoire common to the immune and behavioral systems could have been maintained by balancing selection starting before the split between archaic and modern humans.
Affiliation(s)
- Lucas Henriques Viscardi
- Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | | | - David Comas
- Institut de Biologia Evolutiva (CSIC-UPF), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
| | - Francisco Mauro Salzano
- Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Diego Rovaris
- Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Claiton Dotto Bau
- Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
| | - Carlos Eduardo G. Amorim
- Department of Biological Sciences, Columbia University, New York, NY, U.S.A
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, U.S.A
| | - Maria Cátira Bortolini
- Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS, Brazil
|
22
|
Comparison of Single Genome and Allele Frequency Data Reveals Discordant Demographic Histories. G3: Genes, Genomes, Genetics 2017; 7:3605-3620. [PMID: 28893846 PMCID: PMC5677151 DOI: 10.1534/g3.117.300259] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Indexed: 02/03/2023]
Abstract
Inference of demographic history from genetic data is a primary goal of population genetics of model and nonmodel organisms. Whole genome-based approaches such as the pairwise/multiple sequentially Markovian coalescent methods use genomic data from one to four individuals to infer the demographic history of an entire population, while site frequency spectrum (SFS)-based methods use the distribution of allele frequencies in a sample to reconstruct the same historical events. Although both methods are extensively used in empirical studies and perform well on data simulated under simple models, there have been only limited comparisons of them in more complex and realistic settings. Here we use published demographic models based on data from three human populations (Yoruba, descendants of northwest-Europeans, and Han Chinese) as an empirical test case to study the behavior of both inference procedures. We find that several of the demographic histories inferred by the whole genome-based methods do not predict the genome-wide distribution of heterozygosity, nor do they predict the empirical SFS. However, using simulated data, we also find that the whole genome methods can reconstruct the complex demographic models inferred by SFS-based methods, suggesting that the discordant patterns of genetic variation are not attributable to a lack of statistical power, but may reflect unmodeled complexities in the underlying demography. More generally, our findings indicate that demographic inference from a small number of genomes, routine in genomic studies of nonmodel organisms, should be interpreted cautiously, as these models cannot recapitulate other summaries of the data.
|
23
|
Xu P, Wu X, Muñoz‐Amatriaín M, Wang B, Wu X, Hu Y, Huynh B, Close TJ, Roberts PA, Zhou W, Lu Z, Li G. Genomic regions, cellular components and gene regulatory basis underlying pod length variations in cowpea (V. unguiculata L. Walp). Plant Biotechnology Journal 2017; 15:547-557. [PMID: 27658053 PMCID: PMC5399003 DOI: 10.1111/pbi.12639] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 06/22/2016] [Revised: 08/30/2016] [Accepted: 09/15/2016] [Indexed: 05/19/2023]
Abstract
Cowpea (V. unguiculata L. Walp) is a climate resilient legume crop important for food security. Cultivated cowpea (V. unguiculata L.) generally comprises the bushy, short-podded grain cowpea dominant in Africa and the climbing, long-podded vegetable cowpea popular in Asia. How selection has contributed to the diversification of the two types of cowpea remains largely unknown. In the current study, a novel genotyping assay for over 50 000 SNPs was employed to delineate genomic regions governing pod length. Major, minor and epistatic QTLs were identified through QTL mapping. Seventy-two SNPs associated with pod length were detected by genome-wide association studies (GWAS). Population stratification analysis revealed subdivision among a cowpea germplasm collection consisting of 299 accessions, which is consistent with pod length groups. Genomic scan for selective signals suggested that domestication of vegetable cowpea was accompanied by selection of multiple traits including pod length, while the further improvement process was characterized primarily by selection on pod length. Pod growth kinetics assay demonstrated that more durable cell proliferation rather than cell elongation or enlargement was the main reason for longer pods. Transcriptomic analysis suggested the involvement of sugar, gibberellin and nutritional signalling in regulation of pod length. This study establishes the basis for map-based cloning of pod length genes in cowpea and for marker-assisted selection of this trait in breeding programmes.
Affiliation(s)
- Pei Xu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- State Key Lab Breeding Base for Sustainable Control of Plant Pest and Disease, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Xinyi Wu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - María Muñoz‐Amatriaín
- Department of Botany and Plant Sciences, University of California‐Riverside, Riverside, CA, USA
| | - Baogen Wang
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Xiaohua Wu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Yaowen Hu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Bao‐Lam Huynh
- Department of Nematology, University of California‐Riverside, Riverside, CA, USA
| | - Timothy J. Close
- Department of Botany and Plant Sciences, University of California‐Riverside, Riverside, CA, USA
| | - Philip A. Roberts
- Department of Nematology, University of California‐Riverside, Riverside, CA, USA
| | - Wen Zhou
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Zhongfu Lu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
| | - Guojing Li
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- State Key Lab Breeding Base for Sustainable Control of Plant Pest and Disease, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
|
24
|
Li LF, Li YL, Jia Y, Caicedo AL, Olsen KM. Signatures of adaptation in the weedy rice genome. Nat Genet 2017; 49:811-814. [PMID: 28369039 DOI: 10.1038/ng.3825] [Citation(s) in RCA: 99] [Impact Index Per Article: 14.1] [Received: 10/08/2016] [Accepted: 03/02/2017] [Indexed: 12/19/2022]
Abstract
Crop domestication provided the calories that fueled the rise of civilization. For many crop species, domestication was accompanied by the evolution of weedy crop relatives, which aggressively outcompete crops and reduce harvests. Understanding the genetic mechanisms that underlie the evolution of weedy crop relatives is critical for agricultural weed management and food security. Here we use whole-genome sequences to examine the origin and adaptation of the two major strains of weedy rice found in the United States. We find that de-domestication from cultivated ancestors has had a major role in their evolution, with relatively few genetic changes required for the emergence of weediness traits. Weed strains likely evolved both early and late in the history of rice cultivation and represent an under-recognized component of the domestication process. Genomic regions identified here that show evidence of selection can be considered candidates for future genetic and functional analyses for rice improvement.
Affiliation(s)
- Lin-Feng Li
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China; Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Ya-Ling Li
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, China
| | - Yulin Jia
- US Department of Agriculture-Agricultural Research Service, Dale Bumpers National Rice Research Center, Stuttgart, Arkansas, USA
| | - Ana L Caicedo
- Department of Biology, University of Massachusetts, Amherst, Massachusetts, USA
| | - Kenneth M Olsen
- Department of Biology, Washington University in St. Louis, St. Louis, Missouri, USA
|
25
|
Park L. Evidence of Recent Intricate Adaptation in Human Populations. PLoS One 2016; 11:e0165870. [PMID: 27992444 PMCID: PMC5167553 DOI: 10.1371/journal.pone.0165870] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/10/2016] [Accepted: 10/19/2016] [Indexed: 11/18/2022] Open
Abstract
Recent human adaptations have shaped population differentiation in genomic regions containing putative functional variants, mostly located in predicted regulatory elements. However, their actual functionalities and the underlying mechanism of recent adaptation remain poorly understood. In the current study, regions of genes and repeats were investigated for functionality depending on the degree of population differentiation, FST or ΔDAF (a difference in derived allele frequency). The high FST in the 5´ or 3´ untranslated regions (UTRs), in particular, confirmed that population differences arose mainly from differences in regulation. Expression quantitative trait loci (eQTL) analyses using lymphoblastoid cell lines indicated that the majority of the highly population-specific regions represented cis- and/or trans-eQTL. However, groups having the highest ΔDAFs did not necessarily have higher proportions of eQTL variants; in these groups, the patterns were complex, indicating recent intricate adaptations. The results indicated that East Asian (EAS) and European populations (EUR) experienced mutual selection pressures. The mean derived allele frequency of the high ΔDAF groups suggested that EAS and EUR underwent strong adaptation; however, the African population in Africa (AFR) experienced slight, yet broad, adaptation. The DAF distributions of variants in the gene regions showed clear selective pressure in each population, which implies the existence of more recent regulatory adaptations in cells other than lymphoblastoid cell lines. In-depth analysis of population-differentiated regions indicated that the coding gene, RNF135, represented a trans-regulation hotspot via cis-regulation by the population-specific variants in the region of selective sweep. Together, the results provide strong evidence of actual intricate adaptation of human populations via regulatory manipulation.
Affiliation(s)
- Leeyoung Park
- Natural Science Research Institute, Yonsei University, Seoul, Korea
|
26
|
Cagan A, Theunert C, Laayouni H, Santpere G, Pybus M, Casals F, Prüfer K, Navarro A, Marques-Bonet T, Bertranpetit J, Andrés AM. Natural Selection in the Great Apes. Mol Biol Evol 2016; 33:3268-3283. [PMID: 27795229 PMCID: PMC5100057 DOI: 10.1093/molbev/msw215] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Indexed: 02/07/2023] Open
Abstract
Natural selection is crucial for the adaptation of populations to their environments. Here, we present the first global study of natural selection in the Hominidae (humans and great apes) based on genome-wide information from population samples representing all extant species (including most subspecies). Combining several neutrality tests we create a multi-species map of signatures of natural selection covering all major types of natural selection. We find that the estimated efficiency of both purifying and positive selection varies between species and is significantly correlated with their long-term effective population size. Thus, even the modest differences in population size among the closely related Hominidae lineages have resulted in differences in their ability to remove deleterious alleles and to adapt to changing environments. Most signatures of balancing and positive selection are species-specific, with signatures of balancing selection more often being shared among species. We also identify loci with evidence of positive selection across several lineages. Notably, we detect signatures of positive selection in several genes related to brain function, anatomy, diet and immune processes. Our results contribute to a better understanding of human evolution by putting the evidence of natural selection in humans within its larger evolutionary context. The global map of natural selection in our closest living relatives is available as an interactive browser at http://tinyurl.com/nf8qmzh.
Affiliation(s)
- Alexander Cagan
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Christoph Theunert
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA
| | - Hafid Laayouni
- Departament de Ciencies Experimentals i de la Salut, Institut de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Departament de Genètica i de Microbiologia, Universitat Autonòma de Barcelona, Bellaterra, Barcelona, Catalonia, Spain
| | - Gabriel Santpere
- Departament de Ciencies Experimentals i de la Salut, Institut de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Department of Neuroscience, Yale University School of Medicine, New Haven, CT
| | - Marc Pybus
- Departament de Ciencies Experimentals i de la Salut, Institut de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
| | - Ferran Casals
- Genomics Core Facility, Departament de Ciencies Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
| | - Kay Prüfer
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Arcadi Navarro
- Departament de Ciencies Experimentals i de la Salut, Institut de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
| | - Tomas Marques-Bonet
- Departament de Ciencies Experimentals i de la Salut, Institut de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
| | - Jaume Bertranpetit
- Departament de Ciencies Experimentals i de la Salut, Institut de Biologia Evolutiva, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Department of Archaeology and Anthropology, Leverhulme Centre for Human Evolutionary Studies, University of Cambridge, Cambridge, United Kingdom
| | - Aida M Andrés
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
|
27
|
Abstract
The wealth of available genetic information is allowing the reconstruction of human demographic and adaptive history. Demography and purifying selection affect the purge of rare, deleterious mutations from the human population, whereas positive and balancing selection can increase the frequency of advantageous variants, improving survival and reproduction in specific environmental conditions. In this review, I discuss how theoretical and empirical population genetics studies, using both modern and ancient DNA data, are a powerful tool for obtaining new insight into the genetic basis of severe disorders and complex disease phenotypes, rare and common, focusing particularly on infectious disease risk.
Affiliation(s)
- Lluis Quintana-Murci
- Human Evolutionary Genetics Unit, Department of Genomes & Genetics, Institut Pasteur, Paris, 75015, France.
- Centre National de la Recherche Scientifique, URA3012, Paris, 75015, France.
- Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, 75015, France.
|
28
|
Freedman AH, Lohmueller KE, Wayne RK. Evolutionary History, Selective Sweeps, and Deleterious Variation in the Dog. Annual Review of Ecology, Evolution, and Systematics 2016. [DOI: 10.1146/annurev-ecolsys-121415-032155] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 11/09/2022]
Abstract
The dog is our oldest domesticate and has experienced a wide variety of demographic histories, including a bottleneck associated with domestication and individual bottlenecks associated with the formation of modern breeds. Admixture with gray wolves, and among dog breeds and populations, has also occurred throughout its history. Likewise, the intensity and focus of selection have varied, from an initial focus on traits enhancing cohabitation with humans, to more directed selection on specific phenotypic characteristics and behaviors. In this review, we summarize and synthesize genetic findings from genome-wide and complete genome studies that document the genomic consequences of demography and selection, including the effects on adaptive and deleterious variation. Consistent with the evolutionary history of the dog, signals of natural and artificial selection are evident in the dog genome. However, conclusions from studies of positive selection are fraught with the problem of false positives given that demographic history is often not taken into account.
Affiliation(s)
- Adam H. Freedman
- Informatics Group, Faculty of Arts and Sciences, Harvard University, Cambridge, Massachusetts 02138
| | - Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095
| | - Robert K. Wayne
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California 90095
|
29
|
Xiang-Yu J, Yang Z, Tang K, Li H. Revisiting the false positive rate in detecting recent positive selection. Quantitative Biology 2016. [DOI: 10.1007/s40484-016-0077-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]
|
30
|
Rougemont Q, Gagnaire PA, Perrier C, Genthon C, Besnard AL, Launey S, Evanno G. Inferring the demographic history underlying parallel genomic divergence among pairs of parasitic and nonparasitic lamprey ecotypes. Mol Ecol 2016; 26:142-162. [PMID: 27105132 DOI: 10.1111/mec.13664] [Citation(s) in RCA: 91] [Impact Index Per Article: 11.4] [Received: 01/15/2016] [Revised: 03/22/2016] [Accepted: 04/06/2016] [Indexed: 12/20/2022]
Abstract
Understanding the evolutionary mechanisms generating parallel genomic divergence patterns among replicate ecotype pairs remains an important challenge in speciation research. We investigated the genomic divergence between the anadromous parasitic river lamprey (Lampetra fluviatilis) and the freshwater-resident nonparasitic brook lamprey (Lampetra planeri) in nine population pairs displaying variable levels of geographic connectivity. We genotyped 338 individuals with RAD sequencing and inferred the demographic divergence history of each population pair using a diffusion approximation method. Divergence patterns in geographically connected population pairs were better explained by introgression after secondary contact, whereas disconnected population pairs have retained a signal of ancient migration. In all ecotype pairs, models accounting for differential introgression among loci outperformed homogeneous migration models. Generating neutral predictions from the inferred divergence scenarios to detect highly differentiated markers identified greater proportions of outliers in disconnected population pairs than in connected pairs. However, increased similarity in the most divergent genomic regions was found among connected ecotype pairs, indicating that gene flow was instrumental in generating parallelism at the molecular level. These results suggest that heterogeneous genomic differentiation and parallelism among replicate ecotype pairs have partly emerged through restricted introgression in genomic islands.
Affiliation(s)
- Quentin Rougemont
- INRA, UMR 985 Ecologie et Santé des Ecosystèmes, 35042, Rennes, France; Agrocampus Ouest, UMR ESE, 65 rue de Saint-Brieuc, 35042, Rennes, France
| | - Pierre-Alexandre Gagnaire
- Institut des Sciences de l'Evolution (UMR 5554), CNRS-UM2-IRD, Place Eugène Bataillon, F-34095, Montpellier, France; Station Méditerranéenne de l'Environnement Littoral, Université de Montpellier, 2 Rue des Chantiers, F-34200, Sète, France
| | - Charles Perrier
- CEFE-CNRS, Centre D'Ecologie Fonctionnelle et Evolutive, Route de Mende, 34090, Montpellier, France
| | - Clémence Genthon
- Plateforme génomique INRA GenoToul Chemin de Borderouge - Auzeville, 31320, Castanet-Tolosan, France
| | - Anne-Laure Besnard
- INRA, UMR 985 Ecologie et Santé des Ecosystèmes, 35042, Rennes, France; Agrocampus Ouest, UMR ESE, 65 rue de Saint-Brieuc, 35042, Rennes, France
| | - Sophie Launey
- INRA, UMR 985 Ecologie et Santé des Ecosystèmes, 35042, Rennes, France; Agrocampus Ouest, UMR ESE, 65 rue de Saint-Brieuc, 35042, Rennes, France
| | - Guillaume Evanno
- INRA, UMR 985 Ecologie et Santé des Ecosystèmes, 35042, Rennes, France; Agrocampus Ouest, UMR ESE, 65 rue de Saint-Brieuc, 35042, Rennes, France
|
31
|
Kimmel M, Wojdyła T. Genetic demographic networks: Mathematical model and applications. Theor Popul Biol 2016; 111:75-86. [PMID: 27378746 DOI: 10.1016/j.tpb.2016.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2015] [Revised: 06/21/2016] [Accepted: 06/24/2016] [Indexed: 10/21/2022]
Abstract
Recent improvement in the quality of genetic data obtained from extinct human populations and their ancestors encourages searching for answers to basic questions regarding human population history. The most common and successful are model-based approaches, in which genetic data are compared to the data obtained from the assumed demography model. Using such an approach, it is possible to either validate or adjust assumed demography. Model fit to data can be obtained based on reverse-time coalescent simulations or forward-time simulations. In this paper we introduce a computational method based on a mathematical equation that allows obtaining joint distributions of pairs of individuals under a specified demography model, each of them characterized by a genetic variant at a chosen locus. The two individuals are randomly sampled from either the same or two different populations. The model assumes three types of demographic events (split, merge and migration). Populations evolve according to the time-continuous Moran model with drift and Markov-process mutation. This latter process is described by the Lyapunov-type equation introduced by O'Brien and generalized in our previous works. Application of this equation constitutes an original contribution. In the results section of the paper we present sample applications of our model to both simulated and literature-based demographies. Among others, we include a study of the Slavs-Balts-Finns genetic relationship, in which we model split and migrations between the Balts and Slavs. We also include another example that involves the migration rates between farmers and hunter-gatherers, based on modern and ancient DNA samples. This latter process was previously studied using coalescent simulations. Our results are in general agreement with the previous method, which provides validation of our approach.
Although our model is not an alternative to simulation methods in the practical sense, it provides an algorithm to compute pairwise distributions of alleles, in the case of haploid non-recombining loci such as mitochondrial and Y-chromosome loci in humans.
Affiliation(s)
- Marek Kimmel
- Department of Statistics, Rice University, 6100 Main Street, Houston, TX 77005, USA; Systems Engineering Group, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.
| | - Tomasz Wojdyła
- Institute of Automatic Control, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.
|
32
|
Wang J, Street NR, Scofield DG, Ingvarsson PK. Variation in Linked Selection and Recombination Drive Genomic Divergence during Allopatric Speciation of European and American Aspens. Mol Biol Evol 2016; 33:1754-1767. [PMID: 26983554 DOI: 10.1101/029561] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/18/2023] Open
Abstract
Despite the global economic and ecological importance of forest trees, the genomic basis of differential adaptation and speciation in tree species is still poorly understood. Populus tremula and Populus tremuloides are two of the most widespread tree species in the Northern Hemisphere. Using whole-genome re-sequencing data of 24 P. tremula and 22 P. tremuloides individuals, we find that the two species diverged ∼2.2-3.1 million years ago, coinciding with the severing of the Bering land bridge and the onset of dramatic climatic oscillations during the Pleistocene. Both species have experienced substantial population expansions following long-term declines after species divergence. We detect widespread and heterogeneous genomic differentiation between species, and in accordance with the expectation of allopatric speciation, coalescent simulations suggest that neutral evolutionary processes can account for most of the observed patterns of genetic differentiation. However, there is an excess of regions exhibiting extreme differentiation relative to those expected under demographic simulations, which is indicative of the action of natural selection. Overall genetic differentiation is negatively associated with recombination rate in both species, providing strong support for a role of linked selection in generating the heterogeneous genomic landscape of differentiation between species. Finally, we identify a number of candidate regions and genes that may have been subject to positive and/or balancing selection during the speciation process.
Affiliation(s)
- Jing Wang
- Department of Ecology and Environmental Science, Umeå University, Umeå, SE, Sweden
| | - Nathaniel R Street
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, SE, Sweden
| | - Douglas G Scofield
- Department of Ecology and Environmental Science, Umeå University, Umeå, SE, Sweden; Department of Ecology and Genetics: Evolutionary Biology, Uppsala University, Uppsala, Sweden; Uppsala Multidisciplinary Center for Advanced Computational Science, Uppsala University, Uppsala, Sweden
| | - Pär K Ingvarsson
- Department of Ecology and Environmental Science, Umeå University, Umeå, SE, Sweden
|
33
|
Abstract
Serpentine barrens represent extreme hazards for plant colonists. These sites are characterized by high porosity leading to drought, lack of essential mineral nutrients, and phytotoxic levels of metals. Nevertheless, nature forged populations adapted to these challenges. Here, we use a population-based evolutionary genomic approach coupled with elemental profiling to assess how autotetraploid Arabidopsis arenosa adapted to a multichallenge serpentine habitat in the Austrian Alps. We first demonstrate that serpentine-adapted plants exhibit dramatically altered elemental accumulation levels in common conditions, and then resequence 24 autotetraploid individuals from three populations to perform a genome scan. We find evidence for highly localized selective sweeps that point to a polygenic, multitrait basis for serpentine adaptation. Comparing our results to a previous study of independent serpentine colonizations in the closely related diploid Arabidopsis lyrata in the United Kingdom and United States, we find the highest levels of differentiation in 11 of the same loci, providing candidate alleles for mediating convergent evolution. This overlap between independent colonizations in different species suggests that a limited number of evolutionary strategies are suited to overcome the multiple challenges of serpentine adaptation. Interestingly, we detect footprints of selection in A. arenosa in the context of substantial gene flow from nearby off-serpentine populations of A. arenosa, as well as from A. lyrata. In several cases, quantitative tests of introgression indicate that some alleles exhibiting strong selective sweep signatures appear to have been introgressed from A. lyrata. This finding suggests that migrant alleles may have facilitated adaptation of A. arenosa to this multihazard environment.
|
34
|
A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets. Genome Res 2016; 26:834-43. [PMID: 27197222 PMCID: PMC4889975 DOI: 10.1101/gr.203059.115] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 12/07/2015] [Accepted: 04/14/2016] [Indexed: 01/07/2023]
Abstract
A continuing challenge in the analysis of massively large sequencing data sets is quantifying and interpreting non-neutrally evolving mutations. Here, we describe a flexible and robust approach based on the site frequency spectrum to estimate the fraction of deleterious and adaptive variants from large-scale sequencing data sets. We applied our method to approximately 1 million single nucleotide variants (SNVs) identified in high-coverage exome sequences of 6515 individuals. We estimate that the fraction of deleterious nonsynonymous SNVs is higher than previously reported; quantify the effects of genomic context, codon bias, chromatin accessibility, and number of protein-protein interactions on deleterious protein-coding SNVs; and identify pathways and networks that have likely been influenced by positive selection. Furthermore, we show that the fraction of deleterious nonsynonymous SNVs is significantly higher for Mendelian versus complex disease loci and in exons harboring dominant versus recessive Mendelian mutations. In summary, as genome-scale sequencing data accumulate in progressively larger sample sizes, our method will enable increasingly high-resolution inferences into the characteristics and determinants of non-neutral variation.
|
35
|
Wang J, Street NR, Scofield DG, Ingvarsson PK. Variation in Linked Selection and Recombination Drive Genomic Divergence during Allopatric Speciation of European and American Aspens. Mol Biol Evol 2016; 33:1754-67. [PMID: 26983554 PMCID: PMC4915356 DOI: 10.1093/molbev/msw051] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Indexed: 12/30/2022] Open
Abstract
Despite the global economic and ecological importance of forest trees, the genomic basis of differential adaptation and speciation in tree species is still poorly understood. Populus tremula and Populus tremuloides are two of the most widespread tree species in the Northern Hemisphere. Using whole-genome re-sequencing data of 24 P. tremula and 22 P. tremuloides individuals, we find that the two species diverged ∼2.2–3.1 million years ago, coinciding with the severing of the Bering land bridge and the onset of dramatic climatic oscillations during the Pleistocene. Both species have experienced substantial population expansions following long-term declines after species divergence. We detect widespread and heterogeneous genomic differentiation between species, and in accordance with the expectation of allopatric speciation, coalescent simulations suggest that neutral evolutionary processes can account for most of the observed patterns of genetic differentiation. However, there is an excess of regions exhibiting extreme differentiation relative to those expected under demographic simulations, which is indicative of the action of natural selection. Overall genetic differentiation is negatively associated with recombination rate in both species, providing strong support for a role of linked selection in generating the heterogeneous genomic landscape of differentiation between species. Finally, we identify a number of candidate regions and genes that may have been subject to positive and/or balancing selection during the speciation process.
Affiliation(s)
- Jing Wang
- Department of Ecology and Environmental Science, Umeå University, Umeå, SE, Sweden
- Nathaniel R Street
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, SE, Sweden
- Douglas G Scofield
- Department of Ecology and Environmental Science, Umeå University, Umeå, SE, Sweden; Department of Ecology and Genetics: Evolutionary Biology, Uppsala University, Uppsala, Sweden; Uppsala Multidisciplinary Center for Advanced Computational Science, Uppsala University, Uppsala, Sweden
- Pär K Ingvarsson
- Department of Ecology and Environmental Science, Umeå University, Umeå, SE, Sweden
|
36
|
Abstract
The current epidemic of artemisinin resistant Plasmodium falciparum in Southeast Asia is the result of a soft selective sweep involving at least 20 independent kelch13 mutations. In a large global survey, we find that kelch13 mutations which cause resistance in Southeast Asia are present at low frequency in Africa. We show that African kelch13 mutations have originated locally, and that kelch13 shows a normal variation pattern relative to other genes in Africa, whereas in Southeast Asia there is a great excess of non-synonymous mutations, many of which cause radical amino-acid changes. Thus, kelch13 is not currently undergoing strong selection in Africa, despite a deep reservoir of variations that could potentially allow resistance to emerge rapidly. The practical implications are that public health surveillance for artemisinin resistance should not rely on kelch13 data alone, and interventions to prevent resistance must account for local evolutionary conditions, shown by genomic epidemiology to differ greatly between geographical regions. DOI:http://dx.doi.org/10.7554/eLife.08714.001

Malaria is an infectious disease caused by a microscopic parasite called Plasmodium, which is transferred between humans by mosquitos. One species of malaria parasite called Plasmodium falciparum can cause particularly severe and life-threatening forms of the disease. Currently, the most widely used treatment for P. falciparum infections is artemisinin combination therapy, a treatment that combines the drug artemisinin (or a closely related molecule) with another antimalarial drug. However, resistance to artemisinin has started to spread throughout Southeast Asia. Artemisinin resistance is caused by mutations in a parasite gene called kelch13, and researchers have identified over 20 different mutations in P. falciparum that confer artemisinin resistance.
The diversity of mutations involved, and the fact that the same mutation can arise independently in different locations, make it difficult to track the spread of resistance using conventional molecular marker approaches. Here, Amato, Miotto et al. sequenced the entire genomes of more than 3,000 clinical samples of P. falciparum from Southeast Asia and Africa, collected as part of a global network of research groups called the MalariaGEN Plasmodium falciparum Community Project. Amato, Miotto et al. found that African parasites had independently acquired many of the same kelch13 mutations that are known to cause resistance to artemisinin in Southeast Asia. However the kelch13 mutations seen in Africa remained at low levels in the parasite population, and appeared to be under much less pressure for evolutionary selection than those found in Southeast Asia. These findings demonstrate that the emergence and spread of resistance to antimalarial drugs does not depend solely on the mutational process, but also on other factors that influence whether the mutations will spread in the population. Understanding how this is affected by different patterns of drug treatments and other environmental conditions will be important in developing more effective strategies for combating malaria. DOI:http://dx.doi.org/10.7554/eLife.08714.002
|
37
|
Hsieh P, Veeramah KR, Lachance J, Tishkoff SA, Wall JD, Hammer MF, Gutenkunst RN. Whole-genome sequence analyses of Western Central African Pygmy hunter-gatherers reveal a complex demographic history and identify candidate genes under positive natural selection. Genome Res 2016; 26:279-90. [PMID: 26888263 PMCID: PMC4772011 DOI: 10.1101/gr.192971.115] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.9] [Received: 04/10/2015] [Accepted: 01/07/2016] [Indexed: 12/20/2022]
Abstract
African Pygmies practicing a mobile hunter-gatherer lifestyle are phenotypically and genetically diverged from other anatomically modern humans, and they likely experienced strong selective pressures due to their unique lifestyle in the Central African rainforest. To identify genomic targets of adaptation, we sequenced the genomes of four Biaka Pygmies from the Central African Republic and jointly analyzed these data with the genome sequences of three Baka Pygmies from Cameroon and nine Yoruba farmers. To account for the complex demographic history of these populations that includes both isolation and gene flow, we fit models using the joint allele frequency spectrum and validated them using independent approaches. Our two best-fit models both suggest ancient divergence between the ancestors of the farmers and Pygmies, 90,000 or 150,000 yr ago. We also find that bidirectional asymmetric gene flow is statistically better supported than a single pulse of unidirectional gene flow from farmers to Pygmies, as previously suggested. We then applied complementary statistics to scan the genome for evidence of selective sweeps and polygenic selection. We found that conventional statistical outlier approaches were biased toward identifying candidates in regions of high mutation or low recombination rate. To avoid this bias, we assigned P-values for candidates using whole-genome simulations incorporating demography and variation in both recombination and mutation rates. We found that genes and gene sets involved in muscle development, bone synthesis, immunity, reproduction, cell signaling and development, and energy metabolism are likely to be targets of positive natural selection in Western African Pygmies or their recent ancestors.
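The abstract above says P-values were assigned to candidates from whole-genome simulations, but it does not say how they were computed. One common convention is sketched below, purely as an assumption: compare the observed statistic with the same statistic from neutral simulations and apply an add-one correction so the estimate is never exactly zero. The function name `empirical_p_value` and the numbers are hypothetical.

```python
def empirical_p_value(observed, simulated):
    """Fraction of simulated null statistics at least as extreme as the observed one.

    Uses the add-one correction (k + 1) / (N + 1); this is an assumed convention,
    not necessarily the estimator used in the study.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Hypothetical statistics from four neutral simulations of one genomic window.
null_stats = [1.0, 2.0, 3.0, 4.0]
p_extreme = empirical_p_value(5.0, null_stats)   # no simulation reaches 5.0
p_typical = empirical_p_value(2.5, null_stats)   # two simulations exceed 2.5
```

Because the null statistics come from simulations that already include demography and local mutation and recombination rates, a window is flagged only when it is extreme relative to what those factors alone would produce.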
Affiliation(s)
- PingHsun Hsieh
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721, USA
- Krishna R Veeramah
- Arizona Research Laboratories Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA; Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794, USA
- Joseph Lachance
- Department of Biology and Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Department of Biology, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Sarah A Tishkoff
- Department of Biology and Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Jeffrey D Wall
- Institute for Human Genetics, University of California, San Francisco, California 94143, USA
- Michael F Hammer
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721, USA; Arizona Research Laboratories Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA
- Ryan N Gutenkunst
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721, USA; Department of Molecular and Cellular Biology, University of Arizona, Tucson, Arizona 85721, USA
|
38
|
de Filippo C, Key FM, Ghirotto S, Benazzo A, Meneu JR, Weihmann A, Parra G, Green ED, Andrés AM. Recent Selection Changes in Human Genes under Long-Term Balancing Selection. Mol Biol Evol 2016; 33:1435-47. [PMID: 26831942 DOI: 10.1093/molbev/msw023] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 01/22/2023] Open
Abstract
Balancing selection is an important evolutionary force that maintains genetic and phenotypic diversity in populations. Most studies in humans have focused on long-standing balancing selection, which persists over long periods of time and is generally shared across populations. But balanced polymorphisms can also promote fast adaptation, especially when the environment changes. To better understand the role of previously balanced alleles in novel adaptations, we analyzed in detail four loci as case examples of this mechanism. These loci show hallmark signatures of long-term balancing selection in African populations, but not in Eurasian populations. The disparity between populations is due to changes in allele frequencies, with intermediate frequency alleles in Africans (likely due to balancing selection) segregating instead at low- or high-derived allele frequency in Eurasia. We explicitly tested the support for different evolutionary models with an approximate Bayesian computation approach and show that the patterns in PKDREJ, SDR39U1, and ZNF473 are best explained by recent changes in selective pressure in certain populations. Specifically, we infer that alleles previously under long-term balancing selection, or alleles linked to them, were recently targeted by positive selection in Eurasian populations. Balancing selection thus likely served as a source of functional alleles that mediated subsequent adaptations to novel environments.
Affiliation(s)
- Cesare de Filippo
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Felix M Key
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Silvia Ghirotto
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
- Andrea Benazzo
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
- Juan R Meneu
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Antje Weihmann
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Genís Parra
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Eric D Green
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
- Aida M Andrés
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
|
39
|
He Y, Wang M, Huang X, Li R, Xu H, Xu S, Jin L. A probabilistic method for testing and estimating selection differences between populations. Genome Res 2015; 25:1903-9. [PMID: 26463656 PMCID: PMC4665011 DOI: 10.1101/gr.192336.115] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/22/2015] [Accepted: 10/13/2015] [Indexed: 01/18/2023]
Abstract
Human populations around the world encounter various environmental challenges and, consequently, develop genetic adaptations to different selection forces. Identifying the differences in natural selection between populations is critical for understanding the roles of specific genetic variants in evolutionary adaptation. Although numerous methods have been developed to detect genetic loci under recent directional selection, a probabilistic solution for testing and quantifying selection differences between populations is lacking. Here we report the development of a probabilistic method for testing and estimating selection differences between populations. By use of a probabilistic model of genetic drift and selection, we showed that logarithm odds ratios of allele frequencies provide estimates of the differences in selection coefficients between populations. The estimates approximate a normal distribution, and variance can be estimated using genome-wide variants. This allows us to quantify differences in selection coefficients and to determine the confidence intervals of the estimate. Our work also revealed the link between genetic association testing and hypothesis testing of selection differences. It therefore supplies a solution for hypothesis testing of selection differences. This method was applied to a genome-wide data analysis of Han and Tibetan populations. The results confirmed that both the EPAS1 and EGLN1 genes are under statistically different selection in Han and Tibetan populations. We further estimated differences in the selection coefficients for genetic variants involved in melanin formation and determined their confidence intervals between continental population groups. Application of the method to empirical data demonstrated the outstanding capability of this novel approach for testing and quantifying differences in natural selection.
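The central quantity in this abstract, the log odds ratio of an allele's frequency in two populations, can be illustrated with a minimal sketch. Assumptions are labelled loudly: the function names and frequencies are invented for illustration, the scaling that converts a log odds ratio into a selection-coefficient difference is not given in the abstract and is omitted, and the genome-wide standardisation shown is only one plausible reading of "variance can be estimated using genome-wide variants".

```python
import math
import statistics


def log_odds_ratio(p1: float, p2: float) -> float:
    """Log odds ratio of an allele's frequency in population 1 versus population 2.

    Frequencies must lie strictly between 0 and 1; fixed alleles are undefined here.
    """
    return math.log(p1 / (1.0 - p1)) - math.log(p2 / (1.0 - p2))


def z_scores(freq_pairs):
    """Standardise each locus's log odds ratio against the genome-wide distribution."""
    lors = [log_odds_ratio(p1, p2) for p1, p2 in freq_pairs]
    mean = statistics.fmean(lors)
    sd = statistics.stdev(lors)  # sample standard deviation across loci
    return [(x - mean) / sd for x in lors]


# Made-up frequencies for illustration only; the last locus is strongly differentiated.
example = [(0.50, 0.50), (0.40, 0.45), (0.60, 0.55), (0.30, 0.35), (0.90, 0.10)]
scores = z_scores(example)
```

Under the normal approximation mentioned in the abstract, a standardised score of this kind could be compared with a standard normal distribution to obtain a confidence interval or a test of selection difference.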
Affiliation(s)
- Yungang He
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Minxian Wang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Xin Huang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Ran Li
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Hongyang Xu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Shuhua Xu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Li Jin
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Society Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; State Key Laboratory of Genetic Engineering and Ministry of Education Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai 200433, China
|
40
|
Fijarczyk A, Babik W. Detecting balancing selection in genomes: limits and prospects. Mol Ecol 2015; 24:3529-45. [DOI: 10.1111/mec.13226] [Citation(s) in RCA: 144] [Impact Index Per Article: 16.0] [Received: 02/17/2015] [Revised: 04/27/2015] [Accepted: 04/30/2015] [Indexed: 12/17/2022]
Affiliation(s)
- Anna Fijarczyk
- Institute of Environmental Sciences; Jagiellonian University; Gronostajowa 7 30-387 Kraków Poland
- Wiesław Babik
- Institute of Environmental Sciences; Jagiellonian University; Gronostajowa 7 30-387 Kraków Poland
|
41
|
Anisimova M. Darwin and Fisher meet at biotech: on the potential of computational molecular evolution in industry. BMC Evol Biol 2015; 15:76. [PMID: 25928234 PMCID: PMC4422139 DOI: 10.1186/s12862-015-0352-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/02/2014] [Accepted: 04/15/2015] [Indexed: 12/22/2022] Open
Abstract
Background: Today computational molecular evolution is a vibrant research field that benefits from the availability of large and complex new generation sequencing data – ranging from full genomes and proteomes to microbiomes, metabolomes and epigenomes. The grounds for this progress were established long before the discovery of the DNA structure. Specifically, Darwin’s theory of evolution by means of natural selection not only remains relevant today, but also provides a solid basis for computational research with a variety of applications. But long-term progress in biology was ensured by the mathematical sciences, as exemplified by Sir R. Fisher in the early 20th century. Now this is true more than ever: the size and complexity of the data require biologists to work in close collaboration with experts in computational sciences, modeling and statistics. Results: Natural selection drives function conservation and adaptation to emerging pathogens or new environments; selection plays a key role in immune and resistance systems. Here I focus on computational methods for evaluating selection in molecular sequences, and argue that they have a high potential for applications. Pharma and biotech industries can successfully use this potential, and should take the initiative to enhance their research and development with state of the art bioinformatics approaches. Conclusions: This review provides a quick guide to the current computational approaches that apply the evolutionary principles of natural selection to real life problems – from drug target validation, vaccine design and protein engineering to applications in agriculture, ecology and conservation.
Affiliation(s)
- Maria Anisimova
- Institute of Applied Simulations, School of Life Sciences and Facility Management, Zürich University of Applied Sciences, Einsiedlerstrasse 31a, Wädenswil, 8820, Switzerland; Department of Computer Science, ETH, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland.
|
42
|
Differential Natural Selection of Human Zinc Transporter Genes between African and Non-African Populations. Sci Rep 2015; 5:9658. [PMID: 25927708 PMCID: PMC5386188 DOI: 10.1038/srep09658] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 11/19/2014] [Accepted: 03/13/2015] [Indexed: 12/22/2022] Open
Abstract
Zinc transporters play important roles in all eukaryotes by maintaining the rational zinc concentration in cells. However, the diversity of zinc transporter genes (ZTGs) remains poorly studied. Here, we investigated the genetic diversity of 24 human ZTGs based on the 1000 Genomes data. Some ZTGs show small population differences, such as SLC30A6 with a weighted-average FST (WA-FST) of 0.015, while other ZTGs exhibit considerably large population differences, such as SLC30A9 (WA-FST = 0.284). Overall, ZTGs harbor many more highly population-differentiated variants compared with random genes. Intriguingly, we found that SLC30A9 was under natural selection in both East Asians (EAS) and Africans (AFR) but in different directions. Notably, a non-synonymous variant (rs1047626) in SLC30A9 is almost fixed, with 96.4% A in EAS and 92% G in AFR. Consequently, there are two different functional haplotypes exhibiting dominant abundance in AFR and EAS, respectively. Furthermore, a strong correlation was observed between the haplotype frequencies of SLC30A9 and distributions of zinc contents in soils or crops. We speculate that the genetic differentiation of ZTGs could directly contribute to population heterogeneity in zinc transporting capabilities and local adaptations of human populations in regard to the local zinc state or diets, which have both evolutionary and medical implications.
|
43
|
Shafer ABA, Gattepaille LM, Stewart REA, Wolf JBW. Demographic inferences using short-read genomic data in an approximate Bayesian computation framework: in silico evaluation of power, biases and proof of concept in Atlantic walrus. Mol Ecol 2015; 24:328-45. [DOI: 10.1111/mec.13034] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 04/29/2014] [Revised: 11/29/2014] [Accepted: 12/03/2014] [Indexed: 01/01/2023]
Affiliation(s)
- Aaron B. A. Shafer
- Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala SE-75236 Sweden
- Lucie M. Gattepaille
- Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala SE-75236 Sweden
- Robert E. A. Stewart
- Fisheries and Oceans Canada; Freshwater Institute; 501 University Crescent Winnipeg Manitoba R3T 2N6 Canada
- Jochen B. W. Wolf
- Department of Evolutionary Biology; Evolutionary Biology Centre; Uppsala SE-75236 Sweden
|
44
|
Micro- and macro-geographic scale effect on the molecular imprint of selection and adaptation in Norway spruce. PLoS One 2014; 9:e115499. [PMID: 25551624 PMCID: PMC4281139 DOI: 10.1371/journal.pone.0115499] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 08/20/2014] [Accepted: 11/19/2014] [Indexed: 11/19/2022] Open
Abstract
Forest tree species of temperate and boreal regions have undergone a long history of demographic changes and evolutionary adaptations. The main objective of this study was to detect signals of selection in Norway spruce (Picea abies [L.] Karst) at different sampling scales and to investigate, accounting for population structure, the effect of environment on species genetic diversity. A total of 384 single nucleotide polymorphisms (SNPs) representing 290 genes were genotyped at two geographic scales: across 12 populations distributed along two altitudinal transects in the Alps (micro-geographic scale), and across 27 populations belonging to the range of Norway spruce in central and south-east Europe (macro-geographic scale). At the macro-geographic scale, principal component analysis combined with Bayesian clustering revealed three major clusters, corresponding to the main areas of southern spruce occurrence, i.e. the Alps, Carpathians, and Hercynia. The populations along the altitudinal transects were not differentiated. To assess the role of selection in structuring genetic variation, we applied a Bayesian and coalescent-based FST-outlier method and tested for correlations between allele frequencies and climatic variables using regression analyses. At the macro-geographic scale, the FST-outlier methods together detected 11 FST-outliers. Six outliers were detected when the same analyses were carried out taking into account the genetic structure. Regression analyses with population structure correction resulted in the identification of two (micro-geographic scale) and 38 SNPs (macro-geographic scale) significantly correlated with temperature and/or precipitation. Six of these loci overlapped with FST-outliers, among them two loci encoding an enzyme involved in riboflavin biosynthesis and a sucrose synthase. The results of this study indicate a strong relationship between genetic and environmental variation at both geographic scales. They also suggest that an integrative approach combining different outlier detection methods and population sampling at different geographic scales is useful to identify loci potentially involved in adaptation.
|
45
|
Robinson JD, Coffman AJ, Hickerson MJ, Gutenkunst RN. Sampling strategies for frequency spectrum-based population genomic inference. BMC Evol Biol 2014; 14:254. [PMID: 25471595 PMCID: PMC4269862 DOI: 10.1186/s12862-014-0254-4] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 07/24/2014] [Accepted: 11/24/2014] [Indexed: 01/25/2023] Open
Abstract
Background: The allele frequency spectrum (AFS) consists of counts of the number of single nucleotide polymorphism (SNP) loci with derived variants present at each given frequency in a sample. Multiple approaches have recently been developed for parameter estimation and calculation of model likelihoods based on the joint AFS from two or more populations. We conducted a simulation study of one of these approaches, implemented in the Python module δaδi, to compare parameter estimation and model selection accuracy given different sample sizes under one- and two-population models. Results: Our simulations included a variety of demographic models and two parameterizations that differed in the timing of events (divergence or size change). Using a number of SNPs reasonably obtained through next-generation sequencing approaches (10,000 - 50,000), accurate parameter estimates and model selection were possible for models with more ancient demographic events, even given relatively small numbers of sampled individuals. However, for recent events, larger numbers of individuals were required to achieve accuracy and precision in parameter estimates similar to that seen for models with older divergence or population size changes. We quantify i) the uncertainty in model selection, using tools from information theory, and ii) the accuracy and precision of parameter estimates, using the root mean squared error, as a function of the timing of demographic events, sample sizes used in the analysis, and complexity of the simulated models. Conclusions: Here, we illustrate the utility of the genome-wide AFS for estimating demographic history and provide recommendations to guide sampling in population genomics studies that seek to draw inference from the AFS. Our results indicate that larger samples of individuals (and thus larger AFS) provide greater power for model selection and parameter estimation for more recent demographic events.
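The definition of the allele frequency spectrum given in this abstract (counts of SNP loci with derived variants at each frequency in the sample) can be sketched in a few lines. This is a plain-Python illustration that does not use δaδi; the function name `unfolded_afs` and the example counts are hypothetical.

```python
from collections import Counter


def unfolded_afs(derived_counts, n_chromosomes):
    """Return a list whose entry i is the number of SNPs carrying i derived copies.

    derived_counts: one integer per SNP, the derived-allele count in the sample.
    n_chromosomes: number of sampled chromosomes (twice the individuals for diploids).
    """
    tally = Counter(derived_counts)
    if any(c < 0 or c > n_chromosomes for c in tally):
        raise ValueError("derived count outside the range 0..n_chromosomes")
    return [tally.get(i, 0) for i in range(n_chromosomes + 1)]


# Five hypothetical SNPs typed in four chromosomes.
counts = [1, 1, 2, 3, 1]
afs = unfolded_afs(counts, 4)
# → [0, 3, 1, 1, 0]
```

The spectrum has one more entry than the number of sampled chromosomes, which is why a larger sample yields a larger AFS, the link the abstract draws between sample size and power for recent demographic events.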
Affiliation(s)
- John D Robinson
- Department of Biology, City College of New York, New York, NY, 10031, USA; Current Address: South Carolina Department of Natural Resources, Marine Resources Research Institute, Charleston, SC, 29412, USA.
- Alec J Coffman
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA.
- Michael J Hickerson
- Department of Biology, City College of New York, New York, NY, 10031, USA; Subprogram in Ecology, Evolution and Behavior, the Graduate Center of the City University of New York, New York, NY, 10016, USA; Division of Invertebrate Zoology, American Museum of Natural History, New York, NY, 10024, USA.
- Ryan N Gutenkunst
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, 85721, USA.
|
46
|
DeGiorgio M, Lohmueller KE, Nielsen R. A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genet 2014; 10:e1004561. [PMID: 25144706 PMCID: PMC4140648 DOI: 10.1371/journal.pgen.1004561] [Citation(s) in RCA: 112] [Impact Index Per Article: 11.2] [Received: 05/31/2013] [Accepted: 06/26/2014] [Indexed: 01/19/2023] Open
Abstract
While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates.

In the past, balancing selection was a topic of great theoretical interest that received much attention. However, there has been little focus toward developing methods to identify regions of the genome that are under balancing selection. In this article, we present the first set of likelihood-based methods that explicitly model the spatial distribution of polymorphism expected near a site under long-term balancing selection. Simulation results show that our methods outperform commonly-used summary statistics for identifying regions under balancing selection. Finally, we performed a scan for balancing selection in Africans and Europeans using our new methods and identified a gene called FANK1 as our top candidate outside the HLA region.
We hypothesize that the maintenance of polymorphism at FANK1 is the result of segregation distortion.
Affiliation(s)
- Michael DeGiorgio
- Department of Biology, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Kirk E. Lohmueller
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, California, United States of America
- Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Department of Statistics, University of California, Berkeley, Berkeley, California, United States of America
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
47
|
Sanchez-Mazas A, Meyer D. The relevance of HLA sequencing in population genetics studies. J Immunol Res 2014; 2014:971818. [PMID: 25126587 PMCID: PMC4122113 DOI: 10.1155/2014/971818] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 04/14/2014] [Accepted: 06/20/2014] [Indexed: 11/18/2022] Open
Abstract
Next generation sequencing (NGS) is currently being adapted by different biotechnological platforms to the standard typing method for HLA polymorphism, the huge diversity of which makes this initiative particularly challenging. Boosting the molecular characterization of the HLA genes through efficient, rapid, and low-cost technologies is expected to amplify the success of tissue transplantation by enabling us to find donor-recipient matching for rare phenotypes. But the application of NGS technologies to the molecular mapping of the MHC region also anticipates essential changes in population genetic studies. Huge amounts of HLA sequence data will be available in the next years for different populations, with the potential to change our understanding of HLA variation in humans. In this review, we first explain how HLA sequencing allows a better assessment of the HLA diversity in human populations, taking also into account the methodological difficulties it introduces at the statistical level; secondly, we show how analyzing HLA sequence variation may improve our comprehension of population genetic relationships by facilitating the identification of demographic events that marked human evolution; finally, we discuss the interest of both HLA and genome-wide sequencing and genotyping in detecting functionally significant SNPs in the MHC region, the latter having also contributed to the makeup of the HLA molecular diversity observed today.
Affiliation(s)
- Alicia Sanchez-Mazas
- Department of Genetics and Evolution—Anthropology Unit, University of Geneva and Institute of Genetics and Genomics of Geneva (IGE3), 12 Rue Gustave-Revilliod, 1211 Geneva 4, Switzerland
- Diogo Meyer
- Department of Genetics and Evolutionary Biology, University of São Paulo, Rua do Matão 277, São Paulo, SP 05508-090, Brazil
|
48
|
Schrago CG. Estimation of the ancestral effective population sizes of African great apes under different selection regimes. Genetica 2014; 142:273-80. [PMID: 24925265 DOI: 10.1007/s10709-014-9773-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 02/28/2014] [Accepted: 06/04/2014] [Indexed: 11/29/2022]
Abstract
Reliable estimates of ancestral effective population sizes are necessary to unveil the population-level phenomena that shaped the phylogeny and molecular evolution of the African great apes. Although several methods have previously been applied to infer ancestral effective population sizes, an analysis of the influence of the selective regime on the estimates of ancestral demography has not been thoroughly conducted. In this study, three independent data sets under different selective regimes were composed to tackle this issue. The results showed that selection had a significant impact on the estimates of ancestral effective population sizes of the African great apes. The inference of the ancestral demography of African great apes was affected by the selection regime. The effects, however, were not homogeneous along the ancestral populations of great apes. The effective population size of the ancestor of humans and chimpanzees was more impacted by the selection regime when compared to the same parameter in the ancestor of humans, chimpanzees and gorillas. Because the selection regime influenced the estimates of ancestral effective population size, it is reasonable to assume that a portion of the discrepancy found in previous studies that inferred the ancestral effective population size may be attributable to the differential action of selection on the genes sampled.
Affiliation(s)
- Carlos G Schrago
- Departamento de Genética, CCS, A2-092, Instituto de Biologia, Universidade Federal do Rio de Janeiro, Rua Prof. Rodolpho Paulo Rocco, SN, Cidade Universitária, Rio de Janeiro, RJ, CEP: 21941-617, Brazil
|
49
|
Abstract
The past fifty years have seen the development and application of numerous statistical methods to identify genomic regions that appear to be shaped by natural selection. These methods have been used to investigate the macro- and microevolution of a broad range of organisms, including humans. Here, we provide a comprehensive outline of these methods, explaining their conceptual motivations and statistical interpretations. We highlight areas of recent and future development in evolutionary genomics methods and discuss ongoing challenges for researchers employing such tests. In particular, we emphasize the importance of functional follow-up studies to characterize putative selected alleles and the use of selection scans as hypothesis-generating tools for investigating evolutionary histories.
Affiliation(s)
- Joseph J Vitti
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138
|
50
|
Brousseau L, Tinaut A, Duret C, Lang T, Garnier-Gere P, Scotti I. High-throughput transcriptome sequencing and preliminary functional analysis in four Neotropical tree species. BMC Genomics 2014; 15:238. [PMID: 24673733 PMCID: PMC3986928 DOI: 10.1186/1471-2164-15-238] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/12/2014] [Accepted: 03/13/2014] [Indexed: 12/30/2022] Open
Abstract
Background: The Amazonian rainforest is predicted to suffer from ongoing environmental changes. Despite the need to evaluate the impact of such changes on tree genetic diversity, we almost entirely lack genomic resources.
Results: In this study, we analysed the transcriptome of four tropical tree species (Carapa guianensis, Eperua falcata, Symphonia globulifera and Virola michelii) with contrasting ecological features, belonging to four widespread botanical families (respectively Meliaceae, Fabaceae, Clusiaceae and Myristicaceae). We sequenced cDNA libraries from three organs (leaves, stems, and roots) using 454 pyrosequencing. We have developed an R and bioperl-based bioinformatic procedure for de novo assembly, gene functional annotation and marker discovery. Mismatch identification takes into account single-base quality values as well as the likelihood of false variants as a function of contig depth and number of sequenced chromosomes. Between 17103 (for Symphonia globulifera) and 23390 (for Eperua falcata) contigs were assembled. Organs varied in the numbers of unigenes they apparently express, with higher numbers in roots. Patterns of gene expression were similar across species, with metabolism of aromatic compounds standing out as an overrepresented gene function. Transcripts corresponding to several gene functions were found to be over- or underrepresented in each organ. We identified between 4434 (for Symphonia globulifera) and 9076 (for Virola surinamensis) well-supported mismatches. The resulting overall mismatch density ranged from 0.89 (S. globulifera) to 1.05 (V. surinamensis) mismatches/100 bp in variation-containing contigs.
Conclusion: The relative representation of gene functions in the four transcriptomes suggests that secondary metabolism may be particularly important in tropical trees. The differential representation of transcripts among tissues suggests differential gene expression, which opens the way to functional studies in these non-model, ecologically important species. We found substantial numbers of mismatches in the four species. These newly identified putative variants are a first step towards acquiring much needed genomic resources for tropical tree species.
Affiliation(s)
- Ivan Scotti
- INRA, UMR 0745 EcoFoG, Campus agronomique BP 709, F-97387 Cedex, France.
|