1. Schraiber JG, Edge MD. Heritability within groups is uninformative about differences among groups: Cases from behavioral, evolutionary, and statistical genetics. Proc Natl Acad Sci U S A 2024; 121:e2319496121. [PMID: 38470926 PMCID: PMC10962975 DOI: 10.1073/pnas.2319496121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 02/13/2024] [Indexed: 03/14/2024]
Abstract
Without the ability to control or randomize environments (or genotypes), it is difficult to determine the degree to which observed phenotypic differences between two groups of individuals are due to genetic vs. environmental differences. However, some have suggested that these concerns may be limited to pathological cases, and methods have appeared that seem to give, directly or indirectly, some support to claims that aggregate heritable variation within groups can be related to heritable variation among groups. We consider three families of approaches: the "between-group heritability" sometimes invoked in behavior genetics, the statistic P_ST used in empirical work in evolutionary quantitative genetics, and methods based on variation in ancestry in an admixed population, used in anthropological and statistical genetics. We take up these examples to show mathematically that information on within-group genetic and phenotypic information in the aggregate cannot separate among-group differences into genetic and environmental components, and we provide simulation results that support our claims. We discuss these results in terms of the long-running debate on this topic.
Affiliation(s)
- Joshua G. Schraiber
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089-2911
- Michael D. Edge
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089-2911
2. Schraiber JG, Edge MD, Pennell M. Unifying approaches from statistical genetics and phylogenetics for mapping phenotypes in structured populations. bioRxiv 2024:2024.02.10.579721. [PMID: 38496530 PMCID: PMC10942266 DOI: 10.1101/2024.02.10.579721] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique (including both the genetic relatedness matrix (GRM) and its leading eigenvectors, which correspond to the principal components of the genotype matrix, in a regression model) can mitigate spurious correlations in phylogenetic analyses.
As a case study of this, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
3. Shen Y, Masoero L, Schraiber JG, Broderick T. Double trouble: Predicting new variant counts across two heterogeneous populations. ArXiv 2024:arXiv:2403.02154v1. [PMID: 38495567 PMCID: PMC10942485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
Collecting genomics data across multiple heterogeneous populations (e.g., across different cancer types) has the potential to improve our understanding of disease. Despite sequencing advances, though, resources often remain a constraint when gathering data. So it would be useful for experimental design if experimenters with access to a pilot study could predict the number of new variants they might expect to find in a follow-up study: both the number of new variants shared between the populations and the total across the populations. While many authors have developed prediction methods for the single-population case, we show that these predictions can fare poorly across multiple populations that are heterogeneous. We prove that, surprisingly, a natural extension of a state-of-the-art single-population predictor to multiple populations fails for fundamental reasons. We provide the first predictor for the number of new shared variants and new total variants that can handle heterogeneity in multiple populations. We show that our proposed method works well empirically using real cancer and population genetics data.
4. Schraiber JG, Edge MD. Heritability within groups is uninformative about differences among groups: cases from behavioral, evolutionary, and statistical genetics. bioRxiv 2024:2023.11.06.565864. [PMID: 37986815 PMCID: PMC10659290 DOI: 10.1101/2023.11.06.565864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Without the ability to control or randomize environments (or genotypes), it is difficult to determine the degree to which observed phenotypic differences between two groups of individuals are due to genetic vs. environmental differences. However, some have suggested that these concerns may be limited to pathological cases, and methods have appeared that seem to give, directly or indirectly, some support to claims that aggregate heritable variation within groups can be related to heritable variation among groups. We consider three families of approaches: the "between-group heritability" sometimes invoked in behavior genetics, the statistic P_ST used in empirical work in evolutionary quantitative genetics, and methods based on variation in ancestry in an admixed population, used in anthropological and statistical genetics. We take up these examples to show mathematically that information on within-group genetic and phenotypic information in the aggregate cannot separate among-group differences into genetic and environmental components, and we provide simulation results that support our claims. We discuss these results in terms of the long-running debate on this topic.
Affiliation(s)
- Joshua G. Schraiber
  - Department of Quantitative and Computational Biology, University of Southern California
- Michael D. Edge
  - Department of Quantitative and Computational Biology, University of Southern California
5. Kuderna LFK, Ulirsch JC, Rashid S, Ameen M, Sundaram L, Hickey G, Cox AJ, Gao H, Kumar A, Aguet F, Christmas MJ, Clawson H, Haeussler M, Janiak MC, Kuhlwilm M, Orkin JD, Bataillon T, Manu S, Valenzuela A, Bergman J, Rouselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, Schraiber JG, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, Valsecchi J, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin AD, Guschanski K, Schierup MH, Beck RMD, Karakikes I, Wang KC, Umapathy G, Roos C, Boubli JP, Siepel A, Kundaje A, Paten B, Lindblad-Toh K, Rogers J, Marques Bonet T, Farh KKH. Identification of constrained sequence elements across 239 primate genomes. Nature 2024; 625:735-742. [PMID: 38030727 PMCID: PMC10808062 DOI: 10.1038/s41586-023-06798-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 10/30/2023] [Indexed: 12/01/2023]
Abstract
Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3-9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.
Affiliation(s)
- Lukas F K Kuderna
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Jacob C Ulirsch
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Sabrina Rashid
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Mohamed Ameen
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Laksshman Sundaram
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Glenn Hickey
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Anthony J Cox
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Hong Gao
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Arvind Kumar
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Francois Aguet
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Matthew J Christmas
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Hiram Clawson
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Mareike C Janiak
  - School of Science, Engineering and Environment, University of Salford, Salford, UK
- Martin Kuhlwilm
  - Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
  - Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Vienna, Austria
- Joseph D Orkin
  - Département d'Anthropologie, Université de Montréal, Montréal, Quebec, Canada
- Thomas Bataillon
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Shivakumara Manu
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- Alejandro Valenzuela
  - IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Juraj Bergman
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
  - Section for Ecoinformatics and Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark
- Felipe Ennes Silva
  - Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Tefé, Brazil
  - Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Brussels, Belgium
- Lidia Agueda
  - Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Julie Blanc
  - Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Marta Gut
  - Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Dorien de Vries
  - School of Science, Engineering and Environment, University of Salford, Salford, UK
- Ian Goodhead
  - School of Science, Engineering and Environment, University of Salford, Salford, UK
- R Alan Harris
  - Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Muthuswamy Raveendran
  - Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Axel Jensen
  - Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
- Julie E Horvath
  - North Carolina Museum of Natural Sciences, Raleigh, NC, USA
  - Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC, USA
  - Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
  - Department of Evolutionary Anthropology, Duke University, Durham, NC, USA
  - Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- David Juan
  - IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
- Joshua G Schraiber
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Fabrício Bertuol
  - Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
- Hazel Byrne
  - Department of Anthropology, University of Utah, Salt Lake City, UT, USA
- Izeni Farias
  - Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
- João Valsecchi
  - Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Brazil
  - Rede de Pesquisa em Diversidade, Conservação e Uso da Fauna da Amazônia - RedeFauna, Manaus, Brazil
  - Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica-ComFauna, Iquitos, Peru
- Malu Messias
  - Universidade Federal de Rondônia, Porto Velho, Brazil
- Mihir Trivedi
  - Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- Rogerio Rossi
  - Instituto de Biociências, Universidade Federal do Mato Grosso, Cuiabá, Brazil
- Tomas Hrbek
  - Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Brazil
  - Department of Biology, Trinity University, San Antonio, TX, USA
- Nicole Andriaholinirina
  - Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Clément J Rabarivola
  - Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Alphonse Zaramody
  - Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
- Clifford J Jolly
  - Department of Anthropology, New York University, New York, NY, USA
- Jane Phillips-Conroy
  - Department of Neuroscience, Washington University School of Medicine in St Louis, St Louis, MO, USA
- Gregory Wilkerson
  - Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
- Christian Abee
  - Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
- Joe H Simmons
  - Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX, USA
- Sree Kanthaswamy
  - School of Interdisciplinary Forensics, Arizona State University, Phoenix, AZ, USA
  - California National Primate Research Center, University of California, Davis, CA, USA
- Fekadu Shiferaw
  - Guinea Worm Eradication Program, The Carter Center Ethiopia, Addis Ababa, Ethiopia
- Dongdong Wu
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Long Zhou
  - Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
- Yong Shao
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Guojie Zhang
  - State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
  - Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou, China
  - Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Liangzhu Laboratory, Zhejiang University Medical Center, Hangzhou, China
  - Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, China
- Julius D Keyyu
  - Tanzania Wildlife Research Institute (TAWIRI), Arusha, Tanzania
- Sascha Knauf
  - Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Greifswald-Insel Riems, Germany
  - Professorship for International Animal Health/One Health, Faculty of Veterinary Medicine, Justus Liebig University, Giessen, Germany
- Minh D Le
  - Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi, Vietnam
- Esther Lizano
  - IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
- Stefan Merker
  - Department of Zoology, State Museum of Natural History Stuttgart, Stuttgart, Germany
- Arcadi Navarro
  - IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Barcelonaβeta Brain Research Center, Pasqual Maragall Foundation, Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Tilo Nadler
  - Cuc Phuong Commune, Nho Quan District, Vietnam
- Chiea Chuen Khor
  - Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
- Patrick Tan
  - Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
  - SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore, Singapore
  - Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
- Weng Khong Lim
  - SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore, Singapore
  - Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
  - SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore
- Andrew C Kitchener
  - Department of Natural Sciences, National Museums Scotland, Edinburgh, UK
  - School of Geosciences, Edinburgh, UK
- Dietmar Zinner
  - Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
  - Department of Primate Cognition, Georg-August-Universität Göttingen, Göttingen, Germany
  - Leibniz ScienceCampus Primate Cognition, Göttingen, Germany
- Ivo Gut
  - Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
- Amanda D Melin
  - Department of Anthropology and Archaeology, University of Calgary, Calgary, Alberta, Canada
  - Department of Medical Genetics, University of Calgary, Calgary, Alberta, Canada
  - Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada
- Katerina Guschanski
  - Department of Ecology and Genetics, Animal Ecology, Uppsala University, Uppsala, Sweden
  - Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Robin M D Beck
  - School of Science, Engineering and Environment, University of Salford, Salford, UK
- Ioannis Karakikes
  - Cardiovascular Institute, Stanford University, Stanford, CA, USA
  - Department of Cardiothoracic Surgery, Stanford University, Stanford, CA, USA
- Kevin C Wang
  - Department of Cancer Biology, Stanford University, Stanford, CA, USA
  - Department of Dermatology, Stanford University School of Medicine, Stanford, CA, USA
  - Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
- Govindhaswamy Umapathy
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
  - Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad, India
- Christian Roos
  - Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- Jean P Boubli
  - School of Science, Engineering and Environment, University of Salford, Salford, UK
- Adam Siepel
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Anshul Kundaje
  - Department of Computer Science, Stanford University, Stanford, CA, USA
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA
- Kerstin Lindblad-Toh
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jeffrey Rogers
  - Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Tomas Marques Bonet
  - IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra, Barcelona, Spain
  - Centro Nacional de Analisis Genomico (CNAG), Barcelona, Spain
  - Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Universitat Pompeu Fabra, Barcelona, Spain
- Kyle Kai-How Farh
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
6. Link V, Schraiber JG, Fan C, Dinh B, Mancuso N, Chiang CWK, Edge MD. Tree-based QTL mapping with expected local genetic relatedness matrices. Am J Hum Genet 2023; 110:2077-2091. [PMID: 38065072 PMCID: PMC10716520 DOI: 10.1016/j.ajhg.2023.10.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/08/2023] [Revised: 10/26/2023] [Accepted: 10/27/2023] [Indexed: 12/18/2023]
Abstract
Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWASs) are a powerful way to find genetic loci associated with phenotypes. GWASs are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix (local eGRM) given the ARG. Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to analyze two chromosomes containing known body size loci in a sample of Native Hawaiians. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.
Affiliation(s)
- Vivian Link
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Joshua G Schraiber
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Caoqi Fan
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Bryan Dinh
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Nicholas Mancuso
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Charleston W K Chiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA; Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Michael D Edge
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
7. Fiziev PP, McRae J, Ulirsch JC, Dron JS, Hamp T, Yang Y, Wainschtein P, Ni Z, Schraiber JG, Gao H, Cable D, Field Y, Aguet F, Fasnacht M, Metwally A, Rogers J, Marques-Bonet T, Rehm HL, O'Donnell-Luria A, Khera AV, Farh KKH. Rare penetrant mutations confer severe risk of common diseases. Science 2023; 380:eabo1131. [PMID: 37262146 DOI: 10.1126/science.abo1131] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/14/2022] [Accepted: 03/16/2023] [Indexed: 06/03/2023]
Abstract
We examined 454,712 exomes for genes associated with a wide spectrum of complex traits and common diseases and observed that rare, penetrant mutations in genes implicated by genome-wide association studies confer ~10-fold larger effects than common variants in the same genes. Consequently, an individual at the phenotypic extreme and at the greatest risk for severe, early-onset disease is better identified by a few rare penetrant variants than by the collective action of many common variants with weak effects. By combining rare variants across phenotype-associated genes into a unified genetic risk model, we demonstrate superior portability across diverse global populations compared with common-variant polygenic risk scores, greatly improving the clinical utility of genetic-based risk prediction.
Affiliation(s)
- Petko P Fiziev
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Jeremy McRae
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Jacob C Ulirsch
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Jacqueline S Dron
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Tobias Hamp
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Yanshen Yang
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Pierrick Wainschtein
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
| | - Zijian Ni
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Joshua G Schraiber
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Hong Gao
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Dylan Cable
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, MA 02142, USA
| | - Yair Field
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Francois Aguet
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Marc Fasnacht
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Ahmed Metwally
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Wisconsin National Primate Research Center, University of Wisconsin-Madison, Madison, WI 53715, USA
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, 08193 Barcelona, Spain
| | - Heidi L Rehm
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Anne O'Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA 02115, USA
| | - Amit V Khera
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Verve Therapeutics, Cambridge, MA 02215, USA
| | - Kyle Kai-How Farh
- Artificial Intelligence Laboratory, Illumina, Inc., San Diego, CA 92122, USA
| |
|
8
|
Kuderna LFK, Gao H, Janiak MC, Kuhlwilm M, Orkin JD, Bataillon T, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, Schraiber JG, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, Valsecchi J, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin AD, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Rogers J, Farh KKH, Marques Bonet T. A global catalog of whole-genome diversity from 233 primate species. Science 2023; 380:906-913. [PMID: 37262161 DOI: 10.1126/science.abn7829] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 12/19/2021] [Accepted: 02/06/2023] [Indexed: 06/03/2023]
Abstract
The rich diversity of morphology and behavior displayed across primate species provides an informative context in which to study the impact of genomic diversity on fundamental biological processes. Analysis of that diversity provides insight into long-standing questions in evolutionary and conservation biology and is urgent given severe threats these species are facing. Here, we present high-coverage whole-genome data from 233 primate species representing 86% of genera and all 16 families. This dataset was used, together with fossil calibration, to create a nuclear DNA phylogeny and to reassess evolutionary divergence times among primate clades. We found within-species genetic diversity across families and geographic regions to be associated with climate and sociality, but not with extinction risk. Furthermore, mutation rates differ across species, potentially influenced by effective population sizes. Lastly, we identified extensive recurrence of missense mutations previously thought to be human specific. This study will open a wide range of research avenues for future primate genomic research.
Affiliation(s)
- Lukas F K Kuderna
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA 94404, USA
| | - Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA 94404, USA
| | - Mareike C Janiak
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Martin Kuhlwilm
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Department of Evolutionary Anthropology, University of Vienna, Djerassiplatz 1, 1030 Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, Austria
| | - Joseph D Orkin
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Département d'anthropologie, Université de Montréal, 3150 Jean-Brillant, Montréal, QC H3T 1N8, Canada
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Alejandro Valenzuela
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
| | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
- Section for Ecoinformatics and Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark
| | | | - Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Estrada da Bexiga 2584, CEP 69553-225, Tefé, Amazonas, Brazil
- Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Av. Franklin D. Roosevelt 50, CP 160/12, B-1050 Brussels, Belgium
| | - Lidia Agueda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
| | - Julie Blanc
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
| | - Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
| | - Dorien de Vries
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Ian Goodhead
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - R Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
| | | | - Julie E Horvath
- North Carolina Museum of Natural Sciences, Raleigh, NC 27601, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC 27707, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | | - David Juan
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
| | | | - Joshua G Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA 94404, USA
| | | | - Fabrício Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas 69080-900, Brazil
| | - Hazel Byrne
- Department of Anthropology, University of Utah, Salt Lake City, UT 84102, USA
| | | | - Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas 69080-900, Brazil
| | - João Valsecchi
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Amazonas, Brazil
- Rede de Pesquisa para Estudos sobre Diversidade, Conservação e Uso da Fauna na Amazônia - RedeFauna, Manaus, Amazonas, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica - ComFauna, Iquitos, Loreto, Peru
| | - Malu Messias
- Universidade Federal de Rondônia, Porto Velho, Rondônia, Brazil
| | | | - Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Rogerio Rossi
- Instituto de Biociências, Universidade Federal do Mato Grosso, Cuiabá, MT, Brazil
| | - Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas 69080-900, Brazil
- Department of Biology, Trinity University, San Antonio, TX 78212, USA
| | - Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
| | - Clément J Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
| | - Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, Madagascar
| | - Clifford J Jolly
- Department of Anthropology, New York University, New York, NY 10003, USA
| | - Jane Phillips-Conroy
- Department of Neuroscience, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, USA
| | - Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX 78602, USA
| | - Christian Abee
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX 78602, USA
| | - Joe H Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Bastrop, TX 78602, USA
| | | | - Sree Kanthaswamy
- School of Mathematical and Natural Sciences, Arizona State University, Phoenix, AZ 85004, USA
| | - Fekadu Shiferaw
- Guinea Worm Eradication Program, The Carter Center Ethiopia, Addis Ababa, Ethiopia
| | - Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| | - Long Zhou
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
| | - Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| | - Guojie Zhang
- Center for Evolutionary and Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Villum Centre for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou 311121, China
- Women's Hospital, School of Medicine, Zhejiang University, 1 Xueshi Road, Shangcheng District, Hangzhou 310006, China
| | - Julius D Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Head Office, P.O. Box 661, Arusha, Tanzania
| | - Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, 17493 Greifswald-Insel Riems, Germany
| | - Minh D Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi, Vietnam
| | - Esther Lizano
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart, Stuttgart, Germany
| | - Arcadi Navarro
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra. Pg. Luís Companys 23, 08010 Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Av. Doctor Aiguader, N88, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, C. Wellington 30, 08005 Barcelona, Spain
| | - Tilo Nadler
- Cuc Phuong Commune, Nho Quan District, Ninh Binh Province, Vietnam
| | - Chiea Chuen Khor
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore
| | - Jessica Lee
- Mandai Nature, 80 Mandai Lake Road, Singapore
| | - Patrick Tan
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore
| | - Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore
| | - Andrew C Kitchener
- Department of Natural Sciences, National Museums Scotland, Chambers Street, Edinburgh EH1 1JF, UK, and School of Geosciences, Drummond Street, Edinburgh EH8 9XP, UK
| | - Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, 37077 Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen, 37077 Göttingen, Germany
- Leibniz ScienceCampus Primate Cognition, 37077 Göttingen, Germany
| | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
| | - Amanda D Melin
- Department of Anthropology and Archaeology, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
- Department of Medical Genetics, University of Calgary, 3330 Hospital Drive NW, HMRB 202, Calgary, AB T2N 4N1, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, 3330 Hospital Drive NW, HMRB 202, Calgary, AB T2N 4N1, Canada
| | - Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
| | | | - Robin M D Beck
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Kellnerweg 4, 37077 Göttingen, Germany
| | - Jean P Boubli
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA 94404, USA
| | - Tomas Marques Bonet
- IBE, Institute of Evolutionary Biology (UPF-CSIC), Department of Medicine and Life Sciences, Universitat Pompeu Fabra. PRBB, C. Doctor Aiguader N88, 08003 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri I Reixac 4, 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra. Pg. Luís Companys 23, 08010 Barcelona, Spain
| |
|
9
|
Gao H, Hamp T, Ede J, Schraiber JG, McRae J, Singer-Berk M, Yang Y, Dietrich ASD, Fiziev PP, Kuderna LFK, Sundaram L, Wu Y, Adhikari A, Field Y, Chen C, Batzoglou S, Aguet F, Lemire G, Reimers R, Balick D, Janiak MC, Kuhlwilm M, Orkin JD, Manu S, Valenzuela A, Bergman J, Rousselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath JE, Hvilsom C, Juan D, Frandsen P, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, do Amaral JV, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Bataillon T, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin A, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Lek M, Sunyaev S, O'Donnell-Luria A, Rehm HL, Xu J, Rogers J, Marques-Bonet T, Farh KKH. The landscape of tolerated genetic variation in humans and primates. Science 2023; 380:eabn8153. [PMID: 37262156 DOI: 10.1126/science.abn8197] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 12/31/2021] [Accepted: 03/22/2023] [Indexed: 06/03/2023]
Abstract
Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans. We show that these variants can be inferred to have nondeleterious effects in humans based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.
Affiliation(s)
- Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Tobias Hamp
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Jeffrey Ede
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Joshua G Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Jeremy McRae
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Moriel Singer-Berk
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
| | - Yanshen Yang
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | | | - Petko P Fiziev
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Lukas F K Kuderna
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Laksshman Sundaram
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Yibing Wu
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Aashish Adhikari
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Yair Field
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Chen Chen
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Serafim Batzoglou
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Francois Aguet
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
| | - Gabrielle Lemire
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Rebecca Reimers
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
| | - Daniel Balick
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Mareike C Janiak
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Martin Kuhlwilm
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Department of Evolutionary Anthropology, University of Vienna, Djerassiplatz 1, 1030 Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna, 1030 Vienna, Austria
| | - Joseph D Orkin
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Département d'anthropologie, Université de Montréal, 3150 Jean-Brillant, Montréal, QC H3T 1N8, Canada
| | - Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Alejandro Valenzuela
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
- Section for Ecoinformatics & Biodiversity, Department of Biology, Aarhus University, 8000 Aarhus, Denmark
| | | | - Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development, Estrada da Bexiga 2584, Tefé, Amazonas, CEP 69553-225, Brazil
- Evolutionary Biology and Ecology (EBE), Département de Biologie des Organismes, Université libre de Bruxelles (ULB), Av. Franklin D. Roosevelt 50, CP 160/12, B-1050 Brussels, Belgium
| | - Lidia Agueda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
| | - Julie Blanc
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
| | - Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
| | - Dorien de Vries
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Ian Goodhead
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - R Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
| | | | - Julie E Horvath
- North Carolina Museum of Natural Sciences, Raleigh, NC 27601, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University, Durham, NC 27707, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA
- Department of Evolutionary Anthropology, Duke University, Durham, NC 27708, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | | - David Juan
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | | | | | - Fabrício Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas, 69080-900, Brazil
| | - Hazel Byrne
- Department of Anthropology, University of Utah, Salt Lake City, UT 84102, USA
| | - Iracilda Sampaio
- Universidade Federal do Pará, Guamá, Belém - PA, 66075-110, Brazil
| | - Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas, 69080-900, Brazil
| | - João Valsecchi do Amaral
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development, Tefé, Amazonas, 69553-225, Brazil
- Rede de Pesquisa para Estudos sobre Diversidade, Conservação e Uso da Fauna na Amazônia - RedeFauna, Manaus, Amazonas, 69080-900, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica - ComFauna, Iquitos, Loreto, 16001, Peru
| | - Mariluce Messias
- Universidade Federal de Rondônia, Porto Velho, Rondônia, 78900-000, Brazil
- PPGREN - Programa de Pós-Graduação "Conservação e Uso dos Recursos Naturais" and BIONORTE - Programa de Pós-Graduação em Biodiversidade e Biotecnologia da Rede BIONORTE, Universidade Federal de Rondônia, Porto Velho, Rondônia, 78900-000, Brazil
| | - Maria N F da Silva
- Instituto Nacional de Pesquisas da Amazonia, Petrópolis, Manaus - AM, 69067-375, Brazil
| | - Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Rogerio Rossi
- Universidade Federal do Mato Grosso, Boa Esperança, Cuiabá - MT, 78060-900, Brazil
| | - Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL), Manaus, Amazonas, 69080-900, Brazil
- Department of Biology, Trinity University, San Antonio, TX 78212, USA
| | - Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, 401, Madagascar
| | - Clément J Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, 401, Madagascar
| | - Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga, Mahajanga, 401, Madagascar
| | | | | | - Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Christian Abee
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Joe H Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Eduardo Fernandez-Duque
- Yale University, New Haven, CT 06520, USA
- Universidad Nacional de Formosa, Argentina Fundacion ECO, Formosa, Argentina
| | | | - Fekadu Shiferaw
- Guinea Worm Eradication Program, The Carter Center Ethiopia, PoB 16316, Addis Ababa 1000, Ethiopia
| | - Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| | - Long Zhou
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
| | - Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| | - Guojie Zhang
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou 310058, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen, DK-2100 Copenhagen, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center, 1369 West Wenyi Road, Hangzhou 311121, China
- Women's Hospital, School of Medicine, Zhejiang University, 1 Xueshi Road, Shangcheng District, Hangzhou 310006, China
| | - Julius D Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Head Office, P.O. Box 661, Arusha, Tanzania
| | - Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, 17493 Greifswald-Insel Riems, Germany
| | - Minh D Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University, Hanoi 100000, Vietnam
| | - Esther Lizano
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010 Barcelona, Spain
| | - Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart, 70191 Stuttgart, Germany
| | - Arcadi Navarro
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Av. Doctor Aiguader, N88, 08003 Barcelona, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation, C. Wellington 30, 08005 Barcelona, Spain
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University, Aarhus 8000, Denmark
| | - Tilo Nadler
- Cuc Phuong Commune, Nho Quan District, Ninh Binh Province 430000, Vietnam
| | - Chiea Chuen Khor
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
| | - Jessica Lee
- Mandai Nature, 80 Mandai Lake Road, Singapore 729826, Republic of Singapore
| | - Patrick Tan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore 168582, Republic of Singapore
| | - Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM), Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore 168582, Republic of Singapore
- SingHealth Duke-NUS Genomic Medicine Centre, Singapore 168582, Republic of Singapore
| | - Andrew C Kitchener
- Department of Natural Sciences, National Museums Scotland, Chambers Street, Edinburgh EH1 1JF, UK
- School of Geosciences, University of Edinburgh, Drummond Street, Edinburgh EH8 9XP, UK
| | - Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research, 37077 Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen, 37077 Göttingen, Germany
- Leibniz Science Campus Primate Cognition, 37077 Göttingen, Germany
| | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra, Pg. Luís Companys 23, 08010 Barcelona, Spain
| | - Amanda Melin
- Department of Anthropology & Archaeology, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
- Department of Medical Genetics, 3330 Hospital Drive NW, HMRB 202, Calgary, AB T2N 4N1, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada
| | - Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University, SE-75236 Uppsala, Sweden
- Institute of Ecology and Evolution, School of Biological Sciences, University of Edinburgh, Edinburgh EH8 9XP, UK
| | | | - Robin M D Beck
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500007, India
| | - Christian Roos
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Kellnerweg 4, 37077 Göttingen, Germany
| | - Jean P Boubli
- School of Science, Engineering & Environment, University of Salford, Salford M5 4WT, UK
| | - Monkol Lek
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA
| | - Shamil Sunyaev
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Anne O'Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Heidi L Rehm
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Boston, MA, 02142, USA
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Jinbo Xu
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
- Toyota Technological Institute at Chicago, Chicago, IL 60637, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc., Foster City, CA, 94404, USA
10
Fiziev P, McRae J, Ulirsch JC, Dron JS, Hamp T, Yang Y, Wainschtein P, Ni Z, Schraiber JG, Gao H, Cable D, Field Y, Aguet F, Fasnacht M, Metwally A, Rogers J, Marques-Bonet T, Rehm HL, O’Donnell-Luria A, Khera AV, Kai-How Farh K. Rare penetrant mutations confer severe risk of common diseases. medRxiv 2023:2023.05.01.23289356. [PMID: 37205493 PMCID: PMC10187340 DOI: 10.1101/2023.05.01.23289356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
We examined 454,712 exomes for genes associated with a wide spectrum of complex traits and common diseases and observed that rare, penetrant mutations in genes implicated by genome-wide association studies confer ∼10-fold larger effects than common variants in the same genes. Consequently, an individual at the phenotypic extreme and at the greatest risk for severe, early-onset disease is better identified by a few rare penetrant variants than by the collective action of many common variants with weak effects. By combining rare variants across phenotype-associated genes into a unified genetic risk model, we demonstrate superior portability across diverse global populations compared to common variant polygenic risk scores, greatly improving the clinical utility of genetic-based risk prediction.
One-sentence summary: Rare variant polygenic risk scores identify individuals with outlier phenotypes in common human diseases and complex traits.
Affiliation(s)
- Petko Fiziev
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Jeremy McRae
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Jacob C. Ulirsch
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Jacqueline S. Dron
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Cambridge, Massachusetts 02142, USA
| | - Tobias Hamp
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Yanshen Yang
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Pierrick Wainschtein
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Zijian Ni
- Department of Statistics, UW Madison; Madison, Wisconsin 53706, USA
| | - Joshua G. Schraiber
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Hong Gao
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Dylan Cable
- Department of Electrical Engineering and Computer Science, MIT; Cambridge, Massachusetts 02142, USA
| | - Yair Field
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Francois Aguet
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Marc Fasnacht
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Ahmed Metwally
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas 77030, USA
- Wisconsin National Primate Research Center, University of Wisconsin; Madison 53715, USA
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC); 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies (ICREA); 08010 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona; 08193 Barcelona, Spain
| | - Heidi L. Rehm
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Cambridge, Massachusetts 02142, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital; Boston, Massachusetts 02114, USA
| | - Anne O’Donnell-Luria
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Cambridge, Massachusetts 02142, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital; Boston, Massachusetts 02114, USA
- Division of Genetics and Genomics, Boston Children’s Hospital; Boston, Massachusetts 02115, USA
| | - Amit V. Khera
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Cambridge, Massachusetts 02142, USA
- Verve Therapeutics, Cambridge, Massachusetts 02215, USA
| | - Kyle Kai-How Farh
- Artificial Intelligence Laboratory, Illumina, Inc.; San Diego, California 92122, USA
11
Gao H, Hamp T, Ede J, Schraiber JG, McRae J, Singer-Berk M, Yang Y, Dietrich A, Fiziev P, Kuderna L, Sundaram L, Wu Y, Adhikari A, Field Y, Chen C, Batzoglou S, Aguet F, Lemire G, Reimers R, Balick D, Janiak MC, Kuhlwilm M, Orkin JD, Manu S, Valenzuela A, Bergman J, Rouselle M, Silva FE, Agueda L, Blanc J, Gut M, de Vries D, Goodhead I, Harris RA, Raveendran M, Jensen A, Chuma IS, Horvath J, Hvilsom C, Juan D, Frandsen P, de Melo FR, Bertuol F, Byrne H, Sampaio I, Farias I, do Amaral JV, Messias M, da Silva MNF, Trivedi M, Rossi R, Hrbek T, Andriaholinirina N, Rabarivola CJ, Zaramody A, Jolly CJ, Phillips-Conroy J, Wilkerson G, Abee C, Simmons JH, Fernandez-Duque E, Kanthaswamy S, Shiferaw F, Wu D, Zhou L, Shao Y, Zhang G, Keyyu JD, Knauf S, Le MD, Lizano E, Merker S, Navarro A, Bataillon T, Nadler T, Khor CC, Lee J, Tan P, Lim WK, Kitchener AC, Zinner D, Gut I, Melin A, Guschanski K, Schierup MH, Beck RMD, Umapathy G, Roos C, Boubli JP, Lek M, Sunyaev S, O’Donnell A, Rehm H, Xu J, Rogers J, Marques-Bonet T, Kai-How Farh K. The landscape of tolerated genetic variation in humans and primates. bioRxiv 2023:2023.05.01.538953. [PMID: 37205491 PMCID: PMC10187174 DOI: 10.1101/2023.05.01.538953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole genome sequencing data for 809 individuals from 233 primate species, and identified 4.3 million common protein-altering variants with orthologs in human. We show that these variants can be inferred to have non-deleterious effects in human based on their presence at high allele frequencies in other primate populations. We use this resource to classify 6% of all possible human protein-altering variants as likely benign and impute the pathogenicity of the remaining 94% of variants with deep learning, achieving state-of-the-art accuracy for diagnosing pathogenic variants in patients with genetic diseases.
One-sentence summary: Deep learning classifier trained on 4.3 million common primate missense variants predicts variant pathogenicity in humans.
Affiliation(s)
- Hong Gao
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Tobias Hamp
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Jeffrey Ede
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Joshua G. Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Jeremy McRae
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Moriel Singer-Berk
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
| | - Yanshen Yang
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Anastasia Dietrich
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Petko Fiziev
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Lukas Kuderna
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Laksshman Sundaram
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Yibing Wu
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Aashish Adhikari
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Yair Field
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Chen Chen
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Serafim Batzoglou
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Francois Aguet
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
| | - Gabrielle Lemire
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Rebecca Reimers
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Daniel Balick
- Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Mareike C. Janiak
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Martin Kuhlwilm
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Department of Evolutionary Anthropology, University of Vienna; Djerassiplatz 1, 1030, Vienna, Austria
- Human Evolution and Archaeological Sciences (HEAS), University of Vienna; 1030, Vienna, Austria
| | - Joseph D. Orkin
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Département d’anthropologie, Université de Montréal; 3150 Jean-Brillant, Montréal, QC, H3T 1N8, Canada
| | - Shivakumara Manu
- Academy of Scientific and Innovative Research (AcSIR); Ghaziabad, 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology; Hyderabad, 500007, India
| | - Alejandro Valenzuela
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Juraj Bergman
- Bioinformatics Research Centre, Aarhus University; Aarhus, 8000, Denmark
- Section for Ecoinformatics & Biodiversity, Department of Biology, Aarhus University; Aarhus, 8000, Denmark
| | | | - Felipe Ennes Silva
- Research Group on Primate Biology and Conservation, Mamirauá Institute for Sustainable Development; Estrada da Bexiga 2584, Tefé, Amazonas, CEP 69553-225, Brazil
- Faculty of Sciences, Department of Organismal Biology, Unit of Evolutionary Biology and Ecology, Université Libre de Bruxelles (ULB); Avenue Franklin D. Roosevelt 50, 1050, Brussels, Belgium
| | - Lidia Agueda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
| | - Julie Blanc
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
| | - Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
| | - Dorien de Vries
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Ian Goodhead
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - R. Alan Harris
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas, 77030, USA
| | - Muthuswamy Raveendran
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas, 77030, USA
| | - Axel Jensen
- Department of Ecology and Genetics, Animal Ecology, Uppsala University; SE-75236, Uppsala, Sweden
| | | | - Julie Horvath
- North Carolina Museum of Natural Sciences; Raleigh, North Carolina, 27601, USA
- Department of Biological and Biomedical Sciences, North Carolina Central University; Durham, North Carolina, 27707, USA
- Department of Biological Sciences, North Carolina State University; Raleigh, North Carolina, 27695, USA
- Department of Evolutionary Anthropology, Duke University; Durham, North Carolina, 27708, USA
- Renaissance Computing Institute, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | | | - David Juan
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | | | | | - Fabricio Bertuol
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL); Manaus, Amazonas, 69080-900, Brazil
| | - Hazel Byrne
- Department of Anthropology, University of Utah; Salt Lake City, Utah, 84102, USA
| | - Iracilda Sampaio
- Universidade Federal do Para; Guamá, Belém - PA, 66075-110, Brazil
| | - Izeni Farias
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL); Manaus, Amazonas, 69080-900, Brazil
| | - João Valsecchi do Amaral
- Research Group on Terrestrial Vertebrate Ecology, Mamirauá Institute for Sustainable Development; Tefé, Amazonas, 69553-225, Brazil
- Rede de Pesquisa para Estudos sobre Diversidade, Conservação e Uso da Fauna na Amazônia – RedeFauna; Manaus, Amazonas, 69080-900, Brazil
- Comunidad de Manejo de Fauna Silvestre en la Amazonía y en Latinoamérica – ComFauna; Iquitos, Loreto, 16001, Peru
| | - Mariluce Messias
- Universidade Federal de Rondonia; Porto Velho, Rondônia, 78900-000, Brazil
- PPGREN - Programa de Pós-Graduação “Conservação e Uso dos Recursos Naturais and BIONORTE - Programa de Pós-Graduação em Biodiversidade e Biotecnologia da Rede BIONORTE, Universidade Federal de Rondonia; Porto Velho, Rondônia, 78900-000, Brazil
| | - Maria N. F. da Silva
- Instituto Nacional de Pesquisas da Amazonia; Petrópolis, Manaus - AM, 69067-375, Brazil
| | - Mihir Trivedi
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology; Hyderabad, 500007, India
| | - Rogerio Rossi
- Universidade Federal do Mato Grosso; Boa Esperança, Cuiabá - MT, 78060-900, Brazil
| | - Tomas Hrbek
- Universidade Federal do Amazonas, Departamento de Genética, Laboratório de Evolução e Genética Animal (LEGAL); Manaus, Amazonas, 69080-900, Brazil
- Department of Biology, Trinity University; San Antonio, Texas, 78212, USA
| | - Nicole Andriaholinirina
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga; Mahajanga, 401, Madagascar
| | - Clément J. Rabarivola
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga; Mahajanga, 401, Madagascar
| | - Alphonse Zaramody
- Life Sciences and Environment, Technology and Environment of Mahajanga, University of Mahajanga; Mahajanga, 401, Madagascar
| | | | | | - Gregory Wilkerson
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center; Houston, Texas, 77030, USA
| | | | - Joe H. Simmons
- Keeling Center for Comparative Medicine and Research, MD Anderson Cancer Center; Houston, Texas, 77030, USA
| | - Eduardo Fernandez-Duque
- Yale University; New Haven, Connecticut, 06520, USA
- Universidad Nacional de Formosa, Argentina Fundacion ECO, Formosa, Argentina
| | | | | | - Dongdong Wu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences; Kunming, Yunnan, 650223, China
| | - Long Zhou
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China
| | - Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences; Kunming, Yunnan, 650223, China
| | - Guojie Zhang
- Center for Evolutionary & Organismal Biology, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen; Copenhagen, DK-2100, Denmark
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, 650223, China
- Liangzhu Laboratory, Zhejiang University Medical Center; 1369 West Wenyi Road, Hangzhou, 311121, China
- Women’s Hospital, School of Medicine, Zhejiang University; 1 Xueshi Road, Shangcheng District, Hangzhou, 310006, China
| | - Julius D. Keyyu
- Tanzania Wildlife Research Institute (TAWIRI), Head Office; P.O.Box 661, Arusha, Tanzania
| | - Sascha Knauf
- Institute of International Animal Health/One Health, Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health; 17493 Greifswald - Isle of Riems, Germany
| | - Minh D. Le
- Department of Environmental Ecology, Faculty of Environmental Sciences, University of Science and Central Institute for Natural Resources and Environmental Studies, Vietnam National University; Hanoi, 100000, Vietnam
| | - Esther Lizano
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
| | - Stefan Merker
- Department of Zoology, State Museum of Natural History Stuttgart; 70191 Stuttgart, Germany
| | - Arcadi Navarro
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra, Pg. Luís Companys 23, Barcelona, 08010, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology; Av. Doctor Aiguader, N88, Barcelona, 08003, Spain
- BarcelonaBeta Brain Research Center, Pasqual Maragall Foundation; C. Wellington 30, Barcelona, 08005, Spain
| | - Thomas Bataillon
- Bioinformatics Research Centre, Aarhus University; Aarhus, 8000, Denmark
| | - Tilo Nadler
- Cuc Phuong Commune; Nho Quan District, Ninh Binh Province, 430000, Vietnam
| | - Chiea Chuen Khor
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
| | - Jessica Lee
- Mandai Nature; 80 Mandai Lake Road, Singapore 729826, Republic of Singapore
| | - Patrick Tan
- Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Genome, Singapore 138672, Republic of Singapore
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM); Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School; Singapore 168582, Republic of Singapore
| | - Weng Khong Lim
- SingHealth Duke-NUS Institute of Precision Medicine (PRISM); Singapore 168582, Republic of Singapore
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School; Singapore 168582, Republic of Singapore
- SingHealth Duke-NUS Genomic Medicine Centre; Singapore 168582, Republic of Singapore
| | - Andrew C. Kitchener
- Department of Natural Sciences, National Museums Scotland; Chambers Street, Edinburgh, EH1 1JF, UK
- School of Geosciences, University of Edinburgh; Drummond Street, Edinburgh, EH8 9XP, UK
| | - Dietmar Zinner
- Cognitive Ethology Laboratory, German Primate Center, Leibniz Institute for Primate Research; 37077 Göttingen, Germany
- Department of Primate Cognition, Georg-August-Universität Göttingen; 37077 Göttingen, Germany
| | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
- Universitat Pompeu Fabra, Pg. Luís Companys 23, Barcelona, 08010, Spain
| | - Amanda Melin
- Leibniz Science Campus Primate Cognition; 37077 Göttingen, Germany
- Department of Anthropology & Archaeology and Department of Medical Genetics
| | - Katerina Guschanski
- Department of Ecology and Genetics, Animal Ecology, Uppsala University; SE-75236, Uppsala, Sweden
- Alberta Children’s Hospital Research Institute; University of Calgary; 2500 University Dr NW T2N 1N4, Calgary, Alberta, Canada
| | | | - Robin M. D. Beck
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Govindhaswamy Umapathy
- Academy of Scientific and Innovative Research (AcSIR); Ghaziabad, 201002, India
- Laboratory for the Conservation of Endangered Species, CSIR-Centre for Cellular and Molecular Biology; Hyderabad, 500007, India
| | - Christian Roos
- Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh; Edinburgh, EH8 9XP, UK
| | - Jean P. Boubli
- School of Science, Engineering & Environment, University of Salford; Salford, M5 4WT, United Kingdom
| | - Monkol Lek
- Gene Bank of Primates and Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research; Kellnerweg 4, 37077 Göttingen, Germany
| | - Shamil Sunyaev
- Division of Genetics, Brigham and Women’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
- Department of Genetics, Yale School of Medicine; New Haven, Connecticut, 06520, USA
| | - Anne O’Donnell
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital, Harvard Medical School; Boston, Massachusetts, 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, 02115, USA
| | - Heidi Rehm
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard; Boston, Massachusetts, 02142, USA
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School; Boston, Massachusetts, 02115, USA
| | - Jinbo Xu
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
- Toyota Technological Institute at Chicago; Chicago, Illinois, 60637, USA
| | - Jeffrey Rogers
- Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine; Houston, Texas, 77030, USA
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC); PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST); Baldiri i Reixac 4, 08028, Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Barcelona, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA) and Universitat Pompeu Fabra, Pg. Luís Companys 23, Barcelona, 08010, Spain
| | - Kyle Kai-How Farh
- Illumina Artificial Intelligence Laboratory, Illumina Inc.; Foster City, California, 94404, USA
12
Link V, Schraiber JG, Fan C, Dinh B, Mancuso N, Chiang CW, Edge MD. Tree-based QTL mapping with expected local genetic relatedness matrices. bioRxiv 2023:2023.04.07.536093. [PMID: 37066144 PMCID: PMC10104234 DOI: 10.1101/2023.04.07.536093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide Association Studies (GWAS) are a powerful way to find genetic loci associated with phenotypes. GWAS are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix given the ARG (local eGRM). Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to identify a large-effect BMI locus, the CREBRF gene, in a sample of Native Hawaiians in which it was not previously detectable by GWAS because of a lack of population-specific imputation resources. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.
Affiliation(s)
- Vivian Link
- Department of Quantitative and Computational Biology, University of Southern California
| | - Joshua G. Schraiber
- Department of Quantitative and Computational Biology, University of Southern California
| | - Caoqi Fan
- Department of Quantitative and Computational Biology, University of Southern California
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
| | - Bryan Dinh
- Department of Quantitative and Computational Biology, University of Southern California
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
| | - Nicholas Mancuso
- Department of Quantitative and Computational Biology, University of Southern California
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
| | - Charleston W.K. Chiang
- Department of Quantitative and Computational Biology, University of Southern California
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
| | - Michael D. Edge
- Department of Quantitative and Computational Biology, University of Southern California
13
Gopalakrishnan S, Ebenesersdóttir SS, Lundstrøm IKC, Turner-Walker G, Moore KHS, Luisi P, Margaryan A, Martin MD, Ellegaard MR, Magnússon ÓÞ, Sigurðsson Á, Snorradóttir S, Magnúsdóttir DN, Laffoon JE, van Dorp L, Liu X, Moltke I, Ávila-Arcos MC, Schraiber JG, Rasmussen S, Juan D, Gelabert P, de-Dios T, Fotakis AK, Iraeta-Orbegozo M, Vågene ÅJ, Denham SD, Christophersen A, Stenøien HK, Vieira FG, Liu S, Günther T, Kivisild T, Moseng OG, Skar B, Cheung C, Sandoval-Velasco M, Wales N, Schroeder H, Campos PF, Guðmundsdóttir VB, Sicheritz-Ponten T, Petersen B, Halgunset J, Gilbert E, Cavalleri GL, Hovig E, Kockum I, Olsson T, Alfredsson L, Hansen TF, Werge T, Willerslev E, Balloux F, Marques-Bonet T, Lalueza-Fox C, Nielsen R, Stefánsson K, Helgason A, Gilbert MTP. The population genomic legacy of the second plague pandemic. Curr Biol 2022; 32:4743-4751.e6. [PMID: 36182700 DOI: 10.1016/j.cub.2022.09.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/07/2022] [Revised: 06/15/2022] [Accepted: 09/09/2022] [Indexed: 11/18/2022]
Abstract
Human populations have been shaped by catastrophes that may have left long-lasting signatures in their genomes. One notable example is the second plague pandemic that entered Europe in ca. 1347 CE and repeatedly returned for over 300 years, with typical village and town mortality estimated at 10%-40% [1]. It is assumed that this high mortality affected the gene pools of these populations. First, local population crashes reduced genetic diversity. Second, a change in frequency is expected for sequence variants that may have affected survival or susceptibility to the etiologic agent (Yersinia pestis) [2]. Third, mass mortality might alter the local gene pools through its impact on subsequent migration patterns. We explored these factors using the Norwegian city of Trondheim as a model, by sequencing 54 genomes spanning three time periods: (1) prior to the plague striking Trondheim in 1349 CE, (2) the 17th-19th century, and (3) the present. We find that the pandemic period shaped the gene pool by reducing long-distance immigration, in particular from the British Isles, and inducing a bottleneck that reduced genetic diversity. Although we also observe an excess of large FST values at multiple loci in the genome, these are shaped by reference biases introduced by mapping our relatively low-coverage degraded DNA to the reference genome. This implies that attempts to detect selection using ancient DNA (aDNA) datasets that vary by read length and depth of sequencing coverage may be particularly challenging until methods have been developed to account for the impact of differential reference bias on test statistics.
Affiliation(s)
- Shyam Gopalakrishnan
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark.
| | - S Sunna Ebenesersdóttir
- deCODE Genetics, AMGEN Inc., Sturlugata 8, 102 Reykjavík, Iceland; Department of Anthropology, School of Social Sciences, University of Iceland, Gimli, Sæmundargata, 102 Reykjavík, Iceland
| | - Inge K C Lundstrøm
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Gordon Turner-Walker
- National Yunlin University of Science & Technology, 123 University Road, Section 3, 64002 Douliu, Yun-Lin County, Taiwan; Department of Archaeology and Anthropology, National Museum of Natural Science, 1 Guanqian Road, North District Taichung City 404023, Taiwan
| | | | - Pierre Luisi
- Facultad de Filosofía y Humanidades, Universidad Nacional de Córdoba, Córdoba, Argentina; Microbial Paleogenomics Unit, Institut Pasteur, 25-28 Rue du Dr Roux, 75015 Paris, France
| | - Ashot Margaryan
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Michael D Martin
- NTNU University Museum, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| | - Martin Rene Ellegaard
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; NTNU University Museum, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| | | | | | | | | | - Jason E Laffoon
- Department of Archaeological Sciences, Faculty of Archaeology, Leiden University, Leiden, the Netherlands
| | - Lucy van Dorp
- UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
| | - Xiaodong Liu
- Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Ida Moltke
- Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - María C Ávila-Arcos
- International Laboratory for Human Genome Research, Laboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH), Universidad Nacional Autónoma de México (UNAM), 3001 Boulevard Juriquilla, 76230 Querétaro, Mexico
| | - Joshua G Schraiber
- Illumina Artificial Intelligence Laboratory, Illumina Inc., San Diego, CA, USA
| | - Simon Rasmussen
- Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3, 2200 Copenhagen, Denmark
| | - David Juan
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Pere Gelabert
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain; Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
| | - Toni de-Dios
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Anna K Fotakis
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Miren Iraeta-Orbegozo
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Åshild J Vågene
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; Max Planck Institute for the Science of Human History, Kahlaische Strasse 10, 07745 Jena, Germany; Institute for Archaeological Sciences, University of Tübingen, Tübingen, Germany
| | | | - Axel Christophersen
- NTNU University Museum, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| | - Hans K Stenøien
- NTNU University Museum, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| | - Filipe G Vieira
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Shanlin Liu
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; China National GeneBank, BGI-Shenzhen, Shenzhen 518083, China
| | - Torsten Günther
- Evolutionsbiologisk Centrum EBC, Norbyv. 18A, 752 36 Uppsala, Sweden
| | - Toomas Kivisild
- KU Leuven, Herestraat 49, 3000 Leuven, Belgium; Institute of Genomics, University of Tartu, Riia 23b, 51010 Tartu, Estonia
| | - Ole Georg Moseng
- Department of Business, History and Social Sciences, University of South-Eastern Norway, Notodden, Norway
| | - Birgitte Skar
- NTNU University Museum, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| | - Christina Cheung
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; EA - Eco-anthropologie (UMR 7206), Muséum National d'Histoire Naturelle, CNRS, Université Paris Diderot, Paris, France
| | - Marcela Sandoval-Velasco
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Nathan Wales
- Department of Archaeology, Kings Manor and Principals House, University of York, Exhibition Square, York YO1 7EP, UK
| | - Hannes Schroeder
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark
| | - Paula F Campos
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; CIIMAR, Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Avenida General Norton de Matos, Matosinhos, Portugal
| | - Valdís B Guðmundsdóttir
- deCODE Genetics, AMGEN Inc., Sturlugata 8, 102 Reykjavík, Iceland; Department of Anthropology, School of Social Sciences, University of Iceland, Gimli, Sæmundargata, 102 Reykjavík, Iceland
| | - Thomas Sicheritz-Ponten
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; Centre of Excellence for Omics-Driven Computational Biodiscovery (COMBio), Faculty of Applied Sciences, Asian Institute of Medicine, Science and Technology (AIMST), 08100 Bedong, Kedah, Malaysia
| | - Bent Petersen
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; Centre of Excellence for Omics-Driven Computational Biodiscovery (COMBio), Faculty of Applied Sciences, Asian Institute of Medicine, Science and Technology (AIMST), 08100 Bedong, Kedah, Malaysia
| | | | - Edmund Gilbert
- School of Pharmacy and Biomolecular Sciences, RCSI, Dublin, Ireland; FutureNeuro SFI Research Centre, RCSI, Dublin, Ireland
| | - Gianpiero L Cavalleri
- School of Pharmacy and Biomolecular Sciences, RCSI, Dublin, Ireland; FutureNeuro SFI Research Centre, RCSI, Dublin, Ireland
| | - Eivind Hovig
- Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway; Center for Bioinformatics, Department of Informatics, University of Oslo, Oslo, Norway
| | - Ingrid Kockum
- Center for Molecular Medicine, Department of Clinical Neuroscience, Neuroimmunology Unit, Karolinska Institutet, Stockholm, Sweden
| | - Tomas Olsson
- Center for Molecular Medicine, Department of Clinical Neuroscience, Neuroimmunology Unit, Karolinska Institutet, Stockholm, Sweden
| | - Lars Alfredsson
- Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
| | - Thomas F Hansen
- Institute of Biological Psychiatry, Copenhagen Mental Health Services, Copenhagen, Denmark; Danish Headache Center, Department of Neurology, Copenhagen University Hospital, 2600 Glostrup, Denmark
| | - Thomas Werge
- Institute of Biological Psychiatry, Copenhagen Mental Health Services, Copenhagen, Denmark; Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark; The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Copenhagen, Denmark; The Globe Institute, Lundbeck Foundation Center for Geogenetics, Øster Voldgade 5-7, 1350 Copenhagen K, Denmark
| | - Eske Willerslev
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK
| | - Francois Balloux
- UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, Darwin Building, Gower Street, London WC1E 6BT, UK
| | - Tomas Marques-Bonet
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain; Catalan Institution of Research and Advanced Studies (ICREA), Passeig de Lluís Companys, 23, 08010 Barcelona, Spain; CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain; Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona, Edifici ICTA-ICP, c/ Columnes s/n, 08193 Cerdanyola del Vallès, Barcelona, Spain
| | - Carles Lalueza-Fox
- Institute of Evolutionary Biology (UPF-CSIC), PRBB, Dr. Aiguader 88, 08003 Barcelona, Spain; Museu de Ciències Naturals de Barcelona, 08019 Barcelona, Spain
| | - Rasmus Nielsen
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; Department of Integrative Biology, University of California, Berkeley, 3060 Valley Life Sciences Bldg #3140, Berkeley, CA 94720-3140, USA
| | - Kári Stefánsson
- deCODE Genetics, AMGEN Inc., Sturlugata 8, 102 Reykjavík, Iceland; Faculty of Medicine, University of Iceland, Reykjavík, Iceland
| | - Agnar Helgason
- deCODE Genetics, AMGEN Inc., Sturlugata 8, 102 Reykjavík, Iceland; Department of Anthropology, School of Social Sciences, University of Iceland, Gimli, Sæmundargata, 102 Reykjavík, Iceland
| | - M Thomas P Gilbert
- The GLOBE Institute, Faculty of Health and Medical Sciences, University of Copenhagen, Øster Farimagsgade 5A, 1353 Copenhagen, Denmark; NTNU University Museum, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
| |
|
14
|
Hateley S, Lopez-Izquierdo A, Jou CJ, Cho S, Schraiber JG, Song S, Maguire CT, Torres N, Riedel M, Bowles NE, Arrington CB, Kennedy BJ, Etheridge SP, Lai S, Pribble C, Meyers L, Lundahl D, Byrnes J, Granka JM, Kauffman CA, Lemmon G, Boyden S, Scott Watkins W, Karren MA, Knight S, Brent Muhlestein J, Carlquist JF, Anderson JL, Chahine KG, Shah KU, Ball CA, Benjamin IJ, Yandell M, Tristani-Firouzi M. The history and geographic distribution of a KCNQ1 atrial fibrillation risk allele. Nat Commun 2021; 12:6442. [PMID: 34750360 PMCID: PMC8575962 DOI: 10.1038/s41467-021-26741-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/13/2020] [Accepted: 10/20/2021] [Indexed: 11/08/2022] Open
Abstract
The genetic architecture of atrial fibrillation (AF) encompasses low impact, common genetic variants and high impact, rare variants. Here, we characterize a high impact AF-susceptibility allele, KCNQ1 R231H, and describe its transcontinental geographic distribution and history. Induced pluripotent stem cell-derived cardiomyocytes procured from risk allele carriers exhibit abbreviated action potential duration, consistent with a gain-of-function effect. Using identity-by-descent (IBD) networks, we estimate the broad- and fine-scale population ancestry of risk allele carriers and their relatives. Analysis of ancestral migration routes reveals ancestors who inhabited Denmark in the 1700s, migrated to the Northeastern United States in the early 1800s, and traveled across the Midwest to arrive in Utah in the late 1800s. IBD/coalescent-based allele dating analysis reveals a relatively recent origin of the AF risk allele (~5000 years). Thus, our approach broadens the scope of study for disease susceptibility alleles to the context of human migration and ancestral origins.
Affiliation(s)
| | | | - Chuanchau J Jou
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Scott Cho
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | | | | | - Colin T Maguire
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Natalia Torres
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Michael Riedel
- Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Neil E Bowles
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Cammon B Arrington
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Brett J Kennedy
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Susan P Etheridge
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Shuping Lai
- Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Chase Pribble
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Lindsay Meyers
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Derek Lundahl
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA
| | | | | | - Christopher A Kauffman
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Gordon Lemmon
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Steven Boyden
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - W Scott Watkins
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Mary Anne Karren
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | | | | | | | | | | | - Khushi U Shah
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA
| | | | - Ivor J Benjamin
- Cardiovascular Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Mark Yandell
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - Martin Tristani-Firouzi
- Nora Eccles Harrison CVRTI, University of Utah School of Medicine, Salt Lake City, UT, USA.
- Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA.
| |
|
15
|
Wang Y, Song S, Schraiber JG, Sedghifar A, Byrnes JK, Turissini DA, Hong EL, Ball CA, Noto K. Ancestry inference using reference labeled clusters of haplotypes. BMC Bioinformatics 2021; 22:459. [PMID: 34563119 PMCID: PMC8466715 DOI: 10.1186/s12859-021-04350-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2021] [Accepted: 08/31/2021] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND: We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands, then annotating those models with population information from reference panels of known ancestry. RESULTS: The running time of ARCHes does not depend on the size of a reference panel because training and testing are separate processes, and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel (in our experiments, it averages less than one minute to assign ancestry from 32 populations using 10 CPU). We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP) as well as simulated examples of known admixture. CONCLUSIONS: Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales regardless of the amount of population admixture.
Affiliation(s)
- Yong Wang
- AncestryDNA, San Francisco, CA, 94107, USA
| | - Shiya Song
- AncestryDNA, San Francisco, CA, 94107, USA
| | | | | | | | | | | | | | - Keith Noto
- AncestryDNA, San Francisco, CA, 94107, USA.
| |
|
16
|
Abstract
Neanderthals and anatomically modern humans overlapped geographically for a period of over 30,000 years following human migration out of Africa. During this period, Neanderthals and humans interbred, as evidenced by Neanderthal portions of the genome carried by non-African individuals today. A key observation is that the proportion of Neanderthal ancestry is ~12-20% higher in East Asian individuals relative to European individuals. Here, we explore various demographic models that could explain this observation. These include distinguishing between a single admixture event and multiple Neanderthal contributions to either population, and the hypothesis that reduced Neanderthal ancestry in modern Europeans resulted from more recent admixture with a ghost population that lacked a Neanderthal ancestry component (the 'dilution' hypothesis). To summarize the asymmetric pattern of Neanderthal allele frequencies, we compiled the joint fragment frequency spectrum of European and East Asian Neanderthal fragments and compared it with both analytical theory and data simulated under various models of admixture. Using maximum likelihood and machine learning, we found that a simple model of a single admixture did not fit the empirical data, and we instead favour a model of multiple episodes of gene flow into both European and East Asian populations. These findings indicate a longer-term, more complex interaction between humans and Neanderthals than was previously appreciated.
Affiliation(s)
- Fernando A Villanea
- Department of Biology, Temple University, Philadelphia, PA, USA
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
| | - Joshua G Schraiber
- Department of Biology, Temple University, Philadelphia, PA, USA.
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA.
| |
|
17
|
Abstract
We present a multispecies coalescent model for quantitative traits that allows for evolutionary inferences at micro- and macroevolutionary scales. A major advantage of this model is its ability to incorporate genealogical discordance underlying a quantitative trait. We show that discordance causes a decrease in the expected trait covariance between more closely related species relative to more distantly related species. If unaccounted for, this outcome can lead to an overestimation of a trait's evolutionary rate, to a decrease in its phylogenetic signal, and to errors when examining shifts in mean trait values. The number of loci controlling a quantitative trait appears to be irrelevant to all trends reported, and discordance also affected discrete, threshold traits. Our model and analyses point to the conditions under which different methods should fare better or worse, in addition to indicating current and future approaches that can mitigate the effects of discordance.
Affiliation(s)
- Fábio K Mendes
- Department of Biology, Indiana University, Bloomington, United States
| | - Jesualdo A Fuentes-González
- Department of Biology, Indiana University, Bloomington, United States
- School of Life Sciences, Arizona State University, Tempe, United States
| | - Joshua G Schraiber
- Department of Biology, Temple University, Philadelphia, United States
- Center for Computational Genetics and Genomics, Temple University, Philadelphia, United States
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, United States
| | - Matthew W Hahn
- Department of Biology, Indiana University, Bloomington, United States
- Department of Computer Science, Indiana University, Bloomington, United States
| |
|
18
|
Lin D, Bi K, Conroy CJ, Lacey EA, Schraiber JG, Bowie RCK. Mito-nuclear discordance across a recent contact zone for California voles. Ecol Evol 2018; 8:6226-6241. [PMID: 29988439 PMCID: PMC6024151 DOI: 10.1002/ece3.4129] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/01/2017] [Revised: 03/16/2018] [Accepted: 03/29/2018] [Indexed: 11/17/2022] Open
Abstract
To examine the processes that maintain genetic diversity among closely related taxa, we investigated the dynamics of introgression across a contact zone between two lineages of California voles (Microtus californicus). We tested the prediction that introgression of nuclear loci would be greater than that for mitochondrial loci, assuming ongoing gene flow across the contact zone. We also predicted that genomic markers would show a mosaic pattern of differentiation across this zone, consistent with genomes that are semi-permeable. Using mitochondrial cytochrome b sequences and genome-wide loci developed via ddRAD-seq, we analyzed genetic variation for 10 vole populations distributed along the central California coast; this transect included populations from within the distributions of both parental lineages as well as the putative contact zone. Our analyses revealed that (1) the two lineages examined are relatively young, having diverged ca. 8.5-54 kya, (2) voles from the contact zone in Santa Barbara County did not include F1 or early generation backcrossed individuals, and (3) there appeared to be little to no recurrent gene flow across the contact zone. Introgression patterns for mitochondrial and nuclear markers were not concordant; only mitochondrial markers revealed evidence of introgression, putatively due to historical hybridization. These differences in genetic signatures are intriguing given that the contact zone occurs in a region of continuous vole habitat, with no evidence of past or present physical barriers. Future studies that examine specific isolating mechanisms, such as microhabitat use and mate choice, will facilitate our understanding of how genetic boundaries are maintained in this system.
Affiliation(s)
- Dana Lin
- Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California
| | - Ke Bi
- Computational Genomics Resource Laboratory, California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, California
| | - Christopher J. Conroy
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California
| | - Eileen A. Lacey
- Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California
| | - Joshua G. Schraiber
- Department of Biology, Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, Pennsylvania
| | - Rauri C. K. Bowie
- Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, California
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California
| |
|
19
|
Gittelman RM, Schraiber JG, Vernot B, Mikacenic C, Wurfel MM, Akey JM. Archaic Hominin Admixture Facilitated Adaptation to Out-of-Africa Environments. Curr Biol 2016; 26:3375-3382. [PMID: 27839976 DOI: 10.1016/j.cub.2016.10.041] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Received: 08/23/2016] [Revised: 09/26/2016] [Accepted: 10/19/2016] [Indexed: 12/31/2022]
Abstract
As modern humans dispersed from Africa throughout the world, they encountered and interbred with archaic hominins, including Neanderthals and Denisovans [1, 2]. Although genome-scale maps of introgressed sequences have been constructed [3-6], considerable gaps in knowledge remain about the functional, phenotypic, and evolutionary significance of archaic hominin DNA that persists in present-day individuals. Here, we describe a comprehensive set of analyses that identified 126 high-frequency archaic haplotypes as putative targets of adaptive introgression in geographically diverse populations. These loci are enriched for immune-related genes (such as OAS1/2/3, TLR1/6/10, and TNFAIP3) and also encompass genes (including OCA2 and BNC2) that influence skin pigmentation phenotypes. Furthermore, we leveraged existing and novel large-scale gene expression datasets to show many positively selected archaic haplotypes act as expression quantitative trait loci (eQTLs), suggesting that modulation of transcript abundance was a common mechanism facilitating adaptive introgression. Our results demonstrate that hybridization between modern and archaic hominins provided an important reservoir of advantageous alleles that enabled adaptation to out-of-Africa environments.
Affiliation(s)
- Rachel M Gittelman
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Joshua G Schraiber
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Benjamin Vernot
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Carmen Mikacenic
- Department of Medicine, Division of Pulmonary and Critical Care Medicine, University of Washington, Seattle, WA 98195, USA
| | - Mark M Wurfel
- Department of Medicine, Division of Pulmonary and Critical Care Medicine, University of Washington, Seattle, WA 98195, USA
| | - Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
| |
|
20
|
Nuttle X, Giannuzzi G, Duyzend MH, Schraiber JG, Narvaiza I, Sudmant PH, Penn O, Chiatante G, Malig M, Huddleston J, Benner C, Camponeschi F, Ciofi-Baffoni S, Stessman HA, Marchetto MCN, Denman L, Harshman L, Baker C, Raja A, Penewit K, Janke N, Tang WJ, Ventura M, Banci L, Antonacci F, Akey JM, Amemiya CT, Gage FH, Reymond A, Eichler EE. Emergence of a Homo sapiens-specific gene family and chromosome 16p11.2 CNV susceptibility. Nature 2016; 536:205-9. [PMID: 27487209 PMCID: PMC4988886 DOI: 10.1038/nature19075] [Citation(s) in RCA: 79] [Impact Index Per Article: 9.9] [Received: 10/29/2015] [Accepted: 07/02/2016] [Indexed: 12/31/2022]
Abstract
Genetic differences that specify unique aspects of human evolution have typically been identified by comparative analyses between the genomes of humans and closely related primates, including more recently the genomes of archaic hominins. Not all regions of the genome, however, are equally amenable to such study. Recurrent copy number variation (CNV) at chromosome 16p11.2 accounts for approximately 1% of cases of autism and is mediated by a complex set of segmental duplications, many of which arose recently during human evolution. Here we reconstruct the evolutionary history of the locus and identify bolA family member 2 (BOLA2) as a gene duplicated exclusively in Homo sapiens. We estimate that a 95-kilobase-pair segment containing BOLA2 duplicated across the critical region approximately 282 thousand years ago (ka), one of the latest among a series of genomic changes that dramatically restructured the locus during hominid evolution. All humans examined carried one or more copies of the duplication, which nearly fixed early in the human lineage, a pattern unlikely to have arisen so rapidly in the absence of selection (P < 0.0097). We show that the duplication of BOLA2 led to a novel, human-specific in-frame fusion transcript and that BOLA2 copy number correlates with both RNA expression (r = 0.36) and protein level (r = 0.65), with the greatest expression difference between human and chimpanzee in experimentally derived stem cells. Analyses of 152 patients carrying a chromosome 16p11.2 rearrangement show that more than 96% of breakpoints occur within the H. sapiens-specific duplication. In summary, the duplicative transposition of BOLA2 at the root of the H. sapiens lineage about 282 ka simultaneously increased copy number of a gene associated with iron homeostasis and predisposed our species to recurrent rearrangements associated with disease.
Affiliation(s)
- Xander Nuttle
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Giuliana Giannuzzi
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Michael H. Duyzend
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Joshua G. Schraiber
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Iñigo Narvaiza
- Laboratory of Genetics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
| | - Peter H. Sudmant
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Osnat Penn
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | | | - Maika Malig
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - John Huddleston
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 98195, USA
| | - Chris Benner
- Laboratory of Genetics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
| | - Francesca Camponeschi
- Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Florence, Italy
| | - Simone Ciofi-Baffoni
- Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Florence, Italy
- Magnetic Resonance Center CERM, University of Florence, Via Luigi Sacconi 6, 50019, Sesto Fiorentino, Florence, Italy
| | - Holly A.F. Stessman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Maria C. N. Marchetto
- Laboratory of Genetics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
| | - Laura Denman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Lana Harshman
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Carl Baker
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Archana Raja
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 98195, USA
| | - Kelsi Penewit
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Nicolette Janke
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - W. Joyce Tang
- Benaroya Research Institute at Virginia Mason, Seattle, WA 98101, USA
| | - Mario Ventura
- Department of Biology, University of Bari, Bari, Italy
| | - Lucia Banci
- Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Florence, Italy
- Magnetic Resonance Center CERM, University of Florence, Via Luigi Sacconi 6, 50019, Sesto Fiorentino, Florence, Italy
| | | | - Joshua M. Akey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Chris T. Amemiya
- Benaroya Research Institute at Virginia Mason, Seattle, WA 98101, USA
| | - Fred H. Gage
- Laboratory of Genetics, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
- Center for Academic Research and Training in Anthropogeny (CARTA), 9500 Gilman Drive, La Jolla, CA 92093, USA
| | - Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 98195, USA
| |
|
21
|
Vernot B, Tucci S, Kelso J, Schraiber JG, Wolf AB, Gittelman RM, Dannemann M, Grote S, McCoy RC, Norton H, Scheinfeldt LB, Merriwether DA, Koki G, Friedlaender JS, Wakefield J, Pääbo S, Akey JM. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 2016; 352:235-9. [PMID: 26989198 DOI: 10.1126/science.aad9416] [Citation(s) in RCA: 233] [Impact Index Per Article: 29.1] [Received: 11/24/2015] [Accepted: 02/29/2016] [Indexed: 12/15/2022]
Abstract
Although Neandertal sequences that persist in the genomes of modern humans have been identified in Eurasians, comparable studies in people whose ancestors hybridized with both Neandertals and Denisovans are lacking. We developed an approach to identify DNA inherited from multiple archaic hominin ancestors and applied it to whole-genome sequences from 1523 geographically diverse individuals, including 35 previously unknown Island Melanesian genomes. In aggregate, we recovered 1.34 gigabases and 303 megabases of the Neandertal and Denisovan genome, respectively. We use these maps of archaic sequences to show that Neandertal admixture occurred multiple times in different non-African populations, characterize genomic regions that are significantly depleted of archaic sequences, and identify signatures of adaptive introgression.
Affiliation(s)
- Benjamin Vernot
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Serena Tucci
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA. Department of Life Sciences and Biotechnology, University of Ferrara, Italy
| | - Janet Kelso
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Joshua G Schraiber
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Aaron B Wolf
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Rachel M Gittelman
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Michael Dannemann
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Steffi Grote
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Rajiv C McCoy
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Heather Norton
- Department of Anthropology, University of Cincinnati, Cincinnati, OH, USA
| | - Laura B Scheinfeldt
- Department of Biology and Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
| | | | - George Koki
- Institute for Medical Research, Goroka, Eastern Highlands Province, Papua New Guinea
| | | | - Jon Wakefield
- Department of Statistics, University of Washington, Seattle, Washington, USA
| | - Svante Pääbo
- Department of Evolutionary Genetics, Max-Planck-Institute for Evolutionary Anthropology, Leipzig, Germany.
| | - Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
| |
|
22
|
Frantz LAF, Schraiber JG, Madsen O, Megens HJ, Cagan A, Bosse M, Paudel Y, Crooijmans RPMA, Larson G, Groenen MAM. Evidence of long-term gene flow and selection during domestication from analyses of Eurasian wild and domestic pig genomes. Nat Genet 2015; 47:1141-8. [PMID: 26323058 DOI: 10.1038/ng.3394] [Citation(s) in RCA: 166] [Impact Index Per Article: 18.4] [Received: 11/21/2014] [Accepted: 08/10/2015] [Indexed: 12/18/2022]
Abstract
Traditionally, the process of domestication is assumed to be initiated by humans, involve few individuals and rely on reproductive isolation between wild and domestic forms. We analyzed pig domestication using over 100 genome sequences and tested whether pig domestication followed a traditional linear model or a more complex, reticulate model. We found that the assumptions of traditional models, such as reproductive isolation and strong domestication bottlenecks, are incompatible with the genetic data. In addition, our results show that, despite gene flow, the genomes of domestic pigs have strong signatures of selection at loci that affect behavior and morphology. We argue that recurrent selection for domestic traits likely counteracted the homogenizing effect of gene flow from wild boars and created 'islands of domestication' in the genome. Our results have major ramifications for the understanding of animal domestication and suggest that future studies should employ models that do not assume reproductive isolation.
Affiliation(s)
- Laurent A F Frantz
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands. Palaeogenomics and Bio-Archaeology Research Network, Research Laboratory for Archaeology and History of Art, University of Oxford, Oxford, UK
| | - Joshua G Schraiber
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, USA. Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Ole Madsen
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | - Hendrik-Jan Megens
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | - Alex Cagan
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
| | - Mirte Bosse
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | - Yogesh Paudel
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | | | - Greger Larson
- Palaeogenomics and Bio-Archaeology Research Network, Research Laboratory for Archaeology and History of Art, University of Oxford, Oxford, UK
| | - Martien A M Groenen
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| |
|
23
|
Frantz LAF, Madsen O, Megens HJ, Schraiber JG, Paudel Y, Bosse M, Crooijmans RPMA, Larson G, Groenen MAM. Evolution of Tibetan wild boars. Nat Genet 2015; 47:188-9. [PMID: 25711859 DOI: 10.1038/ng.3197] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 02/05/2023]
Affiliation(s)
- Laurent A F Frantz
- [1] Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands. [2] Palaeogenomics and Bio-Archaeology Research Network, Research Laboratory for Archaeology and the History of Art, University of Oxford, Oxford, UK
| | - Ole Madsen
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | - Hendrik-Jan Megens
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | - Joshua G Schraiber
- [1] Department of Integrative Biology, University of California, Berkeley, Berkeley, California, USA. [2] Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Yogesh Paudel
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | - Mirte Bosse
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| | | | - Greger Larson
- [1] Durham Evolution and Ancient DNA, Department of Archaeology, Durham University, Durham, UK. [2] Palaeogenomics and Bio-Archaeology Research Network, Research Laboratory for Archaeology and the History of Art, University of Oxford, Oxford, UK
| | - Martien A M Groenen
- Animal Breeding and Genomics Group, Wageningen University, Wageningen, the Netherlands
| |
|
24
|
Frantz LAF, Schraiber JG, Madsen O, Megens HJ, Bosse M, Paudel Y, Semiadi G, Meijaard E, Li N, Crooijmans RPMA, Archibald AL, Slatkin M, Schook LB, Larson G, Groenen MAM. Genome sequencing reveals fine scale diversification and reticulation history during speciation in Sus. Genome Biol 2013; 14:R107. [PMID: 24070215 PMCID: PMC4053821 DOI: 10.1186/gb-2013-14-9-r107] [Citation(s) in RCA: 107] [Impact Index Per Article: 11.9] [Received: 08/05/2013] [Revised: 08/21/2013] [Accepted: 09/26/2013] [Indexed: 11/26/2022] Open
Abstract
Background: Elucidating the process of speciation requires an in-depth understanding of the evolutionary history of the species in question. Studies that rely upon a limited number of genetic loci do not always reveal actual evolutionary history, and often confuse inferences related to phylogeny and speciation. Whole-genome data, however, can overcome this issue by providing a nearly unbiased window into the patterns and processes of speciation. In order to reveal the complexity of the speciation process, we sequenced and analyzed the genomes of 10 wild pigs, representing morphologically or geographically well-defined species and subspecies of the genus Sus from insular and mainland Southeast Asia, and one African common warthog. Results: Our data highlight the importance of past cyclical climatic fluctuations in facilitating the dispersal and isolation of populations, thus leading to the diversification of suids in one of the most species-rich regions of the world. Moreover, admixture analyses revealed extensive intra- and inter-specific gene flow that explains previous conflicting results obtained from a limited number of loci. We show that these multiple episodes of gene flow resulted from both natural and human-mediated dispersal. Conclusions: Our results demonstrate the importance of past climatic fluctuations and human-mediated translocations in driving and complicating the process of speciation in island Southeast Asia. This case study demonstrates that genomics is a powerful tool to decipher the evolutionary history of a genus, and reveals the complexity of the process of speciation.
|
25
|
Lazaridis I, Patterson N, Mittnik A, Renaud G, Mallick S, Kirsanow K, Sudmant PH, Schraiber JG, Castellano S, Lipson M, Berger B, Economou C, Bollongino R, Fu Q, Bos KI, Nordenfelt S, Li H, de Filippo C, Prüfer K, Sawyer S, Posth C, Haak W, Hallgren F, Fornander E, Rohland N, Delsate D, Francken M, Guinet JM, Wahl J, Ayodo G, Babiker HA, Bailliet G, Balanovska E, Balanovsky O, Barrantes R, Bedoya G, Ben-Ami H, Bene J, Berrada F, Bravi CM, Brisighelli F, Busby GBJ, Cali F, Churnosov M, Cole DEC, Corach D, Damba L, van Driem G, Dryomov S, Dugoujon JM, Fedorova SA, Gallego Romero I, Gubina M, Hammer M, Henn BM, Hervig T, Hodoglugil U, Jha AR, Karachanak-Yankova S, Khusainova R, Khusnutdinova E, Kittles R, Kivisild T, Klitz W, Kučinskas V, Kushniarevich A, Laredj L, Litvinov S, Loukidis T, Mahley RW, Melegh B, Metspalu E, Molina J, Mountain J, Näkkäläjärvi K, Nesheva D, Nyambo T, Osipova L, Parik J, Platonov F, Posukh O, Romano V, Rothhammer F, Rudan I, Ruizbakiev R, Sahakyan H, Sajantila A, Salas A, Starikovskaya EB, Tarekegn A, Toncheva D, Turdikulova S, Uktveryte I, Utevska O, Vasquez R, Villena M, Voevoda M, Winkler CA, Yepiskoposyan L, Zalloua P, Zemunik T, Cooper A, Capelli C, Thomas MG, Ruiz-Linares A, Tishkoff SA, Singh L, Thangaraj K, Villems R, Comas D, Sukernik R, Metspalu M, Meyer M, Eichler EE, Burger J, Slatkin M, Pääbo S, Kelso J, Reich D, Krause J. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 2014; 513:409-13. [PMID: 25230663 PMCID: PMC4170574 DOI: 10.1038/nature13673] [Citation(s) in RCA: 737] [Impact Index Per Article: 73.7] [Received: 12/23/2013] [Accepted: 07/11/2014] [Indexed: 12/19/2022]
Abstract
We sequenced the genomes of a ~7,000 year old farmer from Germany and eight ~8,000 year old hunter-gatherers from Luxembourg and Sweden. We analyzed these and other ancient genomes [1–4] with 2,345 contemporary humans to show that most present Europeans derive from at least three highly differentiated populations: West European Hunter-Gatherers (WHG), who contributed ancestry to all Europeans but not to Near Easterners; Ancient North Eurasians (ANE) related to Upper Paleolithic Siberians [3], who contributed to both Europeans and Near Easterners; and Early European Farmers (EEF), who were mainly of Near Eastern origin but also harbored WHG-related ancestry. We model these populations’ deep relationships and show that EEF had ~44% ancestry from a “Basal Eurasian” population that split prior to the diversification of other non-African lineages.
Affiliation(s)
- Iosif Lazaridis
- [1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
| | - Nick Patterson
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
| | - Alissa Mittnik
- Institute for Archaeological Sciences, University of Tübingen, Tübingen 72074, Germany
| | - Gabriel Renaud
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Swapan Mallick
- [1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
| | - Karola Kirsanow
- Institute of Anthropology, Johannes Gutenberg University Mainz, Mainz D-55128, Germany
| | - Peter H Sudmant
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Joshua G Schraiber
- [1] Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA. [2] Department of Integrative Biology, University of California, Berkeley, California 94720-3140, USA
| | - Sergi Castellano
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Mark Lipson
- Department of Mathematics and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Bonnie Berger
- [1] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA. [2] Department of Mathematics and Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Christos Economou
- Archaeological Research Laboratory, Stockholm University, 114 18, Sweden
| | - Ruth Bollongino
- Institute of Anthropology, Johannes Gutenberg University Mainz, Mainz D-55128, Germany
| | - Qiaomei Fu
- [1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany. [3] Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, IVPP, CAS, Beijing 100049, China
| | - Kirsten I Bos
- Institute for Archaeological Sciences, University of Tübingen, Tübingen 72074, Germany
| | - Susanne Nordenfelt
- [1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
| | - Heng Li
- [1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
| | - Cesare de Filippo
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Kay Prüfer
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Susanna Sawyer
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Cosimo Posth
- Institute for Archaeological Sciences, University of Tübingen, Tübingen 72074, Germany
| | - Wolfgang Haak
- Australian Centre for Ancient DNA and Environment Institute, School of Earth and Environmental Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
| | | | - Elin Fornander
- The Cultural Heritage Foundation, Västerås 722 12, Sweden
| | - Nadin Rohland
- [1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA
| | - Dominique Delsate
- [1] National Museum of Natural History, L-2160, Luxembourg. [2] National Center of Archaeological Research, National Museum of History and Art, L-2345, Luxembourg
| | - Michael Francken
- Department of Paleoanthropology, Senckenberg Center for Human Evolution and Paleoenvironment, University of Tübingen, Tübingen D-72070, Germany
| | | | - Joachim Wahl
- State Office for Cultural Heritage Management Baden-Württemberg, Osteology, Konstanz D-78467, Germany
| | - George Ayodo
- Center for Global Health and Child Development, Kisumu 40100, Kenya
| | - Hamza A Babiker
- [1] Institutes of Evolution, Immunology and Infection Research, School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JT, UK. [2] Biochemistry Department, Faculty of Medicine, Sultan Qaboos University, Alkhod, Muscat 123, Oman
| | - Graciela Bailliet
- Laboratorio de Genética Molecular Poblacional, Instituto Multidisciplinario de Biología Celular (IMBICE), CCT-CONICET &CICPBA, La Plata, B1906APO, Argentina
| | | | - Oleg Balanovsky
- [1] Research Centre for Medical Genetics, Moscow 115478, Russia. [2] Vavilov Institute for General Genetics, Moscow 119991, Russia
| | - Ramiro Barrantes
- Escuela de Biología, Universidad de Costa Rica, San José 2060, Costa Rica
| | - Gabriel Bedoya
- Institute of Biology, Research group GENMOL, Universidad de Antioquia, Medellín, Colombia
| | | | - Judit Bene
- Department of Medical Genetics and Szentagothai Research Center, University of Pécs, Pécs H-7624, Hungary
| | - Fouad Berrada
- Al Akhawayn University in Ifrane (AUI), School of Science and Engineering, Ifrane 53000, Morocco
| | - Claudio M Bravi
- Laboratorio de Genética Molecular Poblacional, Instituto Multidisciplinario de Biología Celular (IMBICE), CCT-CONICET &CICPBA, La Plata, B1906APO, Argentina
| | - Francesca Brisighelli
- Forensic Genetics Laboratory, Institute of Legal Medicine, Università Cattolica del Sacro Cuore, Rome 00168, Italy
| | - George B J Busby
- [1] Department of Zoology, University of Oxford, Oxford OX1 3PS, UK. [2] Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
| | - Francesco Cali
- Laboratorio di Genetica Molecolare, IRCCS Associazione Oasi Maria SS, Troina 94018, Italy
| | | | - David E C Cole
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario M5G 1L5, Canada
| | - Daniel Corach
- Servicio de Huellas Digitales Genéticas, School of Pharmacy and Biochemistry, Universidad de Buenos Aires, 1113 CABA, Argentina
| | - Larissa Damba
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk 630090, Russia
| | - George van Driem
- Institute of Linguistics, University of Bern, Bern CH-3012, Switzerland
| | - Stanislav Dryomov
- Laboratory of Human Molecular Genetics, Institute of Molecular and Cellular Biology, Russian Academy of Science, Siberian Branch, Novosibirsk 630090, Russia
| | - Jean-Michel Dugoujon
- Anthropologie Moléculaire et Imagerie de Synthèse, CNRS UMR 5288, Université Paul Sabatier Toulouse III, Toulouse 31000, France
| | - Sardana A Fedorova
- North-Eastern Federal University and Yakut Research Center of Complex Medical Problems, Yakutsk 677013, Russia
| | - Irene Gallego Romero
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
| | - Marina Gubina
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk 630090, Russia
| | - Michael Hammer
- ARL Division of Biotechnology, University of Arizona, Tucson, Arizona 85721, USA
| | - Brenna M Henn
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794, USA
| | - Tor Hervig
- Department of Clinical Science, University of Bergen, Bergen 5021, Norway
| | | | - Aashish R Jha
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
| | - Sena Karachanak-Yankova
- Department of Medical Genetics, National Human Genome Center, Medical University Sofia, Sofia 1431, Bulgaria
| | - Rita Khusainova
- [1] Institute of Biochemistry and Genetics, Ufa Research Centre, Russian Academy of Sciences, Ufa 450054, Russia. [2] Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa 450074, Russia
| | - Elza Khusnutdinova
- [1] Institute of Biochemistry and Genetics, Ufa Research Centre, Russian Academy of Sciences, Ufa 450054, Russia. [2] Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa 450074, Russia
| | - Rick Kittles
- College of Medicine, University of Arizona, Tucson, Arizona 85724, USA
| | - Toomas Kivisild
- Division of Biological Anthropology, University of Cambridge, Cambridge CB2 1QH, UK
| | - William Klitz
- Department of Integrative Biology, University of California, Berkeley, California 94720-3140, USA
| | - Vaidutis Kučinskas
- Department of Human and Medical Genetics, Vilnius University, Vilnius LT-08661, Lithuania
| | | | - Leila Laredj
- Translational Medicine and Neurogenetics, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Illkirch 67404, France
| | - Sergey Litvinov
- [1] Institute of Biochemistry and Genetics, Ufa Research Centre, Russian Academy of Sciences, Ufa 450054, Russia. [2] Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa 450074, Russia. [3] Estonian Biocentre, Evolutionary Biology group, Tartu, 51010, Estonia
| | - Theologos Loukidis
- [1] Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK. [2] Amgen, 33 Kazantzaki Str, Ilioupolis 16342, Athens, Greece
| | | | - Béla Melegh
- Department of Medical Genetics and Szentagothai Research Center, University of Pécs, Pécs H-7624, Hungary
| | - Ene Metspalu
- Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Julio Molina
- Centro de Investigaciones Biomédicas de Guatemala, Ciudad de Guatemala, Guatemala
| | - Joanna Mountain
- Research Department, 23andMe, Mountain View, California 94043, USA
| | | | - Desislava Nesheva
- Department of Medical Genetics, National Human Genome Center, Medical University Sofia, Sofia 1431, Bulgaria
| | - Thomas Nyambo
- Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam 65001, Tanzania
| | - Ludmila Osipova
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk 630090, Russia
| | - Jüri Parik
- Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia
| | - Fedor Platonov
- Research Institute of Health, North-Eastern Federal University, Yakutsk 677000, Russia
| | - Olga Posukh
- Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk 630090, Russia
| | - Valentino Romano
- Dipartimento di Fisica e Chimica, Università di Palermo, Palermo 90128, Italy
| | - Francisco Rothhammer
- [1] Instituto de Alta Investigación, Universidad de Tarapacá, Arica 1000000, Chile. [2] Programa de Genética Humana ICBM Facultad de Medicina Universidad de Chile, Santiago 8320000, Chile. [3] Centro de Investigaciones del Hombre en el Desierto, Arica 1000000, Chile
| | - Igor Rudan
- Centre for Population Health Sciences, The University of Edinburgh Medical School, Edinburgh EH8 9AG, UK
| | - Ruslan Ruizbakiev
- [1] Institute of Immunology, Academy of Science, Tashkent 70000, Uzbekistan. [2]
| | - Hovhannes Sahakyan
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu, 51010, Estonia. [2] Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences of Armenia, Yerevan 0014, Armenia
| | - Antti Sajantila
- [1] Department of Forensic Medicine, Hjelt Institute, University of Helsinki, Helsinki 00014, Finland. [2] Institute of Applied Genetics, Department of Molecular and Medical Genetics, University of North Texas Health Science Center, Fort Worth, Texas 76107, USA
| | - Antonio Salas
- Unidade de Xenética, Departamento de Anatomía Patolóxica e Ciencias Forenses, and Instituto de Ciencias Forenses, Grupo de Medicina Xenómica (GMX), Facultade de Medicina, Universidade de Santiago de Compostela, Galicia 15872, Spain
| | - Elena B Starikovskaya
- Laboratory of Human Molecular Genetics, Institute of Molecular and Cellular Biology, Russian Academy of Science, Siberian Branch, Novosibirsk 630090, Russia
| | - Ayele Tarekegn
- Research Fellow, Henry Stewart Group, Russell House, London WC1A 2HN, UK
| | - Draga Toncheva
- Department of Medical Genetics, National Human Genome Center, Medical University Sofia, Sofia 1431, Bulgaria
| | - Shahlo Turdikulova
- Institute of Bioorganic Chemistry Academy of Sciences Republic of Uzbekistan, Tashkent 100125, Uzbekistan
| | - Ingrida Uktveryte
- Department of Human and Medical Genetics, Vilnius University, Vilnius LT-08661, Lithuania
| | - Olga Utevska
- Department of Genetics and Cytology, V. N. Karazin Kharkiv National University, Kharkiv 61077, Ukraine
| | - René Vasquez
- [1] Instituto Boliviano de Biología de la Altura, Universidad Mayor de San Andrés, 591 2 La Paz, Bolivia. [2] Universidad Autonoma Tomás Frías, Potosí, Bolivia
| | - Mercedes Villena
- [1] Instituto Boliviano de Biología de la Altura, Universidad Mayor de San Andrés, 591 2 La Paz, Bolivia. [2] Universidad Autonoma Tomás Frías, Potosí, Bolivia
| | - Mikhail Voevoda
- [1] Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences, Novosibirsk 630090, Russia. [2] Institute of Internal Medicine, Siberian Branch of Russian Academy of Medical Sciences, Novosibirsk 630089, Russia. [3] Novosibirsk State University, Novosibirsk 630090, Russia
| | - Cheryl A Winkler
- Basic Research Laboratory, NCI, NIH, Frederick National Laboratory, Leidos Biomedical, Frederick, Maryland 21702, USA
| | - Levon Yepiskoposyan
- Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences of Armenia, Yerevan 0014, Armenia
| | - Pierre Zalloua
- [1] Lebanese American University, School of Medicine, Beirut 13-5053, Lebanon. [2] Harvard School of Public Health, Boston, Massachusetts 02115, USA
| | - Tatijana Zemunik
- Department of Medical Biology, University of Split, School of Medicine, Split 21000, Croatia
| | - Alan Cooper
- Australian Centre for Ancient DNA and Environment Institute, School of Earth and Environmental Sciences, University of Adelaide, Adelaide, South Australia 5005, Australia
| | | | - Mark G Thomas
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
| | - Andres Ruiz-Linares
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK
| | - Sarah A Tishkoff
- Department of Biology and Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Lalji Singh
- [1] CSIR-Centre for Cellular and Molecular Biology, Hyderabad 500 007, India. [2] Banaras Hindu University, Varanasi 221 005, India
| | | | - Richard Villems
- [1] Estonian Biocentre, Evolutionary Biology group, Tartu, 51010, Estonia. [2] Department of Evolutionary Biology, University of Tartu, Tartu 51010, Estonia. [3] Estonian Academy of Sciences, Tallinn 10130, Estonia
| | - David Comas
- Institut de Biologia Evolutiva (CSIC-UPF), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona 08003, Spain
| | - Rem Sukernik
- Laboratory of Human Molecular Genetics, Institute of Molecular and Cellular Biology, Russian Academy of Science, Siberian Branch, Novosibirsk 630090, Russia
| | - Mait Metspalu
- Estonian Biocentre, Evolutionary Biology group, Tartu, 51010, Estonia
| | - Matthias Meyer
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Evan E Eichler
- [1] Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA. [2] Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
| | - Joachim Burger
- Institute of Anthropology, Johannes Gutenberg University Mainz, Mainz D-55128, Germany
| | - Montgomery Slatkin
- Department of Integrative Biology, University of California, Berkeley, California 94720-3140, USA
| | - Svante Pääbo
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - Janet Kelso
- Max Planck Institute for Evolutionary Anthropology, Leipzig 04103, Germany
| | - David Reich
- 1] Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA. [2] Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA. [3] Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA
| | - Johannes Krause
- 1] Institute for Archaeological Sciences, University of Tübingen, Tübingen 72074, Germany. [2] Senckenberg Centre for Human Evolution and Palaeoenvironment, University of Tübingen, 72070 Tübingen, Germany. [3] Max Planck Institut für Geschichte und Naturwissenschaften, Jena 07745, Germany
26
Abstract
The Wright-Fisher process with selection is an important tool in population genetics theory. Traditional analysis of this process relies on the diffusion approximation. The diffusion approximation is usually studied in a partial differential equations framework. In this paper, I introduce a path integral formalism to study the Wright-Fisher process with selection and use that formalism to obtain a simple perturbation series to approximate the transition density. The perturbation series can be understood in terms of Feynman diagrams, which have a simple probabilistic interpretation in terms of selective events. The perturbation series proves to be an accurate approximation of the transition density for weak selection and is shown to be arbitrarily accurate for any selection coefficient.
27
Schraiber JG, Mostovoy Y, Hsu TY, Brem RB. Inferring evolutionary histories of pathway regulation from transcriptional profiling data. PLoS Comput Biol 2013; 9:e1003255. [PMID: 24130471 PMCID: PMC3794907 DOI: 10.1371/journal.pcbi.1003255] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 04/19/2013] [Accepted: 08/20/2013] [Indexed: 01/09/2023] Open
Abstract
One of the outstanding challenges in comparative genomics is to interpret the evolutionary importance of regulatory variation between species. Rigorous molecular evolution-based methods to infer evidence for natural selection from expression data are at a premium in the field, and to date, phylogenetic approaches have not been well-suited to address the question in the small sets of taxa profiled in standard surveys of gene expression. We have developed a strategy to infer evolutionary histories from expression profiles by analyzing suites of genes of common function. In a manner conceptually similar to molecular evolution models in which the evolutionary rates of DNA sequence at multiple loci follow a gamma distribution, we modeled expression of the genes of an a priori-defined pathway with rates drawn from an inverse gamma distribution. We then developed a fitting strategy to infer the parameters of this distribution from expression measurements, and to identify gene groups whose expression patterns were consistent with evolutionary constraint or rapid evolution in particular species. Simulations confirmed the power and accuracy of our inference method. As an experimental testbed for our approach, we generated and analyzed transcriptional profiles of four Saccharomyces yeasts. The results revealed pathways with signatures of constrained and accelerated regulatory evolution in individual yeasts and across the phylogeny, highlighting the prevalence of pathway-level expression change during the divergence of yeast species. We anticipate that our pathway-based phylogenetic approach will be of broad utility in the search to understand the evolutionary relevance of regulatory change.
Comparative transcriptomic studies routinely identify thousands of genes differentially expressed between species. The central question in the field is whether and how such regulatory changes have been the product of natural selection. Can the signal of evolutionarily relevant expression divergence be detected amid the noise of changes resulting from genetic drift? Our work develops a theory of gene expression variation among a suite of genes that function together. We derive a formalism that relates empirical observations of expression of pathway genes in divergent species to the underlying strength of natural selection on expression output. We show that fitting this type of model to simulated data accurately recapitulates the parameters used to generate the simulation. We then make experimental measurements of gene expression in a panel of single-celled eukaryotic yeast species. To these data we apply our inference method, and identify pathways with striking evidence for accelerated or constrained regulatory evolution, in particular species and across the phylogeny. Our method provides a key advance over previous approaches in that it maximizes the power of rigorous molecular-evolution analysis of regulatory variation even when data are relatively sparse. As such, the theory and tools we have developed will likely find broad application in the field of comparative genomics.
Affiliation(s)
- Joshua G. Schraiber
- Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America
- Yulia Mostovoy
- Department of Molecular and Cellular Biology, University of California, Berkeley, Berkeley, California, United States of America
- Tiffany Y. Hsu
- Department of Molecular and Cellular Biology, University of California, Berkeley, Berkeley, California, United States of America
- Rachel B. Brem
- Department of Molecular and Cellular Biology, University of California, Berkeley, Berkeley, California, United States of America
28
Schraiber JG, Griffiths RC, Evans SN. Analysis and rejection sampling of Wright-Fisher diffusion bridges. Theor Popul Biol 2013; 89:64-74. [PMID: 24001410 DOI: 10.1016/j.tpb.2013.08.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 06/16/2013] [Revised: 08/08/2013] [Accepted: 08/14/2013] [Indexed: 11/24/2022]
Abstract
We investigate the properties of a Wright-Fisher diffusion process starting at frequency x at time 0 and conditioned to be at frequency y at time T. Such a process is called a bridge. Bridges arise naturally in the analysis of selection acting on standing variation and in the inference of selection from allele frequency time series. We establish a number of results about the distribution of neutral Wright-Fisher bridges and develop a novel rejection-sampling scheme for bridges under selection that we use to study their behavior.
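As a rough illustration of the bridge definition in this abstract, the sketch below simulates a neutral, discrete Wright-Fisher model and keeps only those paths that start at one allele count and end at another. It is not the rejection-sampling scheme developed in the paper, which works with the diffusion and handles selection. The function names `wright_fisher_path` and `naive_bridge`, the population size, and the other parameter values are all hypothetical choices made for this example.

```python
import random


def wright_fisher_path(pop_size, start_count, generations, rng):
    """Simulate allele counts under a neutral, discrete Wright-Fisher model."""
    counts = [start_count]
    for _ in range(generations):
        freq = counts[-1] / pop_size
        # Binomial resampling: each of the pop_size gene copies in the next
        # generation carries the allele with probability equal to its
        # current frequency.
        counts.append(sum(rng.random() < freq for _ in range(pop_size)))
    return counts


def naive_bridge(pop_size, start_count, end_count, generations, rng,
                 max_tries=100_000):
    """Return one path from start_count to end_count, or None if none is found."""
    for _ in range(max_tries):
        path = wright_fisher_path(pop_size, start_count, generations, rng)
        if path[-1] == end_count:  # accept only paths that hit the target
            return path
    return None


# Illustrative values only: 20 gene copies, 5 generations, 10 -> 12 copies.
rng = random.Random(0)
bridge = naive_bridge(pop_size=20, start_count=10, end_count=12,
                      generations=5, rng=rng)
print(bridge)
```

Conditioning by brute force like this becomes very inefficient when the target endpoint is unlikely, which is one reason a dedicated sampling scheme is useful.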
Affiliation(s)
- Joshua G Schraiber
- Department of Integrative Biology, University of California, 3060 Valley Life Sciences Bldg #3140, Berkeley, CA 94720-3140, USA.
29
Groenen MAM, Archibald AL, Uenishi H, Tuggle CK, Takeuchi Y, Rothschild MF, Rogel-Gaillard C, Park C, Milan D, Megens HJ, Li S, Larkin DM, Kim H, Frantz LAF, Caccamo M, Ahn H, Aken BL, Anselmo A, Anthon C, Auvil L, Badaoui B, Beattie CW, Bendixen C, Berman D, Blecha F, Blomberg J, Bolund L, Bosse M, Botti S, Bujie Z, Bystrom M, Capitanu B, Carvalho-Silva D, Chardon P, Chen C, Cheng R, Choi SH, Chow W, Clark RC, Clee C, Crooijmans RPMA, Dawson HD, Dehais P, De Sapio F, Dibbits B, Drou N, Du ZQ, Eversole K, Fadista J, Fairley S, Faraut T, Faulkner GJ, Fowler KE, Fredholm M, Fritz E, Gilbert JGR, Giuffra E, Gorodkin J, Griffin DK, Harrow JL, Hayward A, Howe K, Hu ZL, Humphray SJ, Hunt T, Hornshøj H, Jeon JT, Jern P, Jones M, Jurka J, Kanamori H, Kapetanovic R, Kim J, Kim JH, Kim KW, Kim TH, Larson G, Lee K, Lee KT, Leggett R, Lewin HA, Li Y, Liu W, Loveland JE, Lu Y, Lunney JK, Ma J, Madsen O, Mann K, Matthews L, McLaren S, Morozumi T, Murtaugh MP, Narayan J, Nguyen DT, Ni P, Oh SJ, Onteru S, Panitz F, Park EW, Park HS, Pascal G, Paudel Y, Perez-Enciso M, Ramirez-Gonzalez R, Reecy JM, Rodriguez-Zas S, Rohrer GA, Rund L, Sang Y, Schachtschneider K, Schraiber JG, Schwartz J, Scobie L, Scott C, Searle S, Servin B, Southey BR, Sperber G, Stadler P, Sweedler JV, Tafer H, Thomsen B, Wali R, Wang J, Wang J, White S, Xu X, Yerle M, Zhang G, Zhang J, Zhang J, Zhao S, Rogers J, Churcher C, Schook LB. Analyses of pig genomes provide insight into porcine demography and evolution. Nature 2012; 491:393-8. [PMID: 23151582 PMCID: PMC3566564 DOI: 10.1038/nature11622] [Citation(s) in RCA: 947] [Impact Index Per Article: 78.9] [Received: 06/16/2012] [Accepted: 09/27/2012] [Indexed: 01/03/2023]
Abstract
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ∼1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model.
Affiliation(s)
- Martien A M Groenen
- Animal Breeding and Genomics Centre, Wageningen University, De Elst 1, 6708 WD, Wageningen, The Netherlands.
30
Abstract
Gaussian processes, a class of stochastic processes including Brownian motion and the Ornstein-Uhlenbeck process, are widely used to model continuous trait evolution in statistical phylogenetics. Under such processes, observations at the tips of a phylogenetic tree have a multivariate Gaussian distribution, which may lead to suboptimal model specification under certain evolutionary conditions, as supposed in models of punctuated equilibrium or adaptive radiation. To consider non-normally distributed continuous trait evolution, we introduce a method to compute posterior probabilities when modeling continuous trait evolution as a Lévy process. Through data simulation and model testing, we establish that single-rate Brownian motion (BM) and Lévy processes with jumps generate distinct patterns in comparative data. We then analyzed body mass and endocranial volume measurements for 126 primates. We rejected single-rate BM in favor of a Lévy process with jumps for each trait, with the lineage leading to the most recent common ancestor of great apes showing particularly strong evidence against single-rate BM.
Affiliation(s)
- Michael J Landis
- Department of Integrative Biology, University of California, Berkeley, CA 94720-3140, USA
31
Meyer M, Kircher M, Gansauge MT, Li H, Racimo F, Mallick S, Schraiber JG, Jay F, Prüfer K, de Filippo C, Sudmant PH, Alkan C, Fu Q, Do R, Rohland N, Tandon A, Siebauer M, Green RE, Bryc K, Briggs AW, Stenzel U, Dabney J, Shendure J, Kitzman J, Hammer MF, Shunkov MV, Derevianko AP, Patterson N, Andrés AM, Eichler EE, Slatkin M, Reich D, Kelso J, Pääbo S. A high-coverage genome sequence from an archaic Denisovan individual. Science 2012; 338:222-6. [PMID: 22936568 DOI: 10.1126/science.1224344] [Citation(s) in RCA: 1066] [Impact Index Per Article: 88.8] [Indexed: 12/12/2022]
Abstract
We present a DNA library preparation method that has allowed us to reconstruct a high-coverage (30×) genome sequence of a Denisovan, an extinct relative of Neandertals. The quality of this genome allows a direct estimation of Denisovan heterozygosity indicating that genetic diversity in these archaic hominins was extremely low. It also allows tentative dating of the specimen on the basis of "missing evolution" in its genome, detailed measurements of Denisovan and Neandertal admixture into present-day human populations, and the generation of a near-complete catalog of genetic changes that swept to high frequency in modern humans since their divergence from Denisovans.
Affiliation(s)
- Matthias Meyer
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany.
32
Abstract
Expression variation is widespread between species. The ability to distinguish regulatory change driven by natural selection from the consequences of neutral drift remains a major challenge in comparative genomics. In this work, we used observations of mRNA expression and promoter sequence to analyze signatures of selection on groups of functionally related genes in Saccharomycete yeasts. In a survey of gene regulons with expression divergence between Saccharomyces cerevisiae and S. paradoxus, we found that most were subject to variation in trans-regulatory factors that provided no evidence against a neutral model. However, we identified one regulon of membrane protein genes controlled by unlinked cis- and trans-acting determinants with coherent effects on gene expression, consistent with a history of directional, nonneutral evolution. For this membrane protein group, S. paradoxus alleles at regulatory loci were associated with elevated expression and altered stress responsiveness relative to other yeasts. In a phylogenetic comparison of promoter sequences of the membrane protein genes between species, the S. paradoxus lineage was distinguished by a short branch length, indicative of strong selective constraint. Likewise, sequence variants within the S. paradoxus population, but not across strains of other yeasts, were skewed toward low frequencies in promoters of genes in the membrane protein regulon, again reflecting strong purifying selection. Our results support a model in which a distinct expression program for the membrane protein genes in S. paradoxus has been preferentially maintained by negative selection as the result of an increased importance to organismal fitness. These findings illustrate the power of integrating expression- and sequence-based tests of natural selection in the study of evolutionary forces that underlie regulatory change.
Affiliation(s)
- Hilary C Martin
- Department of Molecular and Cell Biology, University of California, Berkeley, USA