1
Stankey CT, Bourges C, Haag LM, Turner-Stokes T, Piedade AP, Palmer-Jones C, Papa I, Silva Dos Santos M, Zhang Q, Cameron AJ, Legrini A, Zhang T, Wood CS, New FN, Randzavola LO, Speidel L, Brown AC, Hall A, Saffioti F, Parkes EC, Edwards W, Direskeneli H, Grayson PC, Jiang L, Merkel PA, Saruhan-Direskeneli G, Sawalha AH, Tombetti E, Quaglia A, Thorburn D, Knight JC, Rochford AP, Murray CD, Divakar P, Green M, Nye E, MacRae JI, Jamieson NB, Skoglund P, Cader MZ, Wallace C, Thomas DC, Lee JC. A disease-associated gene desert directs macrophage inflammation through ETS2. Nature 2024; 630:447-456. [PMID: 38839969 PMCID: PMC11168933 DOI: 10.1038/s41586-024-07501-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 05/01/2024] [Indexed: 06/07/2024]
Abstract
Increasing rates of autoimmune and inflammatory disease present a burgeoning threat to human health [1]. This is compounded by the limited efficacy of available treatments [1] and high failure rates during drug development [2], highlighting an urgent need to better understand disease mechanisms. Here we show how functional genomics could address this challenge. By investigating an intergenic haplotype on chr21q22, which has been independently linked to inflammatory bowel disease, ankylosing spondylitis, primary sclerosing cholangitis and Takayasu's arteritis [3-6], we identify that the causal gene, ETS2, is a central regulator of human inflammatory macrophages and delineate the shared disease mechanism that amplifies ETS2 expression. Genes regulated by ETS2 were prominently expressed in diseased tissues and more enriched for inflammatory bowel disease GWAS hits than most previously described pathways. Overexpressing ETS2 in resting macrophages reproduced the inflammatory state observed in chr21q22-associated diseases, with upregulation of multiple drug targets, including TNF and IL-23. Using a database of cellular signatures [7], we identified drugs that might modulate this pathway and validated the potent anti-inflammatory activity of one class of small molecules in vitro and ex vivo. Together, this illustrates the power of functional genomics, applied directly in primary human cells, to identify immune-mediated disease mechanisms and potential therapeutic opportunities.
Affiliation(s)
- C T Stankey
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
  - Department of Immunology and Inflammation, Imperial College London, London, UK
  - Washington University School of Medicine, St Louis, MO, USA
- C Bourges
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
- L M Haag
  - Division of Gastroenterology, Infectious Diseases and Rheumatology, Charité-Universitätsmedizin Berlin, Berlin, Germany
- T Turner-Stokes
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
  - Department of Immunology and Inflammation, Imperial College London, London, UK
- A P Piedade
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
- C Palmer-Jones
  - Department of Gastroenterology, Royal Free Hospital, London, UK
  - Institute for Liver and Digestive Health, Division of Medicine, University College London, London, UK
- I Papa
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
- Q Zhang
  - Genomics of Inflammation and Immunity Group, Human Genetics Programme, Wellcome Sanger Institute, Hinxton, UK
- A J Cameron
  - Wolfson Wohl Cancer Centre, School of Cancer Sciences, University of Glasgow, Glasgow, UK
- A Legrini
  - Wolfson Wohl Cancer Centre, School of Cancer Sciences, University of Glasgow, Glasgow, UK
- T Zhang
  - Wolfson Wohl Cancer Centre, School of Cancer Sciences, University of Glasgow, Glasgow, UK
- C S Wood
  - Wolfson Wohl Cancer Centre, School of Cancer Sciences, University of Glasgow, Glasgow, UK
- F N New
  - NanoString Technologies, Seattle, WA, USA
- L O Randzavola
  - Department of Immunology and Inflammation, Imperial College London, London, UK
- L Speidel
  - Ancient Genomics Laboratory, The Francis Crick Institute, London, UK
  - Genetics Institute, University College London, London, UK
- A C Brown
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- A Hall
  - The Sheila Sherlock Liver Centre, Royal Free Hospital, London, UK
  - Department of Cellular Pathology, Royal Free Hospital, London, UK
- F Saffioti
  - Institute for Liver and Digestive Health, Division of Medicine, University College London, London, UK
  - The Sheila Sherlock Liver Centre, Royal Free Hospital, London, UK
- E C Parkes
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
- W Edwards
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK
- H Direskeneli
  - Department of Internal Medicine, Division of Rheumatology, Marmara University, Istanbul, Turkey
- P C Grayson
  - Systemic Autoimmunity Branch, NIAMS, National Institutes of Health, Bethesda, MD, USA
- L Jiang
  - Department of Rheumatology, Zhongshan Hospital, Fudan University, Shanghai, China
- P A Merkel
  - Division of Rheumatology, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Epidemiology, Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- G Saruhan-Direskeneli
  - Department of Physiology, Istanbul University, Istanbul Faculty of Medicine, Istanbul, Turkey
- A H Sawalha
  - Division of Rheumatology, Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, USA
  - Division of Rheumatology and Clinical Immunology, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Lupus Center of Excellence, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Immunology, University of Pittsburgh, Pittsburgh, PA, USA
- E Tombetti
  - Department of Biomedical and Clinical Sciences, Milan University, Milan, Italy
  - Internal Medicine and Rheumatology, ASST FBF-Sacco, Milan, Italy
- A Quaglia
  - Department of Cellular Pathology, Royal Free Hospital, London, UK
  - UCL Cancer Institute, London, UK
- D Thorburn
  - Institute for Liver and Digestive Health, Division of Medicine, University College London, London, UK
  - The Sheila Sherlock Liver Centre, Royal Free Hospital, London, UK
- J C Knight
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
  - Chinese Academy of Medical Sciences Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - NIHR Comprehensive Biomedical Research Centre, Oxford, UK
- A P Rochford
  - Department of Gastroenterology, Royal Free Hospital, London, UK
  - Institute for Liver and Digestive Health, Division of Medicine, University College London, London, UK
- C D Murray
  - Department of Gastroenterology, Royal Free Hospital, London, UK
  - Institute for Liver and Digestive Health, Division of Medicine, University College London, London, UK
- P Divakar
  - NanoString Technologies, Seattle, WA, USA
- M Green
  - Experimental Histopathology STP, The Francis Crick Institute, London, UK
- E Nye
  - Experimental Histopathology STP, The Francis Crick Institute, London, UK
- J I MacRae
  - Metabolomics STP, The Francis Crick Institute, London, UK
- N B Jamieson
  - Wolfson Wohl Cancer Centre, School of Cancer Sciences, University of Glasgow, Glasgow, UK
- P Skoglund
  - Ancient Genomics Laboratory, The Francis Crick Institute, London, UK
- M Z Cader
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK
  - Department of Medicine, University of Cambridge, Cambridge, UK
- C Wallace
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK
  - MRC Biostatistics Unit, Cambridge Institute of Public Health, Cambridge, UK
- D C Thomas
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK
  - Department of Medicine, University of Cambridge, Cambridge, UK
- J C Lee
  - Genetic Mechanisms of Disease Laboratory, The Francis Crick Institute, London, UK
  - Department of Gastroenterology, Royal Free Hospital, London, UK
  - Institute for Liver and Digestive Health, Division of Medicine, University College London, London, UK
2
Peng D, Mulder OJ, Edge MD. Evaluating ARG-estimation methods in the context of estimating population-mean polygenic score histories. bioRxiv 2024:2024.05.24.595829. [PMID: 38854009 PMCID: PMC11160635 DOI: 10.1101/2024.05.24.595829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Scalable methods for estimating marginal coalescent trees across the genome present new opportunities for studying evolution and have generated considerable excitement, with new methods extending scalability to thousands of samples. Benchmarking of the available methods has revealed general tradeoffs between accuracy and scalability, but performance in downstream applications has not always been easily predictable from general performance measures, suggesting that specific features of the ARG may be important for specific downstream applications of estimated ARGs. To exemplify this point, we benchmark ARG estimation methods with respect to a specific set of methods for estimating the historical time course of a population-mean polygenic score (PGS) using the marginal coalescent trees encoded by the ancestral recombination graph (ARG). Here we examine the performance in simulation of six ARG estimation methods: ARGweaver, RENT+, Relate, tsinfer+tsdate, ARG-Needle/ASMC-clust, and SINGER, using their estimated coalescent trees and examining bias, mean squared error (MSE), confidence interval coverage, and Type I and II error rates of the downstream methods. Although it does not scale to the sample sizes attainable by other new methods, SINGER produced the most accurate estimated PGS histories in many instances, even when Relate, tsinfer+tsdate, and ARG-Needle/ASMC-clust used samples ten times as large as those used by SINGER. In general, the best choice of method depends on the number of samples available and the historical time period of interest. In particular, the unprecedented sample sizes allowed by Relate, tsinfer+tsdate, and ARG-Needle/ASMC-clust are of greatest importance when the recent past is of interest: further back in time, most of the tree has coalesced, and differences in contemporary sample size are less salient.
3
Brandt DYC, Huber CD, Chiang CWK, Ortega-Del Vecchyo D. The Promise of Inferring the Past Using the Ancestral Recombination Graph. Genome Biol Evol 2024; 16:evae005. [PMID: 38242694 PMCID: PMC10834162 DOI: 10.1093/gbe/evae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 12/11/2023] [Accepted: 12/17/2023] [Indexed: 01/21/2024] Open
Abstract
The ancestral recombination graph (ARG) is a structure that represents the history of coalescent and recombination events connecting a set of sequences (Hudson RR. In: Futuyma D, Antonovics J, editors. Gene genealogies and the coalescent process. In: Oxford Surveys in Evolutionary Biology; 1991. p. 1-44.). The full ARG can be represented as a set of genealogical trees at every locus in the genome, annotated with recombination events that change the topology of the trees between adjacent loci and the mutations that occurred along the branches of those trees (Griffiths RC, Marjoram P. An ancestral recombination graph. In: Donnelly P, Tavare S, editors. Progress in population genetics and human evolution. Springer; 1997. p. 257-270.). Valuable insights can be gained into past evolutionary processes, such as demographic events or the influence of natural selection, by studying the ARG. It is regarded as the "holy grail" of population genetics (Hubisz M, Siepel A. Inference of ancestral recombination graphs using ARGweaver. In: Dutheil JY, editors. Statistical population genomics. New York, NY: Springer US; 2020. p. 231-266.) since it encodes the processes that generate all patterns of allelic and haplotypic variation from which all commonly used summary statistics in population genetic research (e.g. heterozygosity and linkage disequilibrium) can be derived. Many previous evolutionary inferences relied on summary statistics extracted from the genotype matrix. Evolutionary inferences using the ARG represent a significant advancement as the ARG is a representation of the evolutionary history of a sample that shows the past history of recombination, coalescence, and mutation events across a particular sequence. This representation in theory contains as much information, if not more, than the combination of all independent summary statistics that could be derived from the genotype matrix.
Consistent with this idea, some of the first ARG-based analyses have proven to be more powerful than summary statistic-based analyses (Speidel L, Forest M, Shi S, Myers SR. A method for genome-wide genealogy estimation for thousands of samples. Nat Genet. 2019:51(9):1321-1329.; Stern AJ, Wilton PR, Nielsen R. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. PLoS Genet. 2019:15(9):e1008384.; Hubisz MJ, Williams AL, Siepel A. Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph. PLoS Genet. 2020:16(8):e1008895.; Fan C, Mancuso N, Chiang CWK. A genealogical estimate of genetic relationships. Am J Hum Genet. 2022:109(5):812-824.; Fan C, Cahoon JL, Dinh BL, Ortega-Del Vecchyo D, Huber C, Edge MD, Mancuso N, Chiang CWK. A likelihood-based framework for demographic inference from genealogical trees. bioRxiv. 2023.10.10.561787. 2023.; Hejase HA, Mo Z, Campagna L, Siepel A. A deep-learning approach for inference of selective sweeps from the ancestral recombination graph. Mol Biol Evol. 2022:39(1):msab332.; Link V, Schraiber JG, Fan C, Dinh B, Mancuso N, Chiang CWK, Edge MD. Tree-based QTL mapping with expected local genetic relatedness matrices. bioRxiv. 2023.04.07.536093. 2023.; Zhang BC, Biddanda A, Gunnarsson ÁF, Cooper F, Palamara PF. Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nat Genet. 2023:55(5):768-776.). As such, there has been significant interest in the field to investigate 2 main problems related to the ARG: (i) How can we estimate the ARG based on genomic data, and (ii) how can we extract information of past evolutionary processes from the ARG?
In this perspective, we highlight 3 topics that pertain to these main issues: The development of computational innovations that enable the estimation of the ARG; remaining challenges in estimating the ARG; and methodological advances for deducing evolutionary forces and mechanisms using the ARG. This perspective serves to introduce the readers to the types of questions that can be explored using the ARG and to highlight some of the most pressing issues that must be addressed in order to make ARG-based inference an indispensable tool for evolutionary research.
Affiliation(s)
- Débora Y C Brandt
  - Department of Genetics Evolution and Environment, University College London, London, UK
- Christian D Huber
  - Department of Biology, Pennsylvania State University, University Park, PA, USA
- Charleston W K Chiang
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Diego Ortega-Del Vecchyo
  - Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma De México, Querétaro, Querétaro, Mexico
4
Irving-Pease EK, Refoyo-Martínez A, Barrie W, Ingason A, Pearson A, Fischer A, Sjögren KG, Halgren AS, Macleod R, Demeter F, Henriksen RA, Vimala T, McColl H, Vaughn AH, Speidel L, Stern AJ, Scorrano G, Ramsøe A, Schork AJ, Rosengren A, Zhao L, Kristiansen K, Iversen AKN, Fugger L, Sudmant PH, Lawson DJ, Durbin R, Korneliussen T, Werge T, Allentoft ME, Sikora M, Nielsen R, Racimo F, Willerslev E. The selection landscape and genetic legacy of ancient Eurasians. Nature 2024; 625:312-320. [PMID: 38200293 PMCID: PMC10781624 DOI: 10.1038/s41586-023-06705-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 09/17/2022] [Accepted: 10/03/2023] [Indexed: 01/12/2024]
Abstract
The Holocene (beginning around 12,000 years ago) encompassed some of the most significant changes in human evolution, with far-reaching consequences for the dietary, physical and mental health of present-day populations. Using a dataset of more than 1,600 imputed ancient genomes [1], we modelled the selection landscape during the transition from hunting and gathering to farming and pastoralism across West Eurasia. We identify key selection signals related to metabolism, including that selection at the FADS cluster began earlier than previously reported and that selection near the LCT locus predates the emergence of the lactase persistence allele by thousands of years. We also find strong selection in the HLA region, possibly due to increased exposure to pathogens during the Bronze Age. Using ancient individuals to infer local ancestry tracts in over 400,000 samples from the UK Biobank, we identify widespread differences in the distribution of Mesolithic, Neolithic and Bronze Age ancestries across Eurasia. By calculating ancestry-specific polygenic risk scores, we show that height differences between Northern and Southern Europe are associated with differential Steppe ancestry, rather than selection, and that risk alleles for mood-related phenotypes are enriched for Neolithic farmer ancestry, whereas risk alleles for diabetes and Alzheimer's disease are enriched for Western hunter-gatherer ancestry. Our results indicate that ancient selection and migration were large contributors to the distribution of phenotypic diversity in present-day Europeans.
Affiliation(s)
- Evan K Irving-Pease
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Alba Refoyo-Martínez
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- William Barrie
  - GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK
- Andrés Ingason
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Roskilde, Denmark
- Alice Pearson
  - Department of Genetics, University of Cambridge, Cambridge, UK
  - Department of Zoology, University of Cambridge, Cambridge, UK
- Anders Fischer
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
  - Sealand Archaeology, Kalundborg, Denmark
- Karl-Göran Sjögren
  - Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
- Alma S Halgren
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Ruairidh Macleod
  - GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK
  - UCL Genetics Institute, University College London, London, UK
- Fabrice Demeter
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Eco-anthropologie, Muséum national d'Histoire naturelle, CNRS, Université Paris Cité, Musée de l'Homme, Paris, France
- Rasmus A Henriksen
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Tharsika Vimala
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Hugh McColl
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Andrew H Vaughn
  - Center for Computational Biology, University of California, Berkeley, CA, USA
- Leo Speidel
  - UCL Genetics Institute, University College London, London, UK
  - Ancient Genomics Laboratory, The Francis Crick Institute, London, UK
- Aaron J Stern
  - Center for Computational Biology, University of California, Berkeley, CA, USA
- Gabriele Scorrano
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Abigail Ramsøe
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Andrew J Schork
  - Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Roskilde, Denmark
  - Neurogenomics Division, The Translational Genomics Research Institute (TGEN), Phoenix, AZ, USA
- Anders Rosengren
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Institute of Biological Psychiatry, Mental Health Services, Copenhagen University Hospital, Roskilde, Denmark
- Lei Zhao
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Kristian Kristiansen
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Department of Historical Studies, University of Gothenburg, Gothenburg, Sweden
- Astrid K N Iversen
  - Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
  - Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
- Lars Fugger
  - Oxford Centre for Neuroinflammation, Nuffield Department of Clinical Neurosciences, John Radcliffe Hospital, University of Oxford, Oxford, UK
  - Department of Clinical Medicine, Aarhus University Hospital, Aarhus, Denmark
  - MRC Human Immunology Unit, John Radcliffe Hospital, University of Oxford, Oxford, UK
- Peter H Sudmant
  - Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
  - Center for Computational Biology, University of California, Berkeley, CA, USA
- Daniel J Lawson
  - Institute of Statistical Sciences, School of Mathematics, University of Bristol, Bristol, UK
- Richard Durbin
  - Department of Genetics, University of Cambridge, Cambridge, UK
  - Wellcome Sanger Institute, Cambridge, UK
- Thorfinn Korneliussen
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Thomas Werge
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
  - Institute of Biological Psychiatry, Mental Health Center Sct Hans, Copenhagen University Hospital, Copenhagen, Denmark
- Morten E Allentoft
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Trace and Environmental DNA (TrEnD) Laboratory, School of Molecular and Life Science, Curtin University, Perth, Western Australia, Australia
- Martin Sikora
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Rasmus Nielsen
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - Departments of Integrative Biology and Statistics, UC Berkeley, Berkeley, CA, USA
- Fernando Racimo
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Eske Willerslev
  - Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen, Denmark
  - GeoGenetics Group, Department of Zoology, University of Cambridge, Cambridge, UK
  - MARUM Center for Marine Environmental Sciences and Faculty of Geosciences, University of Bremen, Bremen, Germany
5
Lewanski AL, Grundler MC, Bradburd GS. The era of the ARG: An introduction to ancestral recombination graphs and their significance in empirical evolutionary genomics. PLoS Genet 2024; 20:e1011110. [PMID: 38236805 PMCID: PMC10796009 DOI: 10.1371/journal.pgen.1011110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2024] Open
Abstract
In the presence of recombination, the evolutionary relationships between a set of sampled genomes cannot be described by a single genealogical tree. Instead, the genomes are related by a complex, interwoven collection of genealogies formalized in a structure called an ancestral recombination graph (ARG). An ARG extensively encodes the ancestry of the genome(s) and thus is replete with valuable information for addressing diverse questions in evolutionary biology. Despite its potential utility, technological and methodological limitations, along with a lack of approachable literature, have severely restricted awareness and application of ARGs in evolution research. Excitingly, recent progress in ARG reconstruction and simulation has made ARG-based approaches feasible for many questions and systems. In this review, we provide an accessible introduction and exploration of ARGs, survey recent methodological breakthroughs, and describe the potential for ARGs to further existing goals and open avenues of inquiry that were previously inaccessible in evolutionary genomics. Through this discussion, we aim to more widely disseminate the promise of ARGs in evolutionary genomics and encourage the broader development and adoption of ARG-based inference.
Affiliation(s)
- Alexander L. Lewanski
  - Department of Integrative Biology, Michigan State University, East Lansing, Michigan, United States of America
  - W.K. Kellogg Biological Station, Michigan State University, Hickory Corners, Michigan, United States of America
  - Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, Michigan, United States of America
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Michael C. Grundler
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
- Gideon S. Bradburd
  - W.K. Kellogg Biological Station, Michigan State University, Hickory Corners, Michigan, United States of America
  - Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, Michigan, United States of America
6
Fournier R, Tsangalidou Z, Reich D, Palamara PF. Haplotype-based inference of recent effective population size in modern and ancient DNA samples. Nat Commun 2023; 14:7945. [PMID: 38040695 PMCID: PMC10692198 DOI: 10.1038/s41467-023-43522-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Accepted: 11/10/2023] [Indexed: 12/03/2023] Open
Abstract
Individuals sharing recent ancestors are likely to co-inherit large identical-by-descent (IBD) genomic regions. The distribution of these IBD segments in a population may be used to reconstruct past demographic events such as effective population size variation, but accurate IBD detection is difficult in ancient DNA data and in underrepresented populations with limited reference data. In this work, we introduce an accurate method for inferring effective population size variation during the past ~2000 years in both modern and ancient DNA data, called HapNe. HapNe infers recent population size fluctuations using either IBD sharing (HapNe-IBD) or linkage disequilibrium (HapNe-LD), which does not require phasing and can be computed in low coverage data, including data sets with heterogeneous sampling times. HapNe shows improved accuracy in a range of simulated demographic scenarios compared to currently available methods for IBD-based and LD-based inference of recent effective population size, while requiring fewer computational resources. We apply HapNe to several modern populations from the 1000 Genomes Project, the UK Biobank, the Allen Ancient DNA Resource, and recently published samples from Iron Age Britain, detecting multiple instances of recent effective population size variation across these groups.
Affiliation(s)
- David Reich
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
- Pier Francesco Palamara
  - Department of Statistics, University of Oxford, Oxford, UK
  - Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
7
Lewanski AL, Grundler MC, Bradburd GS. The era of the ARG: an empiricist's guide to ancestral recombination graphs. arXiv 2023:arXiv:2310.12070v1. [PMID: 37904740 PMCID: PMC10614969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023]
Abstract
In the presence of recombination, the evolutionary relationships between a set of sampled genomes cannot be described by a single genealogical tree. Instead, the genomes are related by a complex, interwoven collection of genealogies formalized in a structure called an ancestral recombination graph (ARG). An ARG extensively encodes the ancestry of the genome(s) and thus is replete with valuable information for addressing diverse questions in evolutionary biology. Despite its potential utility, technological and methodological limitations, along with a lack of approachable literature, have severely restricted awareness and application of ARGs in empirical evolution research. Excitingly, recent progress in ARG reconstruction and simulation has made ARG-based approaches feasible for many questions and systems. In this review, we provide an accessible introduction and exploration of ARGs, survey recent methodological breakthroughs, and describe the potential for ARGs to further existing goals and open avenues of inquiry that were previously inaccessible in evolutionary genomics. Through this discussion, we aim to more widely disseminate the promise of ARGs in evolutionary genomics and encourage the broader development and adoption of ARG-based inference.
Affiliation(s)
- Alexander L Lewanski
- Department of Integrative Biology, Michigan State University, East Lansing, MI, US
- W.K. Kellogg Biological Station, Michigan State University, Hickory Corners, MI, US
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI, US
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, US
- Michael C Grundler
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, US
- Gideon S Bradburd
- W.K. Kellogg Biological Station, Michigan State University, Hickory Corners, MI, US
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI, US
|
8
|
Moorjani P, Hellenthal G. Methods for Assessing Population Relationships and History Using Genomic Data. Annu Rev Genomics Hum Genet 2023; 24:305-332. [PMID: 37220313 PMCID: PMC11040641 DOI: 10.1146/annurev-genom-111422-025117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2023]
Abstract
Genetic data contain a record of our evolutionary history. The availability of large-scale datasets of human populations from various geographic areas and timescales, coupled with advances in the computational methods to analyze these data, has transformed our ability to use genetic data to learn about our evolutionary past. Here, we review some of the widely used statistical methods to explore and characterize population relationships and history using genomic data. We describe the intuition behind commonly used approaches, their interpretation, and important limitations. For illustration, we apply some of these techniques to genome-wide autosomal data from 929 individuals representing 53 worldwide populations that are part of the Human Genome Diversity Project. Finally, we discuss the new frontiers in genomic methods to learn about population history. In sum, this review highlights the power (and limitations) of DNA to infer features of human evolutionary history, complementing the knowledge gleaned from other disciplines, such as archaeology, anthropology, and linguistics.
Affiliation(s)
- Priya Moorjani
- Department of Molecular and Cell Biology and Center for Computational Biology, University of California, Berkeley, California, USA
- Garrett Hellenthal
- UCL Genetics Institute and Research Department of Genetics, Evolution, and Environment, University College London, London, United Kingdom
|
9
|
Zhang BC, Biddanda A, Gunnarsson ÁF, Cooper F, Palamara PF. Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nat Genet 2023; 55:768-776. [PMID: 37127670 PMCID: PMC10181934 DOI: 10.1038/s41588-023-01379-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 10/10/2021] [Accepted: 03/22/2023] [Indexed: 05/03/2023]
Abstract
Genome-wide genealogies compactly represent the evolutionary history of a set of genomes and inferring them from genetic data has the potential to facilitate a wide range of analyses. We introduce a method, ARG-Needle, for accurately inferring biobank-scale genealogies from sequencing or genotyping array data, as well as strategies to utilize genealogies to perform association and other complex trait analyses. We use these methods to build genome-wide genealogies using genotyping data for 337,464 UK Biobank individuals and test for association across seven complex traits. Genealogy-based association detects more rare and ultra-rare signals (N = 134, frequency range 0.0007-0.1%) than genotype imputation using ~65,000 sequenced haplotypes (N = 64). In a subset of 138,039 exome sequencing samples, these associations strongly tag (average r = 0.72) underlying sequencing variants enriched (4.8×) for loss-of-function variation. These results demonstrate that inferred genome-wide genealogies may be leveraged in the analysis of complex traits, complementing approaches that require the availability of large, population-specific sequencing panels.
Affiliation(s)
- Brian C Zhang
- Department of Statistics, University of Oxford, Oxford, UK
- Arjun Biddanda
- Department of Statistics, University of Oxford, Oxford, UK
- Árni Freyr Gunnarsson
- Department of Statistics, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Fergus Cooper
- Department of Computer Science, University of Oxford, Oxford, UK
- Pier Francesco Palamara
- Department of Statistics, University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
|
10
|
Bloom JD, Beichman AC, Neher RA, Harris K. Evolution of the SARS-CoV-2 Mutational Spectrum. Mol Biol Evol 2023; 40:msad085. [PMID: 37039557 PMCID: PMC10124870 DOI: 10.1093/molbev/msad085] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 11/19/2022] [Revised: 02/07/2023] [Accepted: 04/06/2023] [Indexed: 04/12/2023]
Abstract
SARS-CoV-2 evolves rapidly in part because of its high mutation rate. Here, we examine whether this mutational process itself has changed during viral evolution. To do this, we quantify the relative rates of different types of single-nucleotide mutations at 4-fold degenerate sites in the viral genome across millions of human SARS-CoV-2 sequences. We find clear shifts in the relative rates of several types of mutations during SARS-CoV-2 evolution. The most striking trend is a roughly 2-fold decrease in the relative rate of G→T mutations in Omicron versus early clades, as was recently noted by Ruis et al. (2022. Mutational spectra distinguish SARS-CoV-2 replication niches. bioRxiv, doi:10.1101/2022.09.27.509649). There is also a decrease in the relative rate of C→T mutations in Delta, and other subtle changes in the mutation spectrum along the phylogeny. We speculate that these changes in the mutation spectrum could arise from viral mutations that affect genome replication, packaging, and antagonization of host innate-immune factors, although environmental factors could also play a role. Interestingly, the mutation spectrum of Omicron is more similar than that of earlier SARS-CoV-2 clades to the spectrum that shaped the long-term evolution of sarbecoviruses. Overall, our work shows that the mutation process is itself a dynamic variable during SARS-CoV-2 evolution and suggests that human SARS-CoV-2 may be trending toward a mutation spectrum more similar to that of other animal sarbecoviruses.
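As an illustration of the quantity this abstract refers to, the following minimal Python sketch computes relative rates for the 12 single-nucleotide mutation types from observed mutation counts, normalising each count by the number of sites whose ancestral base could give rise to that type. It is not the authors' pipeline; the function name `mutation_spectrum`, its inputs, and the toy numbers are assumptions made for illustration only.

```python
from collections import Counter

BASES = "ACGT"
# The 12 possible single-nucleotide mutation types, written ancestral>derived.
MUTATION_TYPES = [f"{a}>{d}" for a in BASES for d in BASES if a != d]


def mutation_spectrum(mutations, site_counts):
    """Return the relative rate of each mutation type (values sum to 1).

    mutations   : iterable of (ancestral, derived) base pairs (hypothetical input)
    site_counts : dict mapping each base to the number of sites that carry it
                  as the ancestral state (for example, 4-fold degenerate sites)
    """
    counts = Counter(f"{a}>{d}" for a, d in mutations)
    rates = {}
    for m in MUTATION_TYPES:
        n_sites = site_counts.get(m[0], 0)
        # Per-site rate: observed mutations divided by the sites that could mutate.
        rates[m] = counts[m] / n_sites if n_sites else 0.0
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no mutations observed at the supplied sites")
    # Rescale so the spectrum expresses relative, not absolute, rates.
    return {m: r / total for m, r in rates.items()}


if __name__ == "__main__":
    # Toy data, invented for this example.
    toy_mutations = [("C", "T")] * 6 + [("G", "T")] * 2
    toy_sites = {"A": 10, "C": 20, "G": 10, "T": 10}
    spectrum = mutation_spectrum(toy_mutations, toy_sites)
    print(round(spectrum["C>T"], 3), round(spectrum["G>T"], 3))
```

Comparing two such spectra, for instance one per clade, gives the kind of fold change in a relative rate that the abstract describes for G→T mutations.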
Affiliation(s)
- Jesse D Bloom
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA
- Department of Genome Sciences, University of Washington, Seattle, WA
- Howard Hughes Medical Institute, Seattle, WA
- Richard A Neher
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, WA
|
11
|
Gao Z, Zhang Y, Cramer N, Przeworski M, Moorjani P. Limited role of generation time changes in driving the evolution of the mutation spectrum in humans. eLife 2023; 12:e81188. [PMID: 36779395 PMCID: PMC10014080 DOI: 10.7554/elife.81188] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/18/2022] [Accepted: 02/02/2023] [Indexed: 02/14/2023]
Abstract
Recent studies have suggested that the human germline mutation rate and spectrum evolve rapidly. Variation in generation time has been linked to these changes, though its contribution remains unclear. We develop a framework to characterize temporal changes in polymorphisms within and between populations, while controlling for the effects of natural selection and biased gene conversion. Application to the 1000 Genomes Project dataset reveals multiple independent changes that arose after the split of continental groups, including a previously reported, transient elevation in TCC>TTC mutations in Europeans and novel signals of divergence in C>G and T>A mutation rates among population samples. We also find a significant difference between groups sampled in and outside of Africa in old T>C polymorphisms that predate the out-of-Africa migration. This surprising signal is driven by TpG>CpG mutations and stems in part from mis-polarized CpG transitions, which are more likely to undergo recurrent mutations. Finally, by relating the mutation spectrum of polymorphisms to parental age effects on de novo mutations, we show that plausible changes in the generation time cannot explain the patterns observed for different mutation types jointly. Thus, other factors - genetic modifiers or environmental exposures - must have had a non-negligible impact on the human mutation landscape.
Affiliation(s)
- Ziyue Gao
- Department of Genetics, University of Pennsylvania, Perelman School of Medicine, Philadelphia, United States
- Yulin Zhang
- Center for Computational Biology, University of California, Berkeley, Berkeley, United States
- Nathan Cramer
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
- Molly Przeworski
- Department of Biological Sciences, Columbia University, New York, United States
- Department of Systems Biology, Columbia University, New York, United States
- Priya Moorjani
- Center for Computational Biology, University of California, Berkeley, Berkeley, United States
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, United States
|
12
|
Nickchi P, Karunarathna C, Graham J. An exploration of linkage fine-mapping on sequences from case-control studies. Genet Epidemiol 2023; 47:78-94. [PMID: 36047334 PMCID: PMC10087369 DOI: 10.1002/gepi.22502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Revised: 05/30/2022] [Accepted: 08/09/2022] [Indexed: 02/01/2023]
Abstract
Linkage analysis maps genetic loci for a heritable trait by identifying genomic regions with excess relatedness among individuals with similar trait values. Analysis may be conducted on related individuals from families, or on samples of unrelated individuals from a population. For allelically heterogeneous traits, population-based linkage analysis can be more powerful than genotypic-association analysis. Here, we focus on linkage analysis in a population sample, but use sequences rather than individuals as our unit of observation. Earlier investigations of sequence-based linkage mapping relied on known sequence relatedness, whereas we infer relatedness from the sequence data. We propose two ways to associate similarity in relatedness of sequences with similarity in their trait values and compare the resulting linkage methods to two genotypic-association methods. We also introduce a procedure to label case sequences as potential carriers or noncarriers of causal variants after an association has been found. This post hoc labeling of case sequences is based on inferred relatedness to other case sequences. Our simulation results indicate that methods based on sequence relatedness improve localization and perform as well as genotypic-association methods for detecting rare causal variants. Sequence-based linkage analysis therefore has potential to fine-map allelically heterogeneous disease traits.
Affiliation(s)
- Payman Nickchi
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia, Canada
- Charith Karunarathna
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia, Canada; Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Jinko Graham
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, British Columbia, Canada
|
13
|
Aqil A, Speidel L, Pavlidis P, Gokcumen O. Balancing selection on genomic deletion polymorphisms in humans. eLife 2023; 12:79111. [PMID: 36625544 PMCID: PMC9943071 DOI: 10.7554/elife.79111] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/31/2022] [Accepted: 01/05/2023] [Indexed: 01/11/2023]
Abstract
A key question in biology is why genomic variation persists in a population for extended periods. Recent studies have identified examples of genomic deletions that have remained polymorphic in the human lineage for hundreds of millennia, ostensibly owing to balancing selection. Nevertheless, genome-wide investigation of ancient and possibly adaptive deletions remains an imperative exercise. Here, we demonstrate an excess of polymorphisms in present-day humans that predate the modern human-Neanderthal split (ancient polymorphisms), which cannot be explained solely by selectively neutral scenarios. We analyze the adaptive mechanisms that underlie this excess in deletion polymorphisms. Using a previously published measure of balancing selection, we show that this excess of ancient deletions is largely owing to balancing selection. Based on the absence of signatures of overdominance, we conclude that it is a rare mode of balancing selection among ancient deletions. Instead, more complex scenarios involving spatially and temporally variable selective pressures are likely more common mechanisms. Our results suggest that balancing selection resulted in ancient deletions harboring disproportionately more exonic variants with GWAS (genome-wide association studies) associations. We further found that ancient deletions are significantly enriched for traits related to metabolism and immunity. As a by-product of our analysis, we show that deletions are, on average, more deleterious than single nucleotide variants. We can now argue that not only is a vast majority of common variants shared among human populations, but a considerable portion of biologically relevant variants has been segregating among our ancestors for hundreds of thousands, if not millions, of years.
Affiliation(s)
- Alber Aqil
- Department of Biological Sciences, University at Buffalo, Buffalo, United States
- Leo Speidel
- University College London, Genetics Institute, London, United Kingdom
- The Francis Crick Institute, London, United Kingdom
- Pavlos Pavlidis
- Institute of Computer Science (ICS), Foundation of Research and Technology-Hellas, Heraklion, Greece
- Omer Gokcumen
- Department of Biological Sciences, University at Buffalo, Buffalo, United States
|
14
|
Bloom JD, Beichman AC, Neher RA, Harris K. Evolution of the SARS-CoV-2 mutational spectrum. bioRxiv 2022:2022.11.19.517207. [PMID: 36451887 PMCID: PMC9709787 DOI: 10.1101/2022.11.19.517207] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/17/2023]
Abstract
SARS-CoV-2 evolves rapidly in part because of its high mutation rate. Here we examine whether this mutational process itself has changed during viral evolution. To do this, we quantify the relative rates of different types of single nucleotide mutations at four-fold degenerate sites in the viral genome across millions of human SARS-CoV-2 sequences. We find clear shifts in the relative rates of several types of mutations during SARS-CoV-2 evolution. The most striking trend is a roughly two-fold decrease in the relative rate of G→T mutations in Omicron versus early clades, as was recently noted by Ruis et al (2022). There is also a decrease in the relative rate of C→T mutations in Delta, and other subtle changes in the mutation spectrum along the phylogeny. We speculate that these changes in the mutation spectrum could arise from viral mutations that affect genome replication, packaging, and antagonization of host innate-immune factors-although environmental factors could also play a role. Interestingly, the mutation spectrum of Omicron is more similar than that of earlier SARS-CoV-2 clades to the spectrum that shaped the long-term evolution of sarbecoviruses. Overall, our work shows that the mutation process is itself a dynamic variable during SARS-CoV-2 evolution, and suggests that human SARS-CoV-2 may be trending towards a mutation spectrum more similar to that of other animal sarbecoviruses.
Affiliation(s)
- Jesse D Bloom
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
- Annabel C Beichman
- Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, USA
- Richard A Neher
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
15
|
Brace S, Diekmann Y, Booth T, Macleod R, Timpson A, Stephen W, Emery G, Cabot S, Thomas MG, Barnes I. Genomes from a medieval mass burial show Ashkenazi-associated hereditary diseases pre-date the 12th century. Curr Biol 2022; 32:4350-4359.e6. [PMID: 36044903 PMCID: PMC10499757 DOI: 10.1016/j.cub.2022.08.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/21/2022] [Revised: 07/26/2022] [Accepted: 08/12/2022] [Indexed: 11/16/2022]
Abstract
We report genome sequence data from six individuals excavated from the base of a medieval well at a site in Norwich, UK. A revised radiocarbon analysis of the assemblage is consistent with these individuals being part of a historically attested episode of antisemitic violence on 6 February 1190 CE. We find that four of these individuals were closely related and all six have strong genetic affinities with modern Ashkenazi Jews. We identify four alleles associated with genetic disease in Ashkenazi Jewish populations and infer variation in pigmentation traits, including the presence of red hair. Simulations indicate that Ashkenazi-associated genetic disease alleles were already at appreciable frequencies, centuries earlier than previously hypothesized. These findings provide new insights into a significant historical crime, into Ashkenazi population history, and into the origins of genetic diseases associated with modern Jewish populations.
Affiliation(s)
- Selina Brace
- Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Yoan Diekmann
- Research Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK; Palaeogenetics Group, Institute of Organismic and Molecular Evolution (iomE), Johannes Gutenberg-University Mainz, 55099 Mainz, Germany
- Thomas Booth
- Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK; Francis Crick Institute, London NW1 1AT, UK; UCL Genetics Institute, University College London, London, UK
- Ruairidh Macleod
- Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK; Research Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK; Department of Archaeology, University of Cambridge, Downing Street, Cambridge CB2 3DZ, UK
- Adrian Timpson
- Research Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Will Stephen
- Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Giles Emery
- Norvic Archaeology, 7 Foxburrow Road, Norwich NR7 8QU, UK
- Sophie Cabot
- Norfolk Record Office, The Archive Centre, Martineau Lane, Norwich, Norfolk NR1 2DQ, UK
- Mark G Thomas
- Research Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Ian Barnes
- Department of Earth Sciences, The Natural History Museum, Cromwell Road, London SW7 5BD, UK
|
16
|
Upadhya G, Steinrücken M. Robust inference of population size histories from genomic sequencing data. PLoS Comput Biol 2022; 18:e1010419. [PMID: 36112715 PMCID: PMC9518926 DOI: 10.1371/journal.pcbi.1010419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2021] [Revised: 09/28/2022] [Accepted: 07/21/2022] [Indexed: 02/08/2023]
Abstract
Unraveling the complex demographic histories of natural populations is a central problem in population genetics. Understanding past demographic events is of general anthropological interest, but is also an important step in establishing accurate null models when identifying adaptive or disease-associated genetic variation. An important class of tools for inferring past population size changes from genomic sequence data are Coalescent Hidden Markov Models (CHMMs). These models make efficient use of the linkage information in population genomic datasets by using the local genealogies relating sampled individuals as latent states that evolve along the chromosome in an HMM framework. Extending these models to large sample sizes is challenging, since the number of possible latent states increases rapidly. Here, we present our method CHIMP (CHMM History-Inference Maximum-Likelihood Procedure), a novel CHMM method for inferring the size history of a population. It can be applied to large samples (hundreds of haplotypes) and only requires unphased genomes as input. The two implementations of CHIMP that we present here use either the height of the genealogical tree (TMRCA) or the total branch length, respectively, as the latent variable at each position in the genome. The requisite transition and emission probabilities are obtained by numerically solving certain systems of differential equations derived from the ancestral process with recombination. The parameters of the population size history are subsequently inferred using an Expectation-Maximization algorithm. In addition, we implement a composite likelihood scheme to allow the method to scale to large sample sizes. We demonstrate the efficiency and accuracy of our method in a variety of benchmark tests using simulated data and present comparisons to other state-of-the-art methods. 
Specifically, our implementation using TMRCA as the latent variable shows comparable performance and provides accurate estimates of effective population sizes in intermediate and ancient times. Our method is agnostic to the phasing of the data, which makes it a promising alternative in scenarios where high quality data is not available, and has potential applications for pseudo-haploid data.
Affiliation(s)
- Gautam Upadhya
- Department of Physics, University of Chicago, Chicago, Illinois, United States of America
- Matthias Steinrücken
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
|
17
|
Milligan WR, Amster G, Sella G. The impact of genetic modifiers on variation in germline mutation rates within and among human populations. Genetics 2022; 221:6603115. [PMID: 35666194 DOI: 10.1093/genetics/iyac087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 05/16/2022] [Indexed: 11/14/2022]
Abstract
Mutation rates and spectra differ among human populations. Here, we examine whether this variation could be explained by evolution at mutation modifiers. To this end, we consider genetic modifier sites at which mutations, "mutator alleles", increase genome-wide mutation rates and model their evolution under purifying selection due to the additional deleterious mutations that they cause, genetic drift, and demographic processes. We solve the model analytically for a constant population size and characterize how evolution at modifier sites impacts variation in mutation rates within and among populations. We then use simulations to study the effects of modifier sites under a plausible demographic model for Africans and Europeans. When comparing populations that evolve independently, weakly selected modifier sites (2N_e s ≈ 1), which evolve slowly, contribute the most to variation in mutation rates. In contrast, when populations recently split from a common ancestral population, strongly selected modifier sites (2N_e s ≫ 1), which evolve rapidly, contribute the most to variation between them. Moreover, a modest number of modifier sites (e.g., 10 per mutation type in the standard classification into 96 types) subject to moderate to strong selection (2N_e s > 1) could account for the variation in mutation rates observed among human populations. If such modifier sites indeed underlie differences among populations, they should also cause variation in mutation rates within populations and their effects should be detectable in pedigree studies.
Affiliation(s)
- William R Milligan
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA
- Guy Amster
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA; Flatiron Health Inc., New York, NY 10013, USA
- Guy Sella
- Department of Biological Sciences, Columbia University, New York, NY 10027, USA; Program for Mathematical Genomics, Columbia University, New York, NY 10032, USA
|
18
|
The genomic origins of the world's first farmers. Cell 2022; 185:1842-1859.e18. [PMID: 35561686 PMCID: PMC9166250 DOI: 10.1016/j.cell.2022.04.008] [Citation(s) in RCA: 31] [Impact Index Per Article: 15.5] [Received: 12/13/2021] [Revised: 03/04/2022] [Accepted: 04/06/2022] [Indexed: 11/24/2022]
Abstract
The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.
|
19
|
Fan C, Mancuso N, Chiang CW. A genealogical estimate of genetic relationships. Am J Hum Genet 2022; 109:812-824. [PMID: 35417677 DOI: 10.1016/j.ajhg.2022.03.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/24/2021] [Accepted: 03/25/2022] [Indexed: 12/23/2022]
Abstract
The application of genetic relationships among individuals, characterized by a genetic relationship matrix (GRM), has far-reaching effects in human genetics. However, the current standard to calculate the GRM treats linked markers as independent and does not explicitly model the underlying genealogical history of the study sample. Here, we propose a coalescent-informed framework, namely the expected GRM (eGRM), to infer the expected relatedness between pairs of individuals given an ancestral recombination graph (ARG) of the sample. Through extensive simulations, we show that the eGRM is an unbiased estimate of latent pairwise genome-wide relatedness and is robust when computed with ARG inferred from incomplete genetic data. As a result, the eGRM better captures the structure of a population than the canonical GRM, even when using the same genetic information. More importantly, our framework allows a principled approach to estimate the eGRM at different time depths of the ARG, thereby revealing the time-varying nature of population structure in a sample. When applied to SNP array genotypes from a population sample from Northern and Eastern Finland, we find that clustering analysis with the eGRM reveals population structure driven by subpopulations that would not be apparent via the canonical GRM and that temporally the population model is consistent with recent divergence and expansion. Taken together, our proposed eGRM provides a robust tree-centric estimate of relatedness with wide application to genetic studies.
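For context, the canonical GRM that this abstract contrasts with the eGRM is commonly computed by standardising each marker's allele counts and averaging the cross-products over markers, which is what treats linked markers as independent. The sketch below shows that standard calculation in plain Python. It does not implement the eGRM, and the function name `canonical_grm` and the toy genotypes are assumptions made for illustration.

```python
def canonical_grm(genotypes):
    """Compute a canonical genetic relationship matrix (GRM).

    genotypes : list of N lists, each holding M allele counts (0, 1 or 2)
                for one diploid individual (hypothetical input layout).
    Returns an N x N list of lists.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency at each marker.
    freqs = [sum(g[j] for g in genotypes) / (2 * n) for j in range(m)]
    # Monomorphic markers carry no information and would divide by zero.
    keep = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not keep:
        raise ValueError("no polymorphic markers")
    # Standardise: centre by 2p and scale by sqrt(2p(1-p)).
    z = [
        [(g[j] - 2 * freqs[j]) / (2 * freqs[j] * (1 - freqs[j])) ** 0.5 for j in keep]
        for g in genotypes
    ]
    # Average cross-product over markers, each marker weighted equally.
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / len(keep) for k in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Toy genotypes for three individuals at three markers, invented for this example.
    toy = [[0, 1, 2], [2, 1, 0], [1, 1, 1]]
    for row in canonical_grm(toy):
        print([round(x, 3) for x in row])
```

Because every marker contributes one equally weighted term, correlation between nearby markers and the underlying genealogy are ignored, which is the limitation the abstract describes.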
|
20
|
Fulgione A, Neto C, Elfarargi AF, Tergemina E, Ansari S, Göktay M, Dinis H, Döring N, Flood PJ, Rodriguez-Pacheco S, Walden N, Koch MA, Roux F, Hermisson J, Hancock AM. Parallel reduction in flowering time from de novo mutations enable evolutionary rescue in colonizing lineages. Nat Commun 2022; 13:1461. [PMID: 35304466 PMCID: PMC8933414 DOI: 10.1038/s41467-022-28800-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/23/2021] [Accepted: 02/07/2022] [Indexed: 12/11/2022]
Abstract
Understanding how populations adapt to abrupt environmental change is necessary to predict responses to future challenges, but identifying specific adaptive variants, quantifying their responses to selection and reconstructing their detailed histories is challenging in natural populations. Here, we use Arabidopsis from the Cape Verde Islands as a model to investigate the mechanisms of adaptation after a sudden shift to a more arid climate. We find genome-wide evidence of adaptation after a multivariate change in selection pressures. In particular, time to flowering is reduced in parallel across islands, substantially increasing fitness. This change is mediated by convergent de novo loss of function of two core flowering time genes: FRI on one island and FLC on the other. Evolutionary reconstructions reveal a case where expansion of the new populations coincided with the emergence and proliferation of these variants, consistent with models of rapid adaptation and evolutionary rescue. Detailing how populations adapted to environmental change is needed to predict future responses, but identifying adaptive variants and detailing their fitness effects is rare. Here, the authors show that parallel loss of FRI and FLC function reduces time to flowering and drives adaptation in a drought prone environment.
Affiliation(s)
- Andrea Fulgione
- Max Planck Institute for Plant Breeding Research, Cologne, Germany; Mathematics and Bioscience, Department of Mathematics and Max F. Perutz Labs, University of Vienna, Vienna, Austria; Vienna Graduate School for Population Genetics, Vienna, Austria
- Célia Neto
- Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Shifa Ansari
- Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Mehmet Göktay
- Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Herculano Dinis
- Parque Natural do Fogo, Direção Nacional do Ambiente, Praia, Santiago, Cabo Verde; Associação Projecto Vitó, São Filipe, Fogo, Cabo Verde
- Nina Döring
- Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Pádraic J Flood
- Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Nora Walden
- Centre for Organismal Studies (COS) Heidelberg, Biodiversity and Plant Systematics, Heidelberg University, Heidelberg, Germany; Biosystematics, Wageningen University, Wageningen, The Netherlands
- Marcus A Koch
- Centre for Organismal Studies (COS) Heidelberg, Biodiversity and Plant Systematics, Heidelberg University, Heidelberg, Germany
- Fabrice Roux
- LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan, France
- Joachim Hermisson
- Mathematics and Bioscience, Department of Mathematics and Max F. Perutz Labs, University of Vienna, Vienna, Austria
- Angela M Hancock
- Max Planck Institute for Plant Breeding Research, Cologne, Germany; Mathematics and Bioscience, Department of Mathematics and Max F. Perutz Labs, University of Vienna, Vienna, Austria
21
Rees J, Andrés A. Inferring human evolutionary history. Science 2022; 375:817-818. [PMID: 35201893 DOI: 10.1126/science.abo0498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Unified genetic genealogy improves our understanding of how humans evolved.
Affiliation(s)
- Jasmin Rees
  - UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, London, UK
  - Genetics and Genomic Medicine Programme, Great Ormond Street Institute of Child Health, University College London, London, UK
- Aida Andrés
  - UCL Genetics Institute, Department of Genetics, Evolution and Environment, University College London, London, UK
  - Genetics and Genomic Medicine Programme, Great Ormond Street Institute of Child Health, University College London, London, UK
22
Wohns AW, Wong Y, Jeffery B, Akbari A, Mallick S, Pinhasi R, Patterson N, Reich D, Kelleher J, McVean G. A unified genealogy of modern and ancient genomes. Science 2022; 375:eabi8264. [PMID: 35201891 PMCID: PMC10027547 DOI: 10.1126/science.abi8264] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Indexed: 01/13/2023]
Abstract
The sequencing of modern and ancient genomes from around the world has revolutionized our understanding of human history and evolution. However, the problem of how best to characterize ancestral relationships from the totality of human genomic variation remains unsolved. Here, we address this challenge with nonparametric methods that enable us to infer a unified genealogy of modern and ancient humans. This compact representation of multiple datasets explores the challenges of missing and erroneous data and uses ancient samples to constrain and date relationships. We demonstrate the power of the method to recover relationships between individuals and populations as well as to identify descendants of ancient samples. Finally, we introduce a simple nonparametric estimator of the geographical location of ancestors that recapitulates key events in human history.
Affiliation(s)
- Anthony Wilder Wohns
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Yan Wong
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Ben Jeffery
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Ali Akbari
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Department of Human Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
  - Department of Genetics, Harvard Medical School; Boston, MA 02115, USA
- Swapan Mallick
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Howard Hughes Medical Institute; Boston, MA 02115, USA
- Ron Pinhasi
  - Department of Evolutionary Anthropology, University of Vienna; 1090 Vienna, Austria
- Nick Patterson
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Department of Human Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
  - Howard Hughes Medical Institute; Boston, MA 02115, USA
  - Department of Genetics, Harvard Medical School; Boston, MA 02115, USA
- David Reich
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Department of Human Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
  - Howard Hughes Medical Institute; Boston, MA 02115, USA
  - Department of Genetics, Harvard Medical School; Boston, MA 02115, USA
- Jerome Kelleher
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Gil McVean
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
  - Corresponding author.
23
Colomer-Vilaplana A, Murga-Moreno J, Canalda-Baltrons A, Inserte C, Soto D, Coronado-Zamora M, Barbadilla A, Casillas S. PopHumanVar: an interactive application for the functional characterization and prioritization of adaptive genomic variants in humans. Nucleic Acids Res 2022; 50:D1069-D1076. [PMID: 34664660 PMCID: PMC8728255 DOI: 10.1093/nar/gkab925] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/14/2021] [Revised: 09/17/2021] [Accepted: 09/28/2021] [Indexed: 12/22/2022] Open
Abstract
Adaptive challenges that humans faced as they expanded across the globe left specific molecular footprints that can be decoded in our today's genomes. Different sets of metrics are used to identify genomic regions that have undergone selection. However, there are fewer methods capable of pinpointing the allele ultimately responsible for this selection. Here, we present PopHumanVar, an interactive online application that is designed to facilitate the exploration and thorough analysis of candidate genomic regions by integrating both functional and population genomics data currently available. PopHumanVar generates useful summary reports of prioritized variants that are putatively causal of recent selective sweeps. It compiles data and graphically represents different layers of information, including natural selection statistics, as well as functional annotations and genealogical estimations of variant age, for biallelic single nucleotide variants (SNVs) of the 1000 Genomes Project phase 3. Specifically, PopHumanVar amasses SNV-based information from GEVA, SnpEFF, GWAS Catalog, ClinVar, RegulomeDB and DisGeNET databases, as well as accurate estimations of iHS, nSL and iSAFE statistics. Notably, PopHumanVar can successfully identify known causal variants of frequently reported candidate selection regions, including EDAR in East-Asians, ACKR1 (DARC) in Africans and LCT/MCM6 in Europeans. PopHumanVar is open and freely available at https://pophumanvar.uab.cat.
Affiliation(s)
- Aina Colomer-Vilaplana
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Jesús Murga-Moreno
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Aleix Canalda-Baltrons
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Clara Inserte
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Daniel Soto
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Marta Coronado-Zamora
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Antonio Barbadilla
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Sònia Casillas
  - Department of Genetics and Microbiology, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
  - Institute of Biotechnology and Biomedicine, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
24
Baumdicker F, Bisschop G, Goldstein D, Gower G, Ragsdale AP, Tsambos G, Zhu S, Eldon B, Ellerman EC, Galloway JG, Gladstein AL, Gorjanc G, Guo B, Jeffery B, Kretzschmar WW, Lohse K, Matschiner M, Nelson D, Pope NS, Quinto-Cortés CD, Rodrigues MF, Saunack K, Sellinger T, Thornton K, van Kemenade H, Wohns AW, Wong Y, Gravel S, Kern AD, Koskela J, Ralph PL, Kelleher J. Efficient ancestry and mutation simulation with msprime 1.0. Genetics 2021; 220:6460344. [PMID: 34897427 PMCID: PMC9176297 DOI: 10.1093/genetics/iyab229] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 09/22/2021] [Accepted: 12/03/2021] [Indexed: 11/13/2022] Open
Abstract
Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.
Affiliation(s)
- Franz Baumdicker
  - Cluster of Excellence "Controlling Microbes to Fight Infections", Mathematical and Computational Population Genetics, University of Tübingen, 72076 Tübingen, Germany
- Gertjan Bisschop
  - Institute of Evolutionary Biology, The University of Edinburgh, EH9 3FL, UK
- Daniel Goldstein
  - Khoury College of Computer Sciences, Northeastern University, MA 02115, USA
  - No affiliation
- Graham Gower
  - Lundbeck GeoGenetics Centre, Globe Institute, University of Copenhagen, 1350 Copenhagen K, Denmark
- Aaron P Ragsdale
  - Department of Integrative Biology, University of Wisconsin-Madison, WI 53706, USA
- Georgia Tsambos
  - Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Victoria, 3010, Australia
- Sha Zhu
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
- Bjarki Eldon
  - Leibniz Institute for Evolution and Biodiversity Science, Museum für Naturkunde Berlin, 10115, Germany
- Jared G Galloway
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA
  - Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA 98102, USA
- Ariella L Gladstein
  - Department of Genetics, University of North Carolina at Chapel Hill, NC 27599-7264, USA
  - Embark Veterinary, Inc., Boston, MA 02111, USA
- Gregor Gorjanc
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, EH25 9RG, UK
- Bing Guo
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Ben Jeffery
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
- Warren W Kretzschmar
  - Center for Hematology and Regenerative Medicine, Karolinska Institute, 141 83 Huddinge, Sweden
- Konrad Lohse
  - Institute of Evolutionary Biology, The University of Edinburgh, EH9 3FL, UK
- Dominic Nelson
  - Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada
- Nathaniel S Pope
  - Department of Entomology, Pennsylvania State University, PA 16802, USA
- Consuelo D Quinto-Cortés
  - National Laboratory of Genomics for Biodiversity (LANGEBIO), Unit of Advanced Genomics, CINVESTAV, Irapuato, Mexico
- Murillo F Rodrigues
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA
- Kumar Saunack
  - IIT Bombay, Powai, Mumbai 400 076, Maharashtra, India
- Thibaut Sellinger
  - Professorship for Population Genetics, Department of Life Science Systems, Technical University of Munich, 85354 Freising, Germany
- Kevin Thornton
  - Ecology and Evolutionary Biology, University of California, Irvine, CA 92697, USA
- Anthony W Wohns
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Yan Wong
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
- Simon Gravel
  - Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada
- Andrew D Kern
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA
- Jere Koskela
  - Department of Statistics, University of Warwick, CV4 7AL, UK
- Peter L Ralph
  - Institute of Ecology and Evolution, Department of Biology, University of Oregon, OR 97403-5289, USA
  - Department of Mathematics, University of Oregon, OR 97403-5289 USA
- Jerome Kelleher
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, OX3 7LF, UK
25
Irving-Pease EK, Muktupavela R, Dannemann M, Racimo F. Quantitative Human Paleogenetics: What can Ancient DNA Tell us About Complex Trait Evolution? Front Genet 2021; 12:703541. [PMID: 34422004 PMCID: PMC8371751 DOI: 10.3389/fgene.2021.703541] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/30/2021] [Accepted: 07/08/2021] [Indexed: 12/13/2022] Open
Abstract
Genetic association data from national biobanks and large-scale association studies have provided new prospects for understanding the genetic evolution of complex traits and diseases in humans. In turn, genomes from ancient human archaeological remains are now easier than ever to obtain, and provide a direct window into changes in frequencies of trait-associated alleles in the past. This has generated a new wave of studies aiming to analyse the genetic component of traits in historic and prehistoric times using ancient DNA, and to determine whether any such traits were subject to natural selection. In humans, however, issues about the portability and robustness of complex trait inference across different populations are particularly concerning when predictions are extended to individuals that died thousands of years ago, and for which little, if any, phenotypic validation is possible. In this review, we discuss the advantages of incorporating ancient genomes into studies of trait-associated variants, the need for models that can better accommodate ancient genomes into quantitative genetic frameworks, and the existing limits to inferences about complex trait evolution, particularly with respect to past populations.
Affiliation(s)
- Evan K. Irving-Pease
  - Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Rasa Muktupavela
  - Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
- Michael Dannemann
  - Center for Genomics, Evolution and Medicine, Institute of Genomics, University of Tartu, Tartu, Estonia
- Fernando Racimo
  - Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
|