1
Kyriazis CC, Robinson JA, Lohmueller KE. Using Computational Simulations to Model Deleterious Variation and Genetic Load in Natural Populations. Am Nat 2023; 202:737-752. [PMID: 38033186 PMCID: PMC10897732 DOI: 10.1086/726736] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 12/02/2023]
Abstract
Deleterious genetic variation is abundant in wild populations, and understanding the ecological and conservation implications of such variation is an area of active research. Genomic methods are increasingly used to quantify the impacts of deleterious variation in natural populations; however, these approaches remain limited by an inability to accurately predict the selective and dominance effects of mutations. Computational simulations of deleterious variation offer a complementary tool that can help overcome these limitations, although such approaches have yet to be widely employed. In this perspective article, we aim to encourage ecological and conservation genomics researchers to adopt greater use of computational simulations to aid in deepening our understanding of deleterious variation in natural populations. We first provide an overview of the components of a simulation of deleterious variation, describing the key parameters involved in such models. Next, we discuss several approaches for validating simulation models. Finally, we compare and validate several recently proposed deleterious mutation models, demonstrating that models based on estimates of selection parameters from experimental systems are biased toward highly deleterious mutations. We describe a new model that is supported by multiple orthogonal lines of evidence and provide example scripts for implementing this model (https://github.com/ckyriazis/simulations_review).
2
Carlberg C. Nutrigenomics in the context of evolution. Redox Biol 2023; 62:102656. [PMID: 36933390 PMCID: PMC10036735 DOI: 10.1016/j.redox.2023.102656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 03/03/2023] [Accepted: 03/03/2023] [Indexed: 03/13/2023] [Open Access]
Abstract
Nutrigenomics describes the interaction between nutrients and our genome. Since the origin of our species most of these nutrient-gene communication pathways have not changed. However, our genome experienced over the past 50,000 years a number of evolutionary pressures, which are based on the migration to new environments concerning geography and climate, the transition from hunter-gatherers to farmers including the zoonotic transfer of many pathogenic microbes and the rather recent change of societies to a preferentially sedentary lifestyle and the dominance of Western diet. Human populations responded to these challenges not only by specific anthropometric adaptations, such as skin color and body stature, but also through diversity in dietary intake and different resistance to complex diseases like the metabolic syndrome, cancer and immune disorders. The genetic basis of this adaptation process has been investigated by whole genome genotyping and sequencing including that of DNA extracted from ancient bones. In addition to genomic changes, also the programming of epigenomes in pre- and postnatal phases of life has an important contribution to the response to environmental changes. Thus, insight into the variation of our (epi)genome in the context of our individual's risk for developing complex diseases, helps to understand the evolutionary basis how and why we become ill. This review will discuss the relation of diet, modern environment and our (epi)genome including aspects of redox biology. This has numerous implications for the interpretation of the risks for disease and their prevention.
Affiliation(s)
- Carsten Carlberg
- Institute of Animal Reproduction and Food Research, Polish Academy of Sciences, ul. Juliana Tuwima 10, PL-10748, Olsztyn, Poland; School of Medicine, Institute of Biomedicine, University of Eastern Finland, FI-70211, Kuopio, Finland.
3
Wainschtein P, Jain D, Zheng Z, Cupples LA, Shadyab AH, McKnight B, Shoemaker BM, Mitchell BD, Psaty BM, Kooperberg C, Liu CT, Albert CM, Roden D, Chasman DI, Darbar D, Lloyd-Jones DM, Arnett DK, Regan EA, Boerwinkle E, Rotter JI, O'Connell JR, Yanek LR, de Andrade M, Allison MA, McDonald MLN, Chung MK, Fornage M, Chami N, Smith NL, Ellinor PT, Vasan RS, Mathias RA, Loos RJF, Rich SS, Lubitz SA, Heckbert SR, Redline S, Guo X, Chen YDI, Laurie CA, Hernandez RD, McGarvey ST, Goddard ME, Laurie CC, North KE, Lange LA, Weir BS, Yengo L, Yang J, Visscher PM. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat Genet 2022; 54:263-273. [PMID: 35256806 PMCID: PMC9119698 DOI: 10.1038/s41588-021-00997-7] [Citation(s) in RCA: 140] [Impact Index Per Article: 70.0] [Received: 05/26/2021] [Accepted: 12/01/2021] [Indexed: 12/20/2022]
Abstract
Analyses of data from genome-wide association studies on unrelated individuals have shown that, for human traits and diseases, approximately one-third to two-thirds of heritability is captured by common SNPs. However, it is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular whether the causal variants are rare, or whether it is overestimated due to bias in inference from pedigree data. Here we estimated heritability for height and body mass index (BMI) from whole-genome sequence data on 25,465 unrelated individuals of European ancestry. The estimated heritability was 0.68 (standard error 0.10) for height and 0.30 (standard error 0.10) for body mass index. Low minor allele frequency variants in low linkage disequilibrium (LD) with neighboring variants were enriched for heritability, to a greater extent for protein-altering variants, consistent with negative selection. Our results imply that rare variants, in particular those in regions of low linkage disequilibrium, are a major source of the still missing heritability of complex traits and disease.
Affiliation(s)
- Pierrick Wainschtein
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia.
- Deepti Jain
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Zhili Zheng
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Framingham Heart Study, Framingham, MA, USA
- Aladdin H Shadyab
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California San Diego, La Jolla, CA, USA
- Barbara McKnight
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Benjamin M Shoemaker
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Braxton D Mitchell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
- Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Ching-Ti Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Christine M Albert
- Harvard Medical School, Boston, MA, USA
- Division of Cardiovascular, Brigham and Women's Hospital, Boston, MA, USA
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Dan Roden
- Departments of Medicine, Pharmacology and Bioinformatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Daniel I Chasman
- Division of Preventive Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Dawood Darbar
- Department of Medicine, University of Illinois-Chicago, Chicago, IL, USA
- Donna K Arnett
- Dean's Office, College of Public Health, University of Kentucky, Lexington, KY, USA
- Eric Boerwinkle
- Health Science Center, University of Texas, Houston, TX, USA
- Jerome I Rotter
- Institute for Translational Genomics and Population Sciences, Department of Pediatrics, Lundquist Institute at Harbor-UCLA Medical Center, Torrance, CA, USA
- Jeffrey R O'Connell
- Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Lisa R Yanek
- Division of General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Mariza de Andrade
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Matthew A Allison
- Department of Family Medicine, University of California San Diego, La Jolla, CA, USA
- Merry-Lynn N McDonald
- Division of Pulmonary, Allergy and Critical Care Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Mina K Chung
- Department of Molecular Cardiology, Cleveland Clinic, Cleveland, OH, USA
- Myriam Fornage
- Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Houston, TX, USA
- Nathalie Chami
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mindich Institute for Child Health and Development, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nicholas L Smith
- Cardiovascular Health Research Unit and Department of Epidemiology, University of Washington, Seattle, WA, USA
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, WA, USA
- Patrick T Ellinor
- Harvard Medical School, Boston, MA, USA
- Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston, MA, USA
- Ramachandran S Vasan
- Framingham Heart Study, Framingham, MA, USA
- Sections of Preventive Medicine and Cardiovascular Medicine, Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Rasika A Mathias
- GeneSTAR Research Program, Divisions of Allergy and Clinical Immunology and General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Ruth J F Loos
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Mindich Institute for Child Health and Development, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Stephen S Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Steven A Lubitz
- Cardiac Arrhythmia Service, Massachusetts General Hospital, Boston, MA, USA
- Cardiovascular Disease Initiative, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Susan R Heckbert
- Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
- Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, WA, USA
- Susan Redline
- Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA, USA
- Division of Sleep Medicine, Harvard Medical School, Boston, MA, USA
- Division of Pulmonary, Critical Care, and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Xiuqing Guo
- Institute for Translational Genomics and Population Sciences, Department of Pediatrics, Lundquist Institute at Harbor-UCLA Medical Center, Torrance, CA, USA
- Y -D Ida Chen
- Institute for Translational Genomics and Population Sciences, Department of Pediatrics, Lundquist Institute at Harbor-UCLA Medical Center, Torrance, CA, USA
- Cecelia A Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada
- Stephen T McGarvey
- International Health Institute, Department of Epidemiology, Brown University School of Public Health, Providence, RI, USA
- Michael E Goddard
- Centre for AgriBioscience, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
- Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, Victoria, Australia
- Cathy C Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Kari E North
- Department of Epidemiology and Carolina Center of Genome Sciences, University of North Carolina, Chapel Hill, NC, USA
- Leslie A Lange
- Department of Medicine, University of Colorado, Aurora, CO, USA
- Bruce S Weir
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Loic Yengo
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Jian Yang
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia.
- School of Life Sciences, Westlake University, Hangzhou Zhejiang, China.
- Peter M Visscher
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia.
- Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia.
4
Sohail M, Izarraras-Gomez A, Ortega-Del Vecchyo D. Populations, Traits, and Their Spatial Structure in Humans. Genome Biol Evol 2021; 13:evab272. [PMID: 34894236 PMCID: PMC8715524 DOI: 10.1093/gbe/evab272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/06/2021] [Indexed: 11/16/2022] [Open Access]
Abstract
The spatial distribution of genetic variants is jointly determined by geography, past demographic processes, natural selection, and its interplay with environmental variation. A fraction of these genetic variants are "causal alleles" that affect the manifestation of a complex trait. The effect exerted by these causal alleles on complex traits can be independent or dependent on the environment. Understanding the evolutionary processes that shape the spatial structure of causal alleles is key to comprehend the spatial distribution of complex traits. Natural selection, past population size changes, range expansions, consanguinity, assortative mating, archaic introgression, admixture, and the environment can alter the frequencies, effect sizes, and heterozygosities of causal alleles. This provides a genetic axis along which complex traits can vary. However, complex traits also vary along biogeographical and sociocultural axes which are often correlated with genetic axes in complex ways. The purpose of this review is to consider these genetic and environmental axes in concert and examine the ways they can help us decipher the variation in complex traits that is visible in humans today. This initiative necessarily implies a discussion of populations, traits, the ability to infer and interpret "genetic" components of complex traits, and how these have been impacted by adaptive events. In this review, we provide a history-aware discussion on these topics using both the recent and more distant past of our academic discipline and its relevant contexts.
Affiliation(s)
- Mashaal Sohail
- Department of Human Genetics, University of Chicago, USA
- Centro de Ciencias Genómicas (CCG), Universidad Nacional Autónoma de México (UNAM), Cuernavaca, Morelos, México
- Alan Izarraras-Gomez
- Laboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH), Universidad Nacional Autónoma de México (UNAM), Juriquilla, Querétaro, México
- Diego Ortega-Del Vecchyo
- Laboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH), Universidad Nacional Autónoma de México (UNAM), Juriquilla, Querétaro, México
5
Dunn CD. The population frequency of human mitochondrial DNA variants is highly dependent upon mutational bias. Biol Open 2021; 10:272468. [PMID: 34643212 PMCID: PMC8565468 DOI: 10.1242/bio.059072] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/30/2021] [Accepted: 10/07/2021] [Indexed: 12/23/2022] [Open Access]
Abstract
Next-generation sequencing can quickly reveal genetic variation potentially linked to heritable disease. As databases encompassing human variation continue to expand, rare variants have been of high interest, since the frequency of a variant is expected to be low if the genetic change leads to a loss of fitness or fecundity. However, the use of variant frequency when seeking genomic changes linked to disease remains very challenging. Here, I explored the role of selection in controlling human variant frequency using the HelixMT database, which encompasses hundreds of thousands of mitochondrial DNA (mtDNA) samples. I found that a substantial number of synonymous substitutions, which have no effect on protein sequence, were never encountered in this large study, while many other synonymous changes are found at very low frequencies. Further analyses of human and mammalian mtDNA datasets indicate that the population frequency of synonymous variants is predominantly determined by mutational biases rather than by strong selection acting upon nucleotide choice. My work has important implications that extend to the interpretation of variant frequency for non-synonymous substitutions.
Affiliation(s)
- Cory D Dunn
- Institute of Biotechnology, University of Helsinki, Helsinki 00014, Finland
6
Lachance J. Beyond Stamp Collecting: Evolutionary and Functional Genomics Advance Our Understanding of Cancer Biology. Cancer Res 2021; 81:1637-1638. [PMID: 34003790 DOI: 10.1158/0008-5472.can-21-0146] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/14/2021] [Accepted: 01/14/2021] [Indexed: 11/16/2022]
Abstract
In this issue of Cancer Research, Emami and colleagues leveraged genetic data from over 200,000 men of European descent to implicate rare alleles that are associated with prostate cancer. However, this study went beyond a simple description of statistical associations between genetic variants and cancer risk. Polygenic risk scores were applied to large cohorts from Kaiser Permanente and the UK Biobank, demonstrating the clinical utility of genetic predictors of disease risk. Furthermore, by placing their results in an evolutionary framework and integrating genetic information with functional data, the authors of this major study were able to bridge the gap between genome-wide association studies and the biological mechanisms underlying prostate cancer risk. See related article by Emami et al., p. 1695.
Affiliation(s)
- Joseph Lachance
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia.
7
Schlieben LD, Prokisch H, Yépez VA. How Machine Learning and Statistical Models Advance Molecular Diagnostics of Rare Disorders Via Analysis of RNA Sequencing Data. Front Mol Biosci 2021; 8:647277. [PMID: 34141720 PMCID: PMC8204083 DOI: 10.3389/fmolb.2021.647277] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/29/2020] [Accepted: 05/10/2021] [Indexed: 12/11/2022] [Open Access]
Abstract
Rare diseases, although individually rare, collectively affect approximately 350 million people worldwide. Currently, nearly 6,000 distinct rare disorders with a known molecular basis have been described, yet establishing a specific diagnosis based on the clinical phenotype is challenging. Increasing integration of whole exome sequencing into routine diagnostics of rare diseases is improving diagnostic rates. Nevertheless, about half of the patients do not receive a genetic diagnosis due to the challenges of variant detection and interpretation. During the last years, RNA sequencing is increasingly used as a complementary diagnostic tool providing functional data. Initially, arbitrary thresholds have been applied to call aberrant expression, aberrant splicing, and mono-allelic expression. With the application of RNA sequencing to search for the molecular diagnosis, the implementation of robust statistical models on normalized read counts allowed for the detection of significant outliers corrected for multiple testing. More recently, machine learning methods have been developed to improve the normalization of RNA sequencing read count data by taking confounders into account. Together the methods have increased the power and sensitivity of detection and interpretation of pathogenic variants, leading to diagnostic rates of 10-35% in rare diseases. In this review, we provide an overview of the methods used for RNA sequencing and illustrate how these can improve the diagnostic yield of rare diseases.
Affiliation(s)
- Lea D. Schlieben
- School of Medicine, Institute of Human Genetics, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
- Holger Prokisch
- School of Medicine, Institute of Human Genetics, Technical University of Munich, Munich, Germany
- Institute of Neurogenomics, Helmholtz Zentrum München, Neuherberg, Germany
- Vicente A. Yépez
- School of Medicine, Institute of Human Genetics, Technical University of Munich, Munich, Germany
- Department of Informatics, Technical University of Munich, Munich, Germany
8
Emami NC, Cavazos TB, Rashkin SR, Cario CL, Graff RE, Tai CG, Mefford JA, Kachuri L, Wan E, Wong S, Aaronson D, Presti J, Habel LA, Shan J, Ranatunga DK, Chao CR, Ghai NR, Jorgenson E, Sakoda LC, Kvale MN, Kwok PY, Schaefer C, Risch N, Hoffmann TJ, Van Den Eeden SK, Witte JS. A Large-Scale Association Study Detects Novel Rare Variants, Risk Genes, Functional Elements, and Polygenic Architecture of Prostate Cancer Susceptibility. Cancer Res 2021; 81:1695-1703. [PMID: 33293427 PMCID: PMC8137514 DOI: 10.1158/0008-5472.can-20-2635] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/08/2020] [Revised: 10/27/2020] [Accepted: 12/02/2020] [Indexed: 11/16/2022]
Abstract
To identify rare variants associated with prostate cancer susceptibility and better characterize the mechanisms and cumulative disease risk associated with common risk variants, we conducted an integrated study of prostate cancer genetic etiology in two cohorts using custom genotyping microarrays, large imputation reference panels, and functional annotation approaches. Specifically, 11,984 men (6,196 prostate cancer cases and 5,788 controls) of European ancestry from Northern California Kaiser Permanente were genotyped and meta-analyzed with 196,269 men of European ancestry (7,917 prostate cancer cases and 188,352 controls) from the UK Biobank. Three novel loci, including two rare variants (European ancestry minor allele frequency < 0.01, at 3p21.31 and 8p12), were significant genome wide in a meta-analysis. Gene-based rare variant tests implicated a known prostate cancer gene (HOXB13), as well as a novel candidate gene (ILDR1), which encodes a receptor highly expressed in prostate tissue and is related to the B7/CD28 family of T-cell immune checkpoint markers. Haplotypic patterns of long-range linkage disequilibrium were observed for rare genetic variants at HOXB13 and other loci, reflecting their evolutionary history. In addition, a polygenic risk score (PRS) of 188 prostate cancer variants was strongly associated with risk (90th vs. 40th-60th percentile OR = 2.62, P = 2.55 × 10⁻¹⁹¹). Many of the 188 variants exhibited functional signatures of gene expression regulation or transcription factor binding, including a 6-fold difference in log-probability of androgen receptor binding at the variant rs2680708 (17q22). Rare variant and PRS associations, with concomitant functional interpretation of risk mechanisms, can help clarify the full genetic architecture of prostate cancer and other complex traits.
SIGNIFICANCE: This study maps the biological relationships between diverse risk factors for prostate cancer, integrating different functional datasets to interpret and model genome-wide data from over 200,000 men with and without prostate cancer. See related commentary by Lachance, p. 1637.
Affiliation(s)
- Nima C Emami
- Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Taylor B Cavazos
- Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
- Sara R Rashkin
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Clinton L Cario
- Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Rebecca E Graff
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Caroline G Tai
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Joel A Mefford
- Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
- Linda Kachuri
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Eunice Wan
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Simon Wong
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- David Aaronson
- Department of Urology, Kaiser Oakland Medical Center, Oakland, California
- Joseph Presti
- Department of Urology, Kaiser Oakland Medical Center, Oakland, California
- Laurel A Habel
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Jun Shan
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Dilrini K Ranatunga
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Chun R Chao
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Nirupa R Ghai
- Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, California
- Eric Jorgenson
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Lori C Sakoda
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Mark N Kvale
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Pui-Yan Kwok
- Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Catherine Schaefer
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Neil Risch
- Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
- Thomas J Hoffmann
- Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Stephen K Van Den Eeden
- Division of Research, Kaiser Permanente Northern California, Oakland, California
- Department of Urology, University of California San Francisco, San Francisco, California
- John S Witte
- Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California.
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Program in Pharmaceutical Sciences and Pharmacogenomics, University of California San Francisco, San Francisco, California
- Institute for Human Genetics, University of California San Francisco, San Francisco, California
- Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, California
- Department of Urology, University of California San Francisco, San Francisco, California
9
Benton ML, Abraham A, LaBella AL, Abbot P, Rokas A, Capra JA. The influence of evolutionary history on human health and disease. Nat Rev Genet 2021; 22:269-283. [PMID: 33408383 PMCID: PMC7787134 DOI: 10.1038/s41576-020-00305-9] [Citation(s) in RCA: 97] [Impact Index Per Article: 32.3] [Accepted: 10/26/2020] [Indexed: 01/29/2023]
Abstract
Nearly all genetic variants that influence disease risk have human-specific origins; however, the systems they influence have ancient roots that often trace back to evolutionary events long before the origin of humans. Here, we review how advances in our understanding of the genetic architectures of diseases, recent human evolution and deep evolutionary history can help explain how and why humans in modern environments become ill. Human populations exhibit differences in the prevalence of many common and rare genetic diseases. These differences are largely the result of the diverse environmental, cultural, demographic and genetic histories of modern human populations. Synthesizing our growing knowledge of evolutionary history with genetic medicine, while accounting for environmental and social factors, will help to achieve the promise of personalized genomics and realize the potential hidden in an individual's DNA sequence to guide clinical decisions. In short, precision medicine is fundamentally evolutionary medicine, and integration of evolutionary perspectives into the clinic will support the realization of its full potential.
Affiliation(s)
- Mary Lauren Benton
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Computer Science, Baylor University, Waco, TX, USA
- Abin Abraham
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
- Vanderbilt University Medical Center, Vanderbilt University, Nashville, TN, USA
- Abigail L. LaBella
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Patrick Abbot
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Antonis Rokas
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- John A. Capra
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
|
10
|
Spear ML, Diaz-Papkovich A, Ziv E, Yracheta JM, Gravel S, Torgerson DG, Hernandez RD. Recent shifts in the genomic ancestry of Mexican Americans may alter the genetic architecture of biomedical traits. eLife 2020; 9:e56029. [PMID: 33372659 PMCID: PMC7771964 DOI: 10.7554/elife.56029] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/13/2020] [Accepted: 12/13/2020] [Indexed: 11/13/2022]
Abstract
People in the Americas represent a diverse continuum of populations with varying degrees of admixture among African, European, and Amerindigenous ancestries. In the United States, populations with non-European ancestry remain understudied, and thus little is known about the genetic architecture of phenotypic variation in these populations. Using genotype data from the Hispanic Community Health Study/Study of Latinos, we find that Amerindigenous ancestry increased by an average of ~20% spanning 1940s-1990s in Mexican Americans. These patterns result from complex interactions between several population and cultural factors which shaped patterns of genetic variation and influenced the genetic architecture of complex traits in Mexican Americans. We show for height how polygenic risk scores based on summary statistics from a European-based genome-wide association study perform poorly in Mexican Americans. Our findings reveal temporal changes in population structure within Hispanics/Latinos that may influence biomedical traits, demonstrating a need to improve our understanding of admixed populations.
Affiliation(s)
- Melissa L Spear
- Biomedical Sciences Graduate Program, University of California, San Francisco, San Francisco, United States
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- McGill Genome Centre, McGill University, Montreal, Canada
- Department of Human Genetics, McGill University, Montreal, Canada
- Alex Diaz-Papkovich
- McGill Genome Centre, McGill University, Montreal, Canada
- Quantitative Life Sciences Program, McGill University, Montreal, Canada
- Elad Ziv
- Division of General Internal Medicine, University of California, San Francisco, San Francisco, United States
- Department of Medicine, University of California, San Francisco, San Francisco, United States
- Institute of Human Genetics, University of California, San Francisco, San Francisco, United States
- Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, United States
- Joseph M Yracheta
- Native BioData Consortium, Eagle Butte, United States
- Bloomberg School of Public Health, Johns Hopkins University, Baltimore, United States
- Simon Gravel
- McGill Genome Centre, McGill University, Montreal, Canada
- Department of Human Genetics, McGill University, Montreal, Canada
- Dara G Torgerson
- McGill Genome Centre, McGill University, Montreal, Canada
- Department of Human Genetics, McGill University, Montreal, Canada
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, United States
- Ryan D Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- McGill Genome Centre, McGill University, Montreal, Canada
- Department of Human Genetics, McGill University, Montreal, Canada
- Institute of Human Genetics, University of California, San Francisco, San Francisco, United States
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, United States
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, United States
|
11
|
Woerner AE, Veeramah KR, Watkins JC, Hammer MF. The Role of Phylogenetically Conserved Elements in Shaping Patterns of Human Genomic Diversity. Mol Biol Evol 2020; 35:2284-2295. [PMID: 30113695 DOI: 10.1093/molbev/msy145] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/19/2023]
Abstract
Evolutionary genetic studies have shown a positive correlation between levels of nucleotide diversity and either rates of recombination or genetic distance to genes. Both positive-directional and purifying selection have been offered as the source of these correlations via genetic hitchhiking and background selection, respectively. Phylogenetically conserved elements (CEs) are short (∼100 bp), widely distributed (comprising ∼5% of genome), sequences that are often found far from genes. While the function of many CEs is unknown, CEs also are associated with reduced diversity at linked sites. Using high coverage (>80×) whole genome data from two human populations, the Yoruba and the CEU, we perform fine scale evaluations of diversity, rates of recombination, and linkage to genes. We find that the local rate of recombination has a stronger effect on levels of diversity than linkage to genes, and that these effects of recombination persist even in regions far from genes. Our whole genome modeling demonstrates that, rather than recombination or GC-biased gene conversion, selection on sites within or linked to CEs better explains the observed genomic diversity patterns. A major implication is that very few sites in the human genome are predicted to be free of the effects of selection. These sites, which we refer to as the human "neutralome," comprise only 1.2% of the autosomes and 5.1% of the X chromosome. Demographic analysis of the neutralome reveals larger population sizes and lower rates of growth for ancestral human populations than inferred by previous analyses.
Affiliation(s)
- August E Woerner
- ARL Division of Biotechnology, University of Arizona, Tucson, AZ
- Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX
- Krishna R Veeramah
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY
- Michael F Hammer
- ARL Division of Biotechnology, University of Arizona, Tucson, AZ
|
12
|
Cuevas HE, Prom LK. Evaluation of genetic diversity, agronomic traits, and anthracnose resistance in the NPGS Sudan Sorghum Core collection. BMC Genomics 2020; 21:88. [PMID: 31992189 PMCID: PMC6988227 DOI: 10.1186/s12864-020-6489-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 09/25/2019] [Accepted: 01/13/2020] [Indexed: 12/15/2022]
Abstract
Background: The United States Department of Agriculture (USDA) National Plant Germplasm System (NPGS) sorghum core collection contains 3011 accessions randomly selected from 77 countries. Genomic and phenotypic characterization of this core collection is necessary to encourage and facilitate its utilization in breeding programs and to improve conservation efforts. In this study, we examined the genome sequences of 318 accessions belonging to the NPGS Sudan sorghum core set, and characterized their agronomic traits and anthracnose resistance response.
Results: We identified 183,144 single nucleotide polymorphisms (SNPs) located within or in proximity of 25,124 annotated genes using the genotyping-by-sequencing (GBS) approach. The core collection was genetically highly diverse, with an average pairwise genetic distance of 0.76 among accessions. Population structure and cluster analysis revealed five ancestral populations within the Sudan core set, with moderate to high level of genetic differentiation. In total, 171 accessions (54%) were assigned to one of these populations, which covered 96% of the total genomic variation. Genome scan based on Tajima’s D values revealed two populations under balancing selection. Phenotypic analysis showed differences in agronomic traits among the populations, suggesting that these populations belong to different ecogeographical regions. A total of 55 accessions were resistant to anthracnose; these accessions could represent multiple resistance sources. Genome-wide association study based on fixed and random model Circulating Probability (farmCPU) identified genomic regions associated with plant height, flowering time, panicle length and diameter, and anthracnose resistance response. Integrated analysis of the Sudan core set and sorghum association panel indicated that a large portion of the genetic variation in the Sudan core set might be present in breeding programs but remains unexploited within some clusters of accessions.
Conclusions: The NPGS Sudan core collection comprises genetically and phenotypically diverse germplasm with multiple anthracnose resistance sources. Population genomic analysis could be used to improve screening efforts and identify the most valuable germplasm for breeding programs. The new GBS data set generated in this study represents a novel genomic resource for plant breeders interested in mining the genetic diversity of the NPGS sorghum collection.
Affiliation(s)
- Hugo E Cuevas
- USDA-ARS, Tropical Agriculture Research Station, 2200 Pedro Albizu Campos Avenue, Mayaguez, 00680, Puerto Rico
- Louis K Prom
- USDA-ARS, Southern Plains Agriculture Research Center, College Station, TX, 77845, USA.
|
13
|
Uricchio LH. Evolutionary perspectives on polygenic selection, missing heritability, and GWAS. Hum Genet 2020; 139:5-21. [PMID: 31201529 PMCID: PMC8059781 DOI: 10.1007/s00439-019-02040-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/29/2018] [Accepted: 06/06/2019] [Indexed: 12/26/2022]
Abstract
Genome-wide association studies (GWAS) have successfully identified many trait-associated variants, but there is still much we do not know about the genetic basis of complex traits. Here, we review recent theoretical and empirical literature regarding selection on complex traits to argue that "missing heritability" is as much an evolutionary problem as it is a statistical problem. We discuss empirical findings that suggest a role for selection in shaping the effect sizes and allele frequencies of causal variation underlying complex traits, and the limitations of these studies. We then use simulations of selection, realistic genome structure, and complex human demography to illustrate the results of recent theoretical work on polygenic selection, and show that statistical inference of causal loci is sharply affected by evolutionary processes. In particular, when selection acts on causal alleles, it hampers the ability to detect causal loci and constrains the transferability of GWAS results across populations. Last, we discuss the implications of these findings for future association studies, and suggest that future statistical methods to infer causal loci for genetic traits will benefit from explicit modeling of the joint distribution of effect sizes and allele frequencies under plausible evolutionary models.
Affiliation(s)
- Lawrence H Uricchio
- Department of Biology, Stanford University, Stanford, CA, USA.
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA, USA.
|
14
|
Tong DMH, Hernandez RD. Population genetic simulation study of power in association testing across genetic architectures and study designs. Genet Epidemiol 2020; 44:90-103. [PMID: 31587362 PMCID: PMC6980249 DOI: 10.1002/gepi.22264] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/20/2019] [Revised: 08/26/2019] [Accepted: 09/16/2019] [Indexed: 12/22/2022]
Abstract
While it is well established that genetics can be a major contributor to population variation of complex traits, the relative contributions of rare and common variants to phenotypic variation remain a matter of considerable debate. Here, we simulate genetic and phenotypic data across different case/control panel sampling strategies, sequencing methods, and genetic architecture models based on evolutionary forces to determine the statistical performance of rare variant association tests (RVATs) widely in use. We find that the highest statistical power of RVATs is achieved by sampling case/control individuals from the extremes of an underlying quantitative trait distribution. We also demonstrate that the use of genotyping arrays, in conjunction with imputation from a whole-genome sequenced (WGS) reference panel, recovers the vast majority (90%) of the power that could be achieved by sequencing the case/control panel using current tools. Finally, we show that for dichotomous traits, the statistical performance of RVATs decreases as rare variants become more important in the trait architecture. Our results extend previous work to show that RVATs are insufficiently powered to make generalizable conclusions about the role of rare variants in dichotomous complex traits.
Affiliation(s)
- Dominic M. H. Tong
- University of California, Berkeley ‐ University of California, San Francisco Graduate Program in Bioengineering, San Francisco, California
- Ryan D. Hernandez
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California
- Department of Human Genetics, McGill University, Montreal, Canada
|
15
|
Hernandez RD, Uricchio LH, Hartman K, Ye C, Dahl A, Zaitlen N. Ultrarare variants drive substantial cis heritability of human gene expression. Nat Genet 2019; 51:1349-1355. [PMID: 31477931 PMCID: PMC6730564 DOI: 10.1038/s41588-019-0487-7] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 12/16/2018] [Accepted: 07/08/2019] [Indexed: 11/09/2022]
Abstract
The vast majority of human mutations have minor allele frequencies under 1%, with the plurality observed only once (that is, 'singletons'). While Mendelian diseases are predominantly caused by rare alleles, their cumulative contribution to complex phenotypes is largely unknown. We develop and rigorously validate an approach to jointly estimate the contribution of all alleles, including singletons, to phenotypic variation. We apply our approach to transcriptional regulation, an intermediate between genetic variation and complex disease. Using whole-genome DNA and lymphoblastoid cell line RNA sequencing data from 360 European individuals, we conservatively estimate that singletons contribute approximately 25% of cis heritability across genes (dwarfing the contributions of other frequencies). The majority (approximately 76%) of singleton heritability derives from ultrarare variants absent from thousands of additional samples. We develop an inference procedure to demonstrate that our results are consistent with pervasive purifying selection shaping the regulatory architecture of most human genes.
Affiliation(s)
- Ryan D Hernandez
- Bioengineering & Therapeutic Sciences, UCSF, San Francisco, CA, USA.
- Institute for Human Genetics, UCSF, San Francisco, CA, USA.
- Institute for Quantitative Biosciences, UCSF, San Francisco, CA, USA.
- Institute for Computational Health Sciences, UCSF, San Francisco, CA, USA.
- Department of Human Genetics, McGill University, Montreal, Quebec, Canada.
- McGill University and the Genome Quebec Innovation Center, Montreal, Quebec, Canada.
- Kevin Hartman
- Biological and Medical Informatics Graduate Program, UCSF, San Francisco, CA, USA
- Chun Ye
- Institute for Human Genetics, UCSF, San Francisco, CA, USA
- Epidemiology & Biostatistics, UCSF, San Francisco, CA, USA
- Andrew Dahl
- Institute for Human Genetics, UCSF, San Francisco, CA, USA
- Institute for Quantitative Biosciences, UCSF, San Francisco, CA, USA
- Noah Zaitlen
- Institute for Human Genetics, UCSF, San Francisco, CA, USA.
- Institute for Quantitative Biosciences, UCSF, San Francisco, CA, USA.
- Department of Medicine Lung Biology Center, UCSF, San Francisco, CA, USA.
|
16
|
Helle E, Córdova-Palomera A, Ojala T, Saha P, Potiny P, Gustafsson S, Ingelsson E, Bamshad M, Nickerson D, Chong JX, Ashley E, Priest JR. Loss of function, missense, and intronic variants in NOTCH1 confer different risks for left ventricular outflow tract obstructive heart defects in two European cohorts. Genet Epidemiol 2019; 43:215-226. [PMID: 30511478 PMCID: PMC6375786 DOI: 10.1002/gepi.22176] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/08/2018] [Revised: 10/03/2018] [Accepted: 10/17/2018] [Indexed: 01/08/2023]
Abstract
Loss of function variants in NOTCH1 cause left ventricular outflow tract obstructive defects (LVOTO). However, the risk conferred by rare and noncoding variants in NOTCH1 for LVOTO remains largely uncharacterized. In a cohort of 49 families affected by hypoplastic left heart syndrome, a severe form of LVOTO, we discovered predicted loss of function NOTCH1 variants in 6% of individuals. Rare or low-frequency missense variants were found in 16% of families. To make a quantitative estimate of the genetic risk posed by variants in NOTCH1 for LVOTO, we studied associations of 400 coding and noncoding variants in NOTCH1 in 1,085 cases and 332,788 controls from the UK Biobank. Two rare intronic variants in strong linkage disequilibrium displayed significant association with risk for LVOTO amongst European-ancestry individuals. This result was replicated in an independent analysis of 210 cases and 68,762 controls of non-European and mixed ancestry. In conclusion, carrying rare predicted loss of function variants in NOTCH1 confers significant risk for LVOTO. In addition, the two intronic variants seem to be associated with an increased risk for these defects. Our approach demonstrates the utility of population-based data sets in quantifying the specific risk of individual variants for disease-related phenotypes.
Affiliation(s)
- Emmi Helle
- Pediatric Research Center, Children's Hospital, University of Helsinki, Helsinki, Finland
- Division of Cardiovascular Medicine, Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA
- Aldo Córdova-Palomera
- Department of Pediatrics, Division of Pediatric Cardiology, Stanford University School of Medicine, Stanford, CA
- Tiina Ojala
- Pediatric Research Center, Children's Hospital, University of Helsinki, Helsinki, Finland
- Priyanka Saha
- Department of Pediatrics, Division of Pediatric Cardiology, Stanford University School of Medicine, Stanford, CA
- Praneetha Potiny
- Department of Pediatrics, Division of Pediatric Cardiology, Stanford University School of Medicine, Stanford, CA
- Stefan Gustafsson
- Department of Medical Sciences, Molecular Epidemiology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Erik Ingelsson
- Division of Cardiovascular Medicine, Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA
- Department of Medical Sciences, Molecular Epidemiology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Michael Bamshad
- Department of Pediatrics, University of Washington, Seattle, WA
- Department of Genome Sciences, University of Washington, Seattle, WA
- Division of Genetic Medicine, Seattle Children's Hospital, Seattle, Washington
- Deborah Nickerson
- Department of Genome Sciences, University of Washington, Seattle, WA
- Jessica X Chong
- Department of Pediatrics, University of Washington, Seattle, WA
- Euan Ashley
- Division of Cardiovascular Medicine, Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA
- James R Priest
- Division of Cardiovascular Medicine, Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA
- Department of Pediatrics, Division of Pediatric Cardiology, Stanford University School of Medicine, Stanford, CA
|
17
|
Prohaska A, Racimo F, Schork AJ, Sikora M, Stern AJ, Ilardo M, Allentoft ME, Folkersen L, Buil A, Moreno-Mayar JV, Korneliussen T, Geschwind D, Ingason A, Werge T, Nielsen R, Willerslev E. Human Disease Variation in the Light of Population Genomics. Cell 2019; 177:115-131. [DOI: 10.1016/j.cell.2019.01.052] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 11/19/2018] [Revised: 01/23/2019] [Accepted: 01/29/2019] [Indexed: 01/25/2023]
|
18
|
Schoech AP, Jordan DM, Loh PR, Gazal S, O'Connor LJ, Balick DJ, Palamara PF, Finucane HK, Sunyaev SR, Price AL. Quantification of frequency-dependent genetic architectures in 25 UK Biobank traits reveals action of negative selection. Nat Commun 2019; 10:790. [PMID: 30770844 PMCID: PMC6377669 DOI: 10.1038/s41467-019-08424-6] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 09/12/2018] [Accepted: 01/09/2019] [Indexed: 02/06/2023]
Abstract
Understanding the role of rare variants is important in elucidating the genetic basis of human disease. Negative selection can cause rare variants to have larger per-allele effect sizes than common variants. Here, we develop a method to estimate the minor allele frequency (MAF) dependence of SNP effect sizes. We use a model in which per-allele effect sizes have variance proportional to [p(1 - p)]^α, where p is the MAF and negative values of α imply larger effect sizes for rare variants. We estimate α for 25 UK Biobank diseases and complex traits. All traits produce negative α estimates, with best-fit mean of -0.38 (s.e. 0.02) across traits. Despite larger rare variant effect sizes, rare variants (MAF < 1%) explain less than 10% of total SNP-heritability for most traits analyzed. Using evolutionary modeling and forward simulations, we validate the α model of MAF-dependent trait effects and assess plausible values of relevant evolutionary parameters.
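The α model in this abstract can be illustrated with a short Python sketch. It is not the authors' method or code: the function names and the two example allele frequencies are hypothetical, and the heritability weight assumes the standard additive expression 2p(1 - p)β² for the variance explained by a single SNP, which the abstract does not state. Under that assumption, a negative α gives rare variants larger per-allele effect variance while their expected contribution per SNP to heritability stays smaller.

```python
def per_allele_variance(p: float, alpha: float) -> float:
    """Relative per-allele effect-size variance under the alpha model: [p(1 - p)]^alpha."""
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency p must be strictly between 0 and 1")
    return (p * (1.0 - p)) ** alpha


def expected_heritability_weight(p: float, alpha: float) -> float:
    """Relative expected variance explained by one SNP.

    Assumes (not from the abstract) that a SNP with frequency p and effect
    beta explains 2p(1 - p) * beta^2, so the expectation is proportional to
    2 * [p(1 - p)]^(1 + alpha).
    """
    return 2.0 * p * (1.0 - p) * per_allele_variance(p, alpha)


ALPHA = -0.38    # best-fit mean across traits reported in the abstract
RARE_MAF = 0.005     # hypothetical rare variant (MAF < 1%)
COMMON_MAF = 0.30    # hypothetical common variant

if __name__ == "__main__":
    for label, maf in (("rare", RARE_MAF), ("common", COMMON_MAF)):
        print(
            label,
            "per-allele variance:", per_allele_variance(maf, ALPHA),
            "heritability weight:", expected_heritability_weight(maf, ALPHA),
        )
```

Setting `alpha` to 0 recovers a model in which per-allele effect variance does not depend on allele frequency.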
Affiliation(s)
- Armin P Schoech
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA.
- Daniel M Jordan
- Charles R. Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Po-Ru Loh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02115, MA, USA
- Steven Gazal
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA
- Luke J O'Connor
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA
- Daniel J Balick
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02115, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, 02115, MA, USA
- Pier F Palamara
- Department of Statistics, University of Oxford, Oxford, OX1 3LB, UK
- Hilary K Finucane
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA
- Shamil R Sunyaev
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, 02115, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, 02115, MA, USA
- Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA.
|
19
|
Uricchio LH, Kitano HC, Gusev A, Zaitlen NA. An evolutionary compass for detecting signals of polygenic selection and mutational bias. Evol Lett 2019; 3:69-79. [PMID: 30788143 PMCID: PMC6369964 DOI: 10.1002/evl3.97] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/11/2018] [Revised: 12/03/2018] [Accepted: 12/10/2018] [Indexed: 12/17/2022]
Abstract
Selection and mutation shape the genetic variation underlying human traits, but the specific evolutionary mechanisms driving complex trait variation are largely unknown. We developed a statistical method that uses polarized genome-wide association study (GWAS) summary statistics from a single population to detect signals of mutational bias and selection. We found evidence for nonneutral signals on variation underlying several traits (body mass index [BMI], schizophrenia, Crohn's disease, educational attainment, and height). We then used simulations that incorporate simultaneous negative and positive selection to show that these signals are consistent with mutational bias and shifts in the fitness-phenotype relationship, but not stabilizing selection or mutational bias alone. We additionally replicate two of our top three signals (BMI and educational attainment) in an external cohort, and show that population stratification may have confounded GWAS summary statistics for height in the GIANT cohort. Our results provide a flexible and powerful framework for evolutionary analysis of complex phenotypes in humans and other species, and offer insights into the evolutionary mechanisms driving variation in human polygenic traits.
Affiliation(s)
- Hugo C. Kitano
- Department of Computer Science, Stanford University, Stanford, CA
- Noah A. Zaitlen
- Department of Medicine, University of California, San Francisco, CA
- Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA
|
20
|
Chong R, Insigne KD, Yao D, Burghard CP, Wang J, Hsiao YHE, Jones EM, Goodman DB, Xiao X, Kosuri S. A Multiplexed Assay for Exon Recognition Reveals that an Unappreciated Fraction of Rare Genetic Variants Cause Large-Effect Splicing Disruptions. Mol Cell 2019; 73:183-194.e8. [PMID: 30503770 PMCID: PMC6599603 DOI: 10.1016/j.molcel.2018.10.037] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 04/12/2018] [Revised: 07/19/2018] [Accepted: 10/23/2018] [Indexed: 11/23/2022]
Abstract
Mutations that lead to splicing defects can have severe consequences on gene function and cause disease. Here, we explore how human genetic variation affects exon recognition by developing a multiplexed functional assay of splicing using Sort-seq (MFASS). We assayed 27,733 variants in the Exome Aggregation Consortium (ExAC) within or adjacent to 2,198 human exons in the MFASS minigene reporter and found that 3.8% (1,050) of variants, most of which are extremely rare, led to large-effect splice-disrupting variants (SDVs). Importantly, we find that 83% of SDVs are located outside of canonical splice sites, are distributed evenly across distinct exonic and intronic regions, and are difficult to predict a priori. Our results indicate extant, rare genetic variants can have large functional effects on splicing at appreciable rates, even outside the context of disease, and MFASS enables their empirical assessment at scale.
Collapse
Affiliation(s)
- Rockie Chong
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kimberly D Insigne
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - David Yao
- Department of Genetics, Stanford University, Stanford, CA 94035, USA
| | - Christina P Burghard
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Jeffrey Wang
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Yun-Hua E Hsiao
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Eric M Jones
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Daniel B Goodman
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Xinshu Xiao
- Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA; Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA; Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA; UCLA-DOE Institute for Genomics and Proteomics, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA 90095, USA.
|
21
|
The arms race between man and Mycobacterium tuberculosis: Time to regroup. Infection, Genetics and Evolution 2018; 66:361-375. [DOI: 10.1016/j.meegid.2017.08.021] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 06/13/2017] [Revised: 08/21/2017] [Accepted: 08/22/2017] [Indexed: 12/12/2022]
|
22
|
Ragsdale AP, Moreau C, Gravel S. Genomic inference using diffusion models and the allele frequency spectrum. Curr Opin Genet Dev 2018; 53:140-147. [PMID: 30366252 DOI: 10.1016/j.gde.2018.10.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/03/2018] [Revised: 09/14/2018] [Accepted: 10/07/2018] [Indexed: 01/25/2023]
Abstract
Evolutionary, biological, and demographic processes together shape observed variation in populations. Understanding how these processes influence variation allows us to infer past demography and the nature of selection in populations. Forward in time models such as the diffusion approximation provide a powerful tool for performing inference based on the distribution of allele frequencies. Here, we discuss recent computational developments and their application to reconstructing human demographic history. Using whole-genome sequence data for 797 French Canadian individuals, we assess the neutrality of synonymous variants and show that selection can bias inferred demography, mutation rates, and distributions of fitness effects. We argue that the simple evolutionary models investigated by Kimura and Ohta still provide important insight into modern genetic research.
Affiliation(s)
- Aaron P Ragsdale
  - Department of Human Genetics, McGill University, Montreal, QC, Canada
- Claudia Moreau
  - Département des Sciences Fondamentales, Université du Québec à Chicoutimi, Chicoutimi, QC, Canada
- Simon Gravel
  - Department of Human Genetics, McGill University, Montreal, QC, Canada.
|
23
|
Abstract
The population of the Mediterranean island of Sardinia has made important contributions to genome-wide association studies of complex disease traits and, based on ancient DNA (aDNA) studies of mainland Europe, Sardinia is hypothesized to be a unique refuge for early Neolithic ancestry. To provide new insights on the genetic history of this flagship population, we analyzed 3,514 whole-genome sequenced individuals from Sardinia. We find Sardinian samples show elevated levels of shared ancestry with Basque individuals, especially samples from the more historically isolated regions of Sardinia. Our analysis also uniquely illuminates how levels of genetic similarity with mainland aDNA samples vary subtly across the island. Together, our results indicate within-island sub-structure and sex-biased processes have substantially impacted the genetic history of Sardinia. These results give new insight into the demography of ancestral Sardinians and help further the understanding of sharing of disease risk alleles between Sardinia and mainland populations.
|
24
|
Xue A, Wu Y, Zhu Z, Zhang F, Kemper KE, Zheng Z, Yengo L, Lloyd-Jones LR, Sidorenko J, Wu Y, McRae AF, Visscher PM, Zeng J, Yang J. Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes. Nat Commun 2018; 9:2941. [PMID: 30054458 PMCID: PMC6063971 DOI: 10.1038/s41467-018-04951-w] [Citation(s) in RCA: 474] [Impact Index Per Article: 79.0] [Received: 03/17/2018] [Accepted: 06/05/2018] [Indexed: 02/06/2023] Open
Abstract
Type 2 diabetes (T2D) is a very common disease in humans. Here we conduct a meta-analysis of genome-wide association studies (GWAS) with ~16 million genetic variants in 62,892 T2D cases and 596,424 controls of European ancestry. We identify 139 common and 4 rare variants associated with T2D, 42 of which (39 common and 3 rare variants) are independent of the known variants. Integration of the gene expression data from blood (n = 14,115 and 2765) with the GWAS results identifies 33 putative functional genes for T2D, 3 of which were targeted by approved drugs. A further integration of DNA methylation (n = 1980) and epigenomic annotation data highlights 3 genes (CAMK1D, TP53INP1, and ATP5G1) with plausible regulatory mechanisms, whereby a genetic variant exerts an effect on T2D through epigenetic regulation of gene expression. Our study uncovers additional loci, proposes putative genetic regulatory mechanisms for T2D, and provides evidence of purifying selection for T2D-associated variants.
Affiliation(s)
- Angli Xue
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Yang Wu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Zhihong Zhu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Futao Zhang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Kathryn E Kemper
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Zhili Zheng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
  - The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, 325027, China
- Loic Yengo
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Luke R Lloyd-Jones
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Julia Sidorenko
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
  - Estonian Genome Center, Institute of Genomics, University of Tartu, Tartu, 51010, Estonia
- Yeda Wu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Allan F McRae
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Peter M Visscher
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, 4072, Australia
- Jian Zeng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia.
- Jian Yang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia.
  - The Eye Hospital, School of Ophthalmology & Optometry, Wenzhou Medical University, Wenzhou, Zhejiang, 325027, China.
  - Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, 4072, Australia.
|
25
|
Torres R, Szpiech ZA, Hernandez RD. Human demographic history has amplified the effects of background selection across the genome. PLoS Genet 2018; 14:e1007387. [PMID: 29912945 PMCID: PMC6056204 DOI: 10.1371/journal.pgen.1007387] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 08/29/2017] [Revised: 07/23/2018] [Accepted: 04/30/2018] [Indexed: 01/22/2023] Open
Abstract
Natural populations often grow, shrink, and migrate over time. Such demographic processes can affect genome-wide levels of genetic diversity. Additionally, genetic variation in functional regions of the genome can be altered by natural selection, which drives adaptive mutations to higher frequencies or purges deleterious ones. Such selective processes affect not only the sites directly under selection but also nearby neutral variation through genetic linkage via processes referred to as genetic hitchhiking in the context of positive selection and background selection (BGS) in the context of purifying selection. While there is extensive literature examining the consequences of selection at linked sites at demographic equilibrium, less is known about how non-equilibrium demographic processes influence the effects of hitchhiking and BGS. Utilizing a global sample of human whole-genome sequences from the Thousand Genomes Project and extensive simulations, we investigate how non-equilibrium demographic processes magnify and dampen the consequences of selection at linked sites across the human genome. When binning the genome by inferred strength of BGS, we observe that, compared to Africans, non-African populations have experienced larger proportional decreases in neutral genetic diversity in strong BGS regions. We replicate these findings in admixed populations by showing that non-African ancestral components of the genome have also been affected more severely in these regions. We attribute these differences to the strong, sustained/recurrent population bottlenecks that non-Africans experienced as they migrated out of Africa and throughout the globe. Furthermore, we observe a strong correlation between FST and the inferred strength of BGS, suggesting a stronger rate of genetic drift. Forward simulations of human demographic history with a model of BGS support these observations. 
Our results show that non-equilibrium demography significantly alters the consequences of selection at linked sites and support the need for more work investigating the dynamic process of multiple evolutionary forces operating in concert.
Affiliation(s)
- Raul Torres
  - Biomedical Sciences Graduate Program, University of California San Francisco, San Francisco, CA, United States of America
- Zachary A. Szpiech
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, United States of America
- Ryan D. Hernandez
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, United States of America
  - Institute for Human Genetics, University of California San Francisco, San Francisco, CA, United States of America
  - Institute for Computational Health Sciences, University of California San Francisco, San Francisco, CA, United States of America
  - Quantitative Biosciences Institute, University of California San Francisco, San Francisco, CA, United States of America
|
26
|
Zeng J, de Vlaming R, Wu Y, Robinson MR, Lloyd-Jones LR, Yengo L, Yap CX, Xue A, Sidorenko J, McRae AF, Powell JE, Montgomery GW, Metspalu A, Esko T, Gibson G, Wray NR, Visscher PM, Yang J. Signatures of negative selection in the genetic architecture of human complex traits. Nat Genet 2018; 50:746-753. [PMID: 29662166 DOI: 10.1038/s41588-018-0101-4] [Citation(s) in RCA: 195] [Impact Index Per Article: 32.5] [Received: 05/31/2017] [Accepted: 03/05/2018] [Indexed: 11/09/2022]
Abstract
We develop a Bayesian mixed linear model that simultaneously estimates single-nucleotide polymorphism (SNP)-based heritability, polygenicity (proportion of SNPs with nonzero effects), and the relationship between SNP effect size and minor allele frequency for complex traits in conventionally unrelated individuals using genome-wide SNP data. We apply the method to 28 complex traits in the UK Biobank data (N = 126,752) and show that on average, 6% of SNPs have nonzero effects, which in total explain 22% of phenotypic variance. We detect significant (P < 0.05/28) signatures of natural selection in the genetic architecture of 23 traits, including reproductive, cardiovascular, and anthropometric traits, as well as educational attainment. The significant estimates of the relationship between effect size and minor allele frequency in complex traits are consistent with a model of negative (or purifying) selection, as confirmed by forward simulation. We conclude that negative selection acts pervasively on the genetic variants associated with human complex traits.
Affiliation(s)
- Jian Zeng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Ronald de Vlaming
  - School of Business and Economics, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  - Erasmus University Rotterdam Institute for Behavior and Biology, Rotterdam, The Netherlands
- Yang Wu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Matthew R Robinson
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
  - Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Luke R Lloyd-Jones
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Loic Yengo
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Chloe X Yap
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Angli Xue
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Julia Sidorenko
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Allan F McRae
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Joseph E Powell
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Grant W Montgomery
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
- Tonu Esko
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Greg Gibson
  - School of Biological Sciences and Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA, USA
- Naomi R Wray
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia
- Peter M Visscher
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia
- Jian Yang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia.
  - Queensland Brain Institute, The University of Queensland, Brisbane, QLD, Australia.
|
27
|
Abstract
Modern molecular genetic datasets, primarily collected to study the biology of human health and disease, can be used to directly measure the action of natural selection and reveal important features of contemporary human evolution. Here we leverage the UK Biobank data to test for the presence of linear and nonlinear natural selection in a contemporary population of the United Kingdom. We obtain phenotypic and genetic evidence consistent with the action of linear/directional selection. Phenotypic evidence suggests that stabilizing selection, which acts to reduce variance in the population without necessarily modifying the population mean, is widespread and relatively weak in comparison with estimates from other species.
|
28
|
Li X, Kim Y, Tsang EK, Davis JR, Damani FN, Chiang C, Hess GT, Zappala Z, Strober BJ, Scott AJ, Li A, Ganna A, Bassik MC, Merker JD, Hall IM, Battle A, Montgomery SB. The impact of rare variation on gene expression across tissues. Nature 2017; 550:239-243. [PMID: 29022581 PMCID: PMC5877409 DOI: 10.1038/nature24267] [Citation(s) in RCA: 159] [Impact Index Per Article: 22.7] [Received: 09/08/2016] [Accepted: 09/13/2017] [Indexed: 12/24/2022]
Abstract
Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles, but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues, but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
Affiliation(s)
- Xin Li
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
- Yungil Kim
  - Department of Computer Science, Johns Hopkins University, Baltimore 21218, Maryland, USA
- Emily K Tsang
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
  - Biomedical Informatics Program, Stanford University, Stanford, California 94305, USA
- Joe R Davis
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Farhan N Damani
  - Department of Computer Science, Johns Hopkins University, Baltimore 21218, Maryland, USA
- Colby Chiang
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Gaelen T Hess
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Zachary Zappala
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Benjamin J Strober
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Alexandra J Scott
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, Missouri 63108, USA
- Amy Li
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Andrea Ganna
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Michael C Bassik
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
- Jason D Merker
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
- Ira M Hall
  - McDonnell Genome Institute, Washington University School of Medicine, St Louis, Missouri 63108, USA
  - Department of Medicine, Washington University School of Medicine, St Louis, Missouri 63110, USA
  - Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110, USA
- Alexis Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore 21218, Maryland, USA
- Stephen B Montgomery
  - Department of Pathology, Stanford University, Stanford, California 94305, USA
  - Department of Genetics, Stanford University, Stanford, California 94305, USA
|
29
|
Yang J, Mezmouk S, Baumgarten A, Buckler ES, Guill KE, McMullen MD, Mumm RH, Ross-Ibarra J. Incomplete dominance of deleterious alleles contributes substantially to trait variation and heterosis in maize. PLoS Genet 2017; 13:e1007019. [PMID: 28953891 PMCID: PMC5633198 DOI: 10.1371/journal.pgen.1007019] [Citation(s) in RCA: 97] [Impact Index Per Article: 13.9] [Received: 12/22/2016] [Revised: 10/09/2017] [Accepted: 09/13/2017] [Indexed: 12/20/2022] Open
Abstract
Deleterious alleles have long been proposed to play an important role in patterning phenotypic variation and are central to commonly held ideas explaining the hybrid vigor observed in the offspring of a cross between two inbred parents. We test these ideas using evolutionary measures of sequence conservation to ask whether incorporating information about putatively deleterious alleles can inform genomic selection (GS) models and improve phenotypic prediction. We measured a number of agronomic traits in both the inbred parents and hybrids of an elite maize partial diallel population and re-sequenced the parents of the population. Inbred elite maize lines vary for more than 350,000 putatively deleterious sites, but show a lower burden of such sites than a comparable set of traditional landraces. Our modeling reveals widespread evidence for incomplete dominance at these loci, and supports theoretical models that more damaging variants are usually more recessive. We identify haplotype blocks using an identity-by-descent (IBD) analysis and perform genomic prediction analyses in which we weigh blocks on the basis of complementation for segregating putatively deleterious variants. Cross-validation results show that incorporating sequence conservation in genomic selection improves prediction accuracy for grain yield and other fitness-related traits as well as heterosis for those traits. Our results provide empirical support for an important role for incomplete dominance of deleterious alleles in explaining heterosis and demonstrate the utility of incorporating functional annotation in phenotypic prediction and plant breeding.
Affiliation(s)
- Jinliang Yang
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
- Sofiane Mezmouk
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
- Edward S. Buckler
  - School of Integrative Plant Sciences, Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York, United States of America
  - Institute for Genomic Diversity, Ithaca, New York, United States of America
  - US Department of Agriculture–Agricultural Research Service, Ithaca, New York, United States of America
- Katherine E. Guill
  - US Department of Agriculture, Agricultural Research Service, Columbia, Missouri, United States of America
- Michael D. McMullen
  - US Department of Agriculture, Agricultural Research Service, Columbia, Missouri, United States of America
  - Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Rita H. Mumm
  - Department of Crop Sciences and the Illinois Plant Breeding Center, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Jeffrey Ross-Ibarra
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
  - Center for Population Biology and Genome Center, University of California, Davis, California, United States of America
|
30
|
Inference of the Distribution of Selection Coefficients for New Nonsynonymous Mutations Using Large Samples. Genetics 2017; 206:345-361. [PMID: 28249985 PMCID: PMC5419480 DOI: 10.1534/genetics.116.197145] [Citation(s) in RCA: 118] [Impact Index Per Article: 16.9] [Received: 10/23/2016] [Accepted: 02/14/2017] [Indexed: 12/23/2022] Open
Abstract
The distribution of fitness effects (DFE) has considerable importance in population genetics. To date, estimates of the DFE come from studies using a small number of individuals. Thus, estimates of the proportion of moderately to strongly deleterious new mutations may be unreliable because such variants are unlikely to be segregating in the data. Additionally, the true functional form of the DFE is unknown, and estimates of the DFE differ significantly between studies. Here we present a flexible and computationally tractable method, called Fit∂a∂i, to estimate the DFE of new mutations using the site frequency spectrum from a large number of individuals. We apply our approach to the frequency spectrum of 1300 Europeans from the Exome Sequencing Project ESP6400 data set, 1298 Danes from the LuCamp data set, and 432 Europeans from the 1000 Genomes Project to estimate the DFE of deleterious nonsynonymous mutations. We infer significantly fewer (0.38-0.84 fold) strongly deleterious mutations with selection coefficient |s| > 0.01 and more (1.24-1.43 fold) weakly deleterious mutations with selection coefficient |s| < 0.001 compared to previous estimates. Furthermore, a DFE that is a mixture distribution of a point mass at neutrality plus a gamma distribution fits better than a gamma distribution in two of the three data sets. Our results suggest that nearly neutral forces play a larger role in human evolution than previously thought.
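As a rough illustration of the mixture DFE mentioned in this abstract (a point mass at neutrality plus a gamma distribution), the sketch below draws selection coefficients from such a mixture and reports the share of strongly (|s| > 0.01) and weakly (|s| < 0.001) deleterious mutations. It is not the Fit∂a∂i method itself, which fits the DFE to a site frequency spectrum; the function names and all parameter values here are hypothetical and chosen only for demonstration.

```python
import random


def sample_dfe(n, p_neutral, shape, scale, seed=0):
    """Draw n selection coefficients from a point-mass-plus-gamma mixture.

    With probability p_neutral a mutation is exactly neutral (s = 0);
    otherwise |s| is gamma-distributed and the mutation is deleterious (s < 0).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        if rng.random() < p_neutral:
            draws.append(0.0)
        else:
            # random.gammavariate takes (shape, scale) in that order.
            draws.append(-rng.gammavariate(shape, scale))
    return draws


def fraction(draws, predicate):
    """Share of draws for which predicate(s) is true."""
    return sum(1 for s in draws if predicate(s)) / len(draws)


# Hypothetical parameter values, not estimates from the paper.
draws = sample_dfe(n=100_000, p_neutral=0.2, shape=0.2, scale=0.05)
neutral = fraction(draws, lambda s: s == 0.0)
strong = fraction(draws, lambda s: abs(s) > 0.01)
weak = fraction(draws, lambda s: abs(s) < 0.001)
print(f"neutral: {neutral:.3f}  |s| > 0.01: {strong:.3f}  |s| < 0.001: {weak:.3f}")
```

Note that the weakly deleterious class as counted here includes the exactly neutral point mass, so raising `p_neutral` shifts weight toward that class without changing the gamma component.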
|
31
|
Rare genetic variants in SMAP1, B3GAT2, and RIMS1 contribute to pediatric venous thromboembolism. Blood 2017; 129:783-790. [DOI: 10.1182/blood-2016-07-728840] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 07/20/2016] [Accepted: 12/19/2016] [Indexed: 12/30/2022] Open
Abstract
Key Points
Our study identified a region on chromosome 6 comprising the genes SMAP1, B3GAT2, and RIMS1 as a novel susceptibility locus for pediatric VTE. Nonsynonymous variants in SMAP1 and RIMS1 are predicted to be deleterious and may influence vesicle processing in blood cells.
|
32
|
A Model of Compound Heterozygous, Loss-of-Function Alleles Is Broadly Consistent with Observations from Complex-Disease GWAS Datasets. PLoS Genet 2017; 13:e1006573. [PMID: 28103232 PMCID: PMC5289629 DOI: 10.1371/journal.pgen.1006573] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 04/18/2016] [Revised: 02/02/2017] [Accepted: 01/05/2017] [Indexed: 12/17/2022] Open
Abstract
The genetic component of complex disease risk in humans remains largely unexplained. A corollary is that the allelic spectrum of genetic variants contributing to complex disease risk is unknown. Theoretical models that relate population genetic processes to the maintenance of genetic variation for quantitative traits may suggest profitable avenues for future experimental design. Here we use forward simulation to model a genomic region evolving under a balance between recurrent deleterious mutation and Gaussian stabilizing selection. We consider multiple genetic and demographic models, and several different methods for identifying genomic regions harboring variants associated with complex disease risk. We demonstrate that the model of gene action, relating genotype to phenotype, has a qualitative effect on several relevant aspects of the population genetic architecture of a complex trait. In particular, the genetic model impacts genetic variance component partitioning across the allele frequency spectrum and the power of statistical tests. Models with partial recessivity closely match the minor allele frequency distribution of significant hits from empirical genome-wide association studies without requiring homozygous effect sizes to be small. We highlight a particular gene-based model of incomplete recessivity that is appealing from first principles. Under that model, deleterious mutations in a genomic region partially fail to complement one another. This model of gene-based recessivity predicts the empirically observed inconsistency between twin- and SNP-based estimates of dominance heritability. Furthermore, this model predicts considerable levels of unexplained variance associated with intralocus epistasis. Our results suggest a need for improved statistical tools for region-based genetic association and heritability estimation.
Gene action determines how mutations affect phenotype. When placed in an evolutionary context, the details of the genotype-to-phenotype model can impact the maintenance of genetic variation for complex traits. Likewise, non-equilibrium demographic history may affect patterns of genetic variation. Here, we explore the impact of genetic model and population growth on the distribution of genetic variance across the allele frequency spectrum underlying risk for a complex disease. Using forward-in-time population genetic simulations, we show that the genetic model has important impacts on the composition of variation for complex disease risk in a population. We explicitly simulate genome-wide association studies (GWAS) and perform heritability estimation on population samples. A particular model of gene-based partial recessivity, based on allelic non-complementation, aligns well with empirical results. This model is congruent with the dominance variance estimates from both SNPs and twins, and the minor allele frequency distribution of GWAS hits.
|
33
|
Gao F, Keinan A. Explosive genetic evidence for explosive human population growth. Curr Opin Genet Dev 2016; 41:130-139. [PMID: 27710906 PMCID: PMC5161661 DOI: 10.1016/j.gde.2016.09.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 06/01/2016] [Revised: 08/26/2016] [Accepted: 09/11/2016] [Indexed: 11/19/2022]
Abstract
The advent of next-generation sequencing technology has allowed the collection of vast amounts of genetic variation data. A recurring discovery from studying larger and larger samples of individuals has been the extreme, previously unexpected, excess of very rare genetic variants, which has been shown to be mostly due to the recent explosive growth of human populations. Here, we review recent literature that inferred recent changes in population size in different human populations and with different methodologies, with many pointing to recent explosive growth, especially in European populations for which more data has been available. We also review the state-of-the-art methods and software for the inference of historical population size changes that led to these discoveries. Finally, we discuss the implications of recent population growth on personalized genomics, on purifying selection in the non-equilibrium state it entails and, as a consequence, on the genetic architecture underlying complex disease and the performance of mapping methods in discovering rare variants that contribute to complex disease risk.
Affiliation(s)
- Feng Gao
  - Department of Biological Statistics and Computational Biology, Ithaca, NY 14850, United States
- Alon Keinan
  - Department of Biological Statistics and Computational Biology, Ithaca, NY 14850, United States
|
34
|
Abstract
The wealth of available genetic information is allowing the reconstruction of human demographic and adaptive history. Demography and purifying selection affect the purge of rare, deleterious mutations from the human population, whereas positive and balancing selection can increase the frequency of advantageous variants, improving survival and reproduction in specific environmental conditions. In this review, I discuss how theoretical and empirical population genetics studies, using both modern and ancient DNA data, are a powerful tool for obtaining new insight into the genetic basis of severe disorders and complex disease phenotypes, rare and common, focusing particularly on infectious disease risk.
Affiliation(s)
- Lluis Quintana-Murci
  - Human Evolutionary Genetics Unit, Department of Genomes & Genetics, Institut Pasteur, Paris, 75015, France
  - Centre National de la Recherche Scientifique, URA3012, Paris, 75015, France
  - Center of Bioinformatics, Biostatistics and Integrative Biology, Institut Pasteur, Paris, 75015, France
|
35
|
Yang J, Bakshi A, Zhu Z, Hemani G, Vinkhuyzen AAE, Lee SH, Robinson MR, Perry JRB, Nolte IM, van Vliet-Ostaptchouk JV, Snieder H, Esko T, Milani L, Mägi R, Metspalu A, Hamsten A, Magnusson PKE, Pedersen NL, Ingelsson E, Soranzo N, Keller MC, Wray NR, Goddard ME, Visscher PM. Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat Genet 2015; 47:1114-20. [PMID: 26323059 PMCID: PMC4589513 DOI: 10.1038/ng.3390] [Citation(s) in RCA: 511] [Impact Index Per Article: 56.8] [Received: 02/10/2015] [Accepted: 07/31/2015] [Indexed: 12/17/2022]
Abstract
We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing data. We demonstrate using simulations based on whole-genome sequencing data that ∼97% and ∼68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ∼17 million imputed variants explain 56% (standard error (s.e.) = 2.3%) of variance for height and 27% (s.e. = 2.5%) of variance for body mass index (BMI), and we find evidence that height- and BMI-associated variants have been under natural selection. Considering the imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60-70% for height and 30-40% for BMI. Therefore, the missing heritability is small for both traits. For further discovery of genes associated with complex traits, a study design with SNP arrays followed by imputation is more cost-effective than whole-genome sequencing at current prices.
Affiliation(s)
- Jian Yang
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
  - University of Queensland Diamantina Institute, Translation Research Institute, Brisbane, Queensland, Australia
- Andrew Bakshi
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- Zhihong Zhu
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- Gibran Hemani
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
  - Medical Research Council (MRC) Integrative Epidemiology Unit (IEU) at the University of Bristol, School of Social and Community Medicine, Bristol, UK
- Anna A E Vinkhuyzen
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- Sang Hong Lee
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
  - School of Environmental and Rural Science, University of New England, Armidale, New South Wales, Australia
- Matthew R Robinson
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- John R B Perry
  - MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Institute of Metabolic Science, Cambridge Biomedical Campus, Cambridge, UK
- Ilja M Nolte
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Jana V van Vliet-Ostaptchouk
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
  - Department of Endocrinology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Harold Snieder
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Tonu Esko
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
  - Division of Endocrinology, Boston Children's Hospital, Cambridge, Massachusetts, USA
  - Program in Medical and Populational Genetics, Broad Institute, Cambridge, Massachusetts, USA
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
- Lili Milani
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Reedik Mägi
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
- Andres Metspalu
  - Estonian Genome Center, University of Tartu, Tartu, Estonia
  - Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Anders Hamsten
  - Cardiovascular Genetics and Genomics Group, Atherosclerosis Research Unit, Department of Medicine Solna, Karolinska Institutet, Stockholm, Sweden
- Patrik K E Magnusson
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Nancy L Pedersen
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Erik Ingelsson
  - Department of Medical Sciences, Molecular Epidemiology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
  - Division of Cardiovascular Medicine, Department of Medicine, Stanford University School of Medicine, Stanford, California, USA
- Nicole Soranzo
  - Department of Human Genetics, Wellcome Trust Sanger Institute, Genome Campus, Hinxton, UK
  - Department of Haematology, University of Cambridge, Cambridge, UK
- Matthew C Keller
  - Department of Psychology and Neuroscience, University of Colorado, Boulder, Colorado, USA
  - Institute for Behavioral Genetics, University of Colorado, Boulder, Colorado, USA
- Naomi R Wray
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- Michael E Goddard
  - Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
  - Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
- Peter M Visscher
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
  - University of Queensland Diamantina Institute, Translation Research Institute, Brisbane, Queensland, Australia
|