351
Gautier M, Klassmann A, Vitalis R. rehh 2.0: a reimplementation of the R package rehh to detect positive selection from haplotype structure. Mol Ecol Resour 2016; 17:78-90. [PMID: 27863062 DOI: 10.1111/1755-0998.12634] [Citation(s) in RCA: 205] [Impact Index Per Article: 22.8] [Received: 08/03/2016] [Revised: 10/29/2016] [Accepted: 10/31/2016] [Indexed: 01/01/2023]
Abstract
Identifying genomic regions with unusually high local haplotype homozygosity represents a powerful strategy to characterize candidate genes responding to natural or artificial positive selection. To that end, statistics measuring the extent of haplotype homozygosity within (e.g. EHH, iHS) and between (Rsb or XP-EHH) populations have been proposed in the literature. The rehh package for R was previously developed to facilitate genome-wide scans of selection, based on the analysis of long-range haplotypes. However, its performance was not sufficient to cope with the growing size of available data sets. Here, we propose a major upgrade of the rehh package, which includes improved processing of the input files, a faster algorithm to enumerate haplotypes, and multithreading. As illustrated with the analysis of large human haplotype data sets, these improvements decrease the computation time by more than one order of magnitude. This new version of rehh will thus make it possible to perform iHS-, Rsb- or XP-EHH-based scans on large data sets. The package rehh 2.0 is available from the CRAN repository (http://cran.r-project.org/web/packages/rehh/index.html) together with help files and a detailed manual.
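As a rough illustration of the within-population statistic named in the abstract above, EHH is commonly defined as the probability that two randomly chosen chromosomes carrying a given core allele are identical over the interval from the core marker to a marker at some distance. The sketch below is plain Python, not the rehh implementation; the function name `ehh`, the 0/1 allele coding and the toy `carriers` data are assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, upto):
    """EHH over markers core..upto (inclusive) for haplotypes sharing a core allele.

    haplotypes: list of equal-length allele sequences, all carrying the same
                allele at index `core` (hypothetical 0/1 coding).
    Returns the fraction of haplotype pairs that are identical over the interval.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes to form a pair")
    lo, hi = sorted((core, upto))
    # Group haplotypes by their allele string over the interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    # Identical pairs within each group, divided by all possible pairs.
    return sum(comb(size, 2) for size in groups.values()) / comb(n, 2)


# Toy data: four chromosomes that all carry allele 1 at marker index 2.
carriers = [
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
]

print(ehh(carriers, 2, 2))  # at the core itself: 1.0
print(ehh(carriers, 2, 0))  # leftwards: groups of 3 and 1 -> 3/6 = 0.5
print(ehh(carriers, 2, 4))  # rightwards: two groups of 2 -> (1 + 1)/6
```

EHH starts at 1 at the core marker and can only decrease as the interval grows, which is why statistics such as iHS integrate it over distance.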
Affiliation(s)
- Mathieu Gautier
- INRA, UMR CBGP, Montferrier-sur-Lez, F-34988, France
- Institut de Biologie Computationnelle, Montpellier, F-34095, France
- Renaud Vitalis
- INRA, UMR CBGP, Montferrier-sur-Lez, F-34988, France
- Institut de Biologie Computationnelle, Montpellier, F-34095, France
352
Abstract
Meiotic recombination in mammals has been shown to largely cluster into hotspots, which are targeted by the chromatin modifier PRDM9. The canid family, including wolves and dogs, has undergone a series of disrupting mutations in this gene, rendering PRDM9 inactive. Given the importance of PRDM9, it is of great interest to learn how its absence in the dog genome affects patterns of recombination placement. We have used genotypes from domestic dog pedigrees to generate sex-specific genetic maps of recombination in this species. On a broad scale, we find that placement of recombination events in dogs is consistent with that in mice and apes, in that the majority of recombination occurs toward the telomeres in males, while female crossing over is more frequent and evenly spread along chromosomes. It has been previously suggested that dog recombination is more uniform in distribution than that of humans; however, we found that recombination in dogs is less uniform than in humans. We examined the distribution of recombination within the genome, and found that recombination is elevated immediately upstream of the transcription start site and around CpG islands, in agreement with previous studies, but that this effect is stronger in male dogs. We also found evidence for positive crossover interference influencing the spacing between recombination events in dogs, as has been observed in other species including humans and mice. Overall, our data suggest that dogs have similar broad-scale properties of recombination to humans, while fine-scale recombination is similar to that of other species lacking PRDM9.
353
Fromer M, Roussos P, Sieberts SK, Johnson JS, Kavanagh DH, Perumal TM, Ruderfer DM, Oh EC, Topol A, Shah HR, Klei LL, Kramer R, Pinto D, Gümüş ZH, Cicek AE, Dang KK, Browne A, Lu C, Xie L, Readhead B, Stahl EA, Xiao J, Parvizi M, Hamamsy T, Fullard JF, Wang YC, Mahajan MC, Derry JMJ, Dudley JT, Hemby SE, Logsdon BA, Talbot K, Raj T, Bennett DA, De Jager PL, Zhu J, Zhang B, Sullivan PF, Chess A, Purcell SM, Shinobu LA, Mangravite LM, Toyoshiba H, Gur RE, Hahn CG, Lewis DA, Haroutunian V, Peters MA, Lipska BK, Buxbaum JD, Schadt EE, Hirai K, Roeder K, Brennand KJ, Katsanis N, Domenici E, Devlin B, Sklar P. Gene expression elucidates functional impact of polygenic risk for schizophrenia. Nat Neurosci 2016; 19:1442-1453. [PMID: 27668389 PMCID: PMC5083142 DOI: 10.1038/nn.4399] [Citation(s) in RCA: 771] [Impact Index Per Article: 85.7] [Received: 05/05/2016] [Accepted: 09/01/2016] [Indexed: 12/15/2022]
Abstract
Over 100 genetic loci harbor schizophrenia-associated variants, yet how these variants confer liability is uncertain. The CommonMind Consortium sequenced RNA from dorsolateral prefrontal cortex of people with schizophrenia (N = 258) and control subjects (N = 279), creating a resource of gene expression and its genetic regulation. Using this resource, ∼20% of schizophrenia loci have variants that could contribute to altered gene expression and liability. In five loci, only a single gene was involved: FURIN, TSNARE1, CNTN4, CLCN3 or SNAP91. Altering expression of FURIN, TSNARE1 or CNTN4 changed neurodevelopment in zebrafish; knockdown of FURIN in human neural progenitor cells yielded abnormal migration. Of 693 genes showing significant case-versus-control differential expression, their fold changes were ≤ 1.33, and an independent cohort yielded similar results. Gene co-expression implicates a network relevant for schizophrenia. Our findings show that schizophrenia is polygenic and highlight the utility of this resource for mechanistic interpretations of genetic liability for brain diseases.
Affiliation(s)
- Menachem Fromer
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Panos Roussos
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Psychiatry, JJ Peters VA Medical Center, Bronx, New York, USA
- Jessica S Johnson
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- David H Kavanagh
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Douglas M Ruderfer
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Edwin C Oh
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA
- Department of Neurology, Duke University, Durham, North Carolina, USA
- Aaron Topol
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Hardik R Shah
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Lambertus L Klei
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Robin Kramer
- Human Brain Collection Core, National Institutes of Health, NIMH, Bethesda, Maryland, USA
- Dalila Pinto
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Zeynep H Gümüş
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- A Ercument Cicek
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Kristen K Dang
- Systems Biology, Sage Bionetworks, Seattle, Washington, USA
- Andrew Browne
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Cong Lu
- Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Lu Xie
- Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Ben Readhead
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Eli A Stahl
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Jianqiu Xiao
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA
- Mahsa Parvizi
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA
- Tymor Hamamsy
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- John F Fullard
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ying-Chih Wang
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Milind C Mahajan
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Joel T Dudley
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Scott E Hemby
- Department of Basic Pharmaceutical Sciences, Fred Wilson School of Pharmacy, High Point University, High Point, North Carolina, USA
- Konrad Talbot
- Department of Neurosurgery, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Towfique Raj
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, Illinois, USA
- Philip L De Jager
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Departments of Neurology and Psychiatry, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Jun Zhu
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Bin Zhang
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Patrick F Sullivan
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Andrew Chess
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Developmental and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Shaun M Purcell
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Leslie A Shinobu
- CNS Drug Discovery Unit, Pharmaceutical Research Division, Takeda Pharmaceutical Company Limited, Fujisawa, Kanagawa, Japan
- Hiroyoshi Toyoshiba
- Integrated Technology Research Laboratories, Pharmaceutical Research Division, Takeda Pharmaceutical Company Limited, Fujisawa, Kanagawa, Japan
- Raquel E Gur
- Neuropsychiatry Section, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Chang-Gyu Hahn
- Neuropsychiatric Signaling Program, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- David A Lewis
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Vahram Haroutunian
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Psychiatry, JJ Peters VA Medical Center, Bronx, New York, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Mette A Peters
- Systems Biology, Sage Bionetworks, Seattle, Washington, USA
- Barbara K Lipska
- Human Brain Collection Core, National Institutes of Health, NIMH, Bethesda, Maryland, USA
- Joseph D Buxbaum
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Eric E Schadt
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Keisuke Hirai
- CNS Drug Discovery Unit, Pharmaceutical Research Division, Takeda Pharmaceutical Company Limited, Fujisawa, Kanagawa, Japan
- Kathryn Roeder
- Department of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Kristen J Brennand
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Nicholas Katsanis
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA
- Department of Cell Biology and Pediatrics, Duke University, Durham, North Carolina, USA
- Enrico Domenici
- Laboratory of Neurogenomic Biomarkers, Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy
- Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA
- Department of Human Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Pamela Sklar
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Institute for Genomics and Multiscale Biology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
354
Mitra I, Tsang K, Ladd-Acosta C, Croen LA, Aldinger KA, Hendren RL, Traglia M, Lavillaureix A, Zaitlen N, Oldham MC, Levitt P, Nelson S, Amaral DG, Herz-Picciotto I, Fallin MD, Weiss LA. Pleiotropic Mechanisms Indicated for Sex Differences in Autism. PLoS Genet 2016; 12:e1006425. [PMID: 27846226 PMCID: PMC5147776 DOI: 10.1371/journal.pgen.1006425] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 06/03/2016] [Accepted: 10/12/2016] [Indexed: 02/07/2023] Open
Abstract
Sexual dimorphism in common disease is pervasive, including a dramatic male preponderance in autism spectrum disorders (ASDs). Potential genetic explanations include a liability threshold model requiring increased polymorphism risk in females, sex-limited X-chromosome contribution, gene-environment interaction driven by differences in hormonal milieu, risk influenced by genes sex-differentially expressed in early brain development, or contribution from general mechanisms of sexual dimorphism shared with secondary sex characteristics. Utilizing a large single nucleotide polymorphism (SNP) dataset, we identify distinct sex-specific genome-wide significant loci. We investigate genetic hypotheses and find no evidence for increased genetic risk load in females, but evidence for sex heterogeneity on the X chromosome, and contribution of sex-heterogeneous SNPs for anthropometric traits to ASD risk. Thus, our results support pleiotropy between secondary sex characteristic determination and ASDs, providing a biological basis for sex differences in ASDs and implicating non-brain-limited mechanisms.
Affiliation(s)
- Ileena Mitra
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, California, United States of America
- Kathryn Tsang
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, California, United States of America
- Christine Ladd-Acosta
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America
- Lisa A. Croen
- Division of Research, Kaiser Permanente Northern California, California, United States of America
- Kimberly A. Aldinger
- Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, Washington, United States of America
- Robert L. Hendren
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, California, United States of America
- Michela Traglia
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, California, United States of America
- Alinoë Lavillaureix
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, California, United States of America
- Université Paris Descartes, Sorbonne Paris Cité, Faculty of Medicine, France
- Noah Zaitlen
- Department of Medicine, University of California, San Francisco, San Francisco, California, United States of America
- Michael C. Oldham
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, California, United States of America
- Pat Levitt
- Program in Developmental Neurogenetics, Institute for the Developing Mind, Children’s Hospital Los Angeles and Department of Pediatrics, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Stanley Nelson
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California, United States of America
- David G. Amaral
- Department of Psychiatry and Behavioral Sciences, Medicine and Medical Investigation of Neurodevelopmental Disorders (M.I.N.D.) Institute, University of California, Davis School of Medicine, Sacramento, California, United States of America
- Irva Herz-Picciotto
- Department of Public Health Sciences and Medicine and Medical Investigation of Neurodevelopmental Disorders (M.I.N.D.) Institute, University of California, Davis School of Medicine, Sacramento, California, United States of America
- M. Daniele Fallin
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America
- Lauren A. Weiss
- Department of Psychiatry and Institute for Human Genetics, University of California, San Francisco, California, United States of America
355
Lent S, Deng X, Cupples LA, Lunetta KL, Liu CT, Zhou Y. Imputing rare variants in families using a two-stage approach. BMC Proc 2016; 10:209-214. [PMID: 27980638 PMCID: PMC5133481 DOI: 10.1186/s12919-016-0032-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Recent focus on studying rare variants makes imputation accuracy of rare variants an important issue. Many approaches have been proposed to increase imputation accuracy among rare variants, from reference panel selection to combinations of existing methods to multistage analyses. We aimed to bring the strengths of these new approaches together with our proposed two-stage imputation for family data. METHODS Our imputation methods were tested on the region from 46.75 Mb to 49.25 Mb on chromosome 3. We did quality control based on the proportion of missing genotypes per variant and individual, leaving 495 individuals with 761 genome-wide association study (GWAS) variants only, 45 with 14,077 sequence variants only, and 419 with both GWAS and sequencing data. All data were prephased using SHAPEIT2 with a duo hidden Markov model algorithm prior to performing imputation. Imputations were performed 100 times, each time masking the sequence data for 1 individual and imputing it from the GWAS data. We used well-imputed genotypes, defined as a probability of greater than 0.9, above 2 different minor allele frequency cutoffs (0.01 and 0.05) from Impute2 as input for Merlin, and compared these results to Impute2 and Merlin separately. The imputed results were evaluated using correlation measurement and the imputation quality score. RESULTS Our method improved imputation accuracy, measured by imputation quality score, for variants with minor allele frequency between 0.01 and 0.40, but failed to improve accuracy for variants with minor allele frequency less than 0.01 when we used a minor allele frequency cutoff of 0.01 for the Impute2 results. In contrast, our 2-stage approach with a minor allele frequency cutoff of 0.05 performed the worst of all methods for variants with minor allele frequency between 0.01 and 0.40. CONCLUSIONS This method gave promising results, but may be further improved by changing the inclusion criteria of Impute2 variants. More analyses are needed on a larger region with different inclusion thresholds to assess the accuracy of this approach.
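The abstract does not specify its "correlation measurement" or how the 0.9 probability threshold was applied, so the following is only a sketch under stated assumptions: accuracy is taken as the squared Pearson correlation between masked true genotype counts (0, 1, 2) and imputed dosages, and a genotype is treated as well imputed when its largest posterior probability exceeds the threshold. The names `squared_correlation` and `is_well_imputed` are hypothetical and do not come from Impute2 or Merlin.

```python
def squared_correlation(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    Returns None when either input has no variation (correlation undefined),
    which happens easily for rare variants in small samples.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    s_tt = sum((t - mean_t) ** 2 for t in true_genotypes)
    s_dd = sum((d - mean_d) ** 2 for d in imputed_dosages)
    s_td = sum((t - mean_t) * (d - mean_d)
               for t, d in zip(true_genotypes, imputed_dosages))
    if s_tt == 0 or s_dd == 0:
        return None
    return s_td ** 2 / (s_tt * s_dd)


def is_well_imputed(genotype_probs, threshold=0.9):
    """Assumed rule: the best-guess genotype has posterior probability > threshold."""
    return max(genotype_probs) > threshold


# Toy masked individuals at one variant: true allele counts vs. imputed dosages.
truth = [0, 1, 2, 0, 1]
dosage = [0.1, 0.9, 1.8, 0.0, 1.2]
print(squared_correlation(truth, dosage))
print(is_well_imputed((0.95, 0.04, 0.01)))  # True
```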
Affiliation(s)
- Samantha Lent
- Department of Biostatistics, Boston University, Boston, MA USA
- Xuan Deng
- Department of Biostatistics, Boston University, Boston, MA USA
- C T Liu
- Department of Biostatistics, Boston University, Boston, MA USA
- Yanhua Zhou
- Department of Biostatistics, Boston University, Boston, MA USA
356
Increasing Generality and Power of Rare-Variant Tests by Utilizing Extended Pedigrees. Am J Hum Genet 2016; 99:846-859. [PMID: 27666371 DOI: 10.1016/j.ajhg.2016.08.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/13/2016] [Accepted: 08/17/2016] [Indexed: 11/24/2022] Open
Abstract
Recently, multiple studies have performed whole-exome or whole-genome sequencing to identify groups of rare variants associated with complex traits and diseases. They have primarily utilized case-control study designs that often require thousands of individuals to reach acceptable statistical power. Family-based studies can be more powerful because a rare variant can be enriched in an extended pedigree and segregate with the phenotype. Although many methods have been proposed for using family data to discover rare variants involved in a disease, a majority of them focus on a specific pedigree structure and are designed to analyze either binary or continuously measured outcomes. In this article, we propose RareIBD, a general and powerful approach to identifying rare variants involved in disease susceptibility. Our method can be applied to large extended families of arbitrary structure, including pedigrees with only affected individuals. The method accommodates both binary and quantitative traits. A series of simulation experiments suggest that RareIBD is a powerful test that outperforms existing approaches. In addition, our method accounts for individuals in top generations, which are not usually genotyped in extended families. In contrast to available statistical tests, RareIBD generates accurate p values even when genetic data from these individuals are missing. We applied RareIBD, as well as other methods, to two extended family datasets generated by different genotyping technologies and representing different ethnicities. The analysis of real data confirmed that RareIBD is the only method that properly controls type I error.
357
Dennis J, Truong V, Aïssi D, Medina-Rivera A, Blankenberg S, Germain M, Lemire M, Antounians L, Civelek M, Schnabel R, Wells P, Wilson MD, Morange PE, Trégouët DA, Gagnon F. Single nucleotide polymorphisms in an intergenic chromosome 2q region associated with tissue factor pathway inhibitor plasma levels and venous thromboembolism. J Thromb Haemost 2016; 14:1960-1970. [PMID: 27490645 PMCID: PMC6544906 DOI: 10.1111/jth.13431] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/10/2016] [Accepted: 07/01/2016] [Indexed: 02/01/2023]
Abstract
Essentials: Tissue factor pathway inhibitor (TFPI) regulates the blood coagulation cascade. We replicated previously reported linkage of TFPI plasma levels to the chromosome 2q region. The putative causal locus, rs62187992, was associated with TFPI plasma levels and thrombosis. rs62187992 was marginally associated with TFPI expression in human aortic endothelial cells. SUMMARY: Background Tissue factor pathway inhibitor (TFPI) regulates fibrin clot formation, and low TFPI plasma levels increase the risk of arterial thromboembolism and venous thromboembolism (VTE). TFPI plasma levels are also heritable, and a previous linkage scan implicated the chromosome 2q region, but no specific genes. Objectives To replicate the finding of the linkage region in an independent sample, and to identify the causal locus. Methods We first performed a linkage analysis of microsatellite markers and TFPI plasma levels in 251 individuals from the F5L Family Study, and replicated the finding of the linkage peak on chromosome 2q (LOD = 3.06). We next defined a follow-up region that included 112 603 single nucleotide polymorphisms (SNPs) under the linkage peak, and meta-analyzed associations between these SNPs and TFPI plasma levels across the F5L Family Study and the Marseille Thrombosis Association (MARTHA) Study, a study of 1033 unrelated VTE patients. SNPs with false discovery rate q-values of < 0.10 were tested for association with TFPI plasma levels in 892 patients with coronary artery disease in the AtheroGene Study. Results and Conclusions One SNP, rs62187992, was associated with TFPI plasma levels in all three samples (β = +0.14 and P = 4.23 × 10⁻⁶ combined; β = +0.16 and P = 0.02 in the F5L Family Study; β = +0.13 and P = 6.3 × 10⁻⁴ in the MARTHA Study; β = +0.17 and P = 0.03 in the AtheroGene Study), and contributed to the linkage peak in the F5L Family Study. rs62187992 was also associated with clinical VTE (odds ratio 0.90, P = 0.03) in the INVENT Consortium of > 7000 cases and their controls, and was marginally associated with TFPI expression (β = +0.19, P = 0.08) in human aortic endothelial cells, a primary site of TFPI synthesis. The biological mechanisms underlying these associations remain to be elucidated.
Affiliation(s)
- J Dennis
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- V Truong
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- D Aïssi
- Sorbonne Universités, UPMC Univ. Paris 06, Paris, France
- INSERM, UMR_S 1166, Paris, France
- ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- A Medina-Rivera
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Santiago de Querétaro, Mexico
- Genetics and Genome Biology, Hospital for Sick Children, Toronto, Ontario, Canada
- S Blankenberg
- Department of General and Interventional Cardiology, University of Hamburg, Hamburg, Germany
- M Germain
- Sorbonne Universités, UPMC Univ. Paris 06, Paris, France
- INSERM, UMR_S 1166, Paris, France
- ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- M Lemire
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- L Antounians
- Genetics and Genome Biology, Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- M Civelek
- Center for Public Health Genomics, Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
- R Schnabel
- Department of General and Interventional Cardiology, University of Hamburg, Hamburg, Germany
- P Wells
- Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- M D Wilson
- Genetics and Genome Biology, Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- P-E Morange
- INSERM, UMR_S 1062, Marseille, France
- Inra, UMR_INRA 1260, Marseille, France
- Aix Marseille Université, Marseille, France
- D-A Trégouët
- Sorbonne Universités, UPMC Univ. Paris 06, Paris, France
- INSERM, UMR_S 1166, Paris, France
- ICAN Institute for Cardiometabolism and Nutrition, Paris, France
- F Gagnon
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
358
Levine AP, Pontikos N, Schiff ER, Jostins L, Speed D, Lovat LB, Barrett JC, Grasberger H, Plagnol V, Segal AW. Genetic Complexity of Crohn's Disease in Two Large Ashkenazi Jewish Families. Gastroenterology 2016; 151:698-709. [PMID: 27373512 PMCID: PMC5643259 DOI: 10.1053/j.gastro.2016.06.040] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Received: 01/25/2016] [Revised: 06/21/2016] [Accepted: 06/27/2016] [Indexed: 12/21/2022]
Abstract
BACKGROUND & AIMS: Crohn's disease (CD) is a highly heritable disease that is particularly common in the Ashkenazi Jewish population. We studied 2 large Ashkenazi Jewish families with a high prevalence of CD in an attempt to identify novel genetic risk variants. METHODS: Ashkenazi Jewish patients with CD and a positive family history were recruited from the University College London Hospital. We used genome-wide, single-nucleotide polymorphism data to assess the burden of common CD-associated risk variants and for linkage analysis. Exome sequencing was performed and rare variants that were predicted to be deleterious and were observed at a high frequency in cases were prioritized. We undertook within-family association analysis after imputation and assessed candidate variants for evidence of association with CD in an independent cohort of Ashkenazi Jewish individuals. We examined the effects of a variant in DUOX2 on hydrogen peroxide production in HEK293 cells. RESULTS: We identified 2 families (1 with >800 members and 1 with >200 members) containing 54 and 26 cases of CD or colitis, respectively. Both families had a significant enrichment of previously described common CD-associated risk variants. No genome-wide significant linkage was observed. Exome sequencing identified candidate variants, including a missense mutation in DUOX2 that impaired its function and a frameshift mutation in CSF2RB that was associated with CD in an independent cohort of Ashkenazi Jewish individuals. CONCLUSIONS: In a study of 2 large Ashkenazi Jewish families with multiple cases of CD, we found the genetic basis of the disease to be complex, with a role for common and rare genetic variants. We identified a frameshift mutation in CSF2RB that was replicated in an independent cohort. These findings show the value of family studies and the importance of the innate immune system in the pathogenesis of CD.
Affiliation(s)
- Adam P. Levine
- Division of Medicine, University College London (UCL), London, United Kingdom
| | - Nikolas Pontikos
- UCL Genetics Institute, University College London (UCL), London, United Kingdom
| | - Elena R. Schiff
- Division of Medicine, University College London (UCL), London, United Kingdom
| | - Luke Jostins
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
| | - Doug Speed
- UCL Genetics Institute, University College London (UCL), London, United Kingdom
| | | | - Laurence B. Lovat
- Department of Surgery and Interventional Science, National Medical Laser Centre, University College London (UCL), London, United Kingdom
| | - Jeffrey C. Barrett
- Medical Genomics, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Helmut Grasberger
- Division of Gastroenterology, University of Michigan Medical School, Ann Arbor, Michigan
| | - Vincent Plagnol
- UCL Genetics Institute, University College London (UCL), London, United Kingdom
| | - Anthony W. Segal
- Division of Medicine, University College London (UCL), London, United Kingdom. Reprint requests: Anthony W. Segal, FRS, Division of Medicine, University College London, Rayne Building, 5 University Street, London, WC1E 6JF, United Kingdom.
| |
|
359
|
Li L, Cheng WY, Glicksberg BS, Gottesman O, Tamler R, Chen R, Bottinger EP, Dudley JT. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Sci Transl Med 2016; 7:311ra174. [PMID: 26511511 DOI: 10.1126/scitranslmed.aaa9364] [Citation(s) in RCA: 301] [Impact Index Per Article: 33.4] [Indexed: 12/16/2022]
Abstract
Type 2 diabetes (T2D) is a heterogeneous complex disease affecting more than 29 million Americans alone with a rising prevalence trending toward steady increases in the coming decades. Thus, there is a pressing clinical need to improve early prevention and clinical management of T2D and its complications. Clinicians have understood that patients who carry the T2D diagnosis have a variety of phenotypes and susceptibilities to diabetes-related complications. We used a precision medicine approach to characterize the complexity of T2D patient populations based on high-dimensional electronic medical records (EMRs) and genotype data from 11,210 individuals. We successfully identified three distinct subgroups of T2D from topology-based patient-patient networks. Subtype 1 was characterized by T2D complications diabetic nephropathy and diabetic retinopathy; subtype 2 was enriched for cancer malignancy and cardiovascular diseases; and subtype 3 was associated most strongly with cardiovascular diseases, neurological diseases, allergies, and HIV infections. We performed a genetic association analysis of the emergent T2D subtypes to identify subtype-specific genetic markers and identified 1279, 1227, and 1338 single-nucleotide polymorphisms (SNPs) that mapped to 425, 322, and 437 unique genes specific to subtypes 1, 2, and 3, respectively. By assessing the human disease-SNP association for each subtype, the enriched phenotypes and biological functions at the gene level for each subtype matched with the disease comorbidities and clinical differences that we identified through EMRs. Our approach demonstrates the utility of applying the precision medicine paradigm in T2D and the promise of extending the approach to the study of other complex, multifactorial diseases.
Affiliation(s)
- Li Li
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 700 Lexington Ave., New York, NY 10065, USA
| | - Wei-Yi Cheng
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 700 Lexington Ave., New York, NY 10065, USA
| | - Benjamin S Glicksberg
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 700 Lexington Ave., New York, NY 10065, USA
| | - Omri Gottesman
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Ronald Tamler
- Division of Endocrinology, Diabetes, and Bone Diseases, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Rong Chen
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 700 Lexington Ave., New York, NY 10065, USA
| | - Erwin P Bottinger
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029, USA
| | - Joel T Dudley
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 700 Lexington Ave., New York, NY 10065, USA. Department of Health Policy and Research, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
| |
|
360
|
Ancestral Origins and Genetic History of Tibetan Highlanders. Am J Hum Genet 2016; 99:580-594. [PMID: 27569548 DOI: 10.1016/j.ajhg.2016.07.002] [Citation(s) in RCA: 140] [Impact Index Per Article: 15.6] [Received: 05/09/2016] [Accepted: 07/01/2016] [Indexed: 12/30/2022] Open
Abstract
The origin of Tibetans remains one of the most contentious puzzles in history, anthropology, and genetics. Analyses of deeply sequenced (30×-60×) genomes of 38 Tibetan highlanders and 39 Han Chinese lowlanders, together with available data on archaic and modern humans, allow us to comprehensively characterize the ancestral makeup of Tibetans and uncover their origins. Non-modern human sequences compose ∼6% of the Tibetan gene pool and form unique haplotypes in some genomic regions, where Denisovan-like, Neanderthal-like, ancient-Siberian-like, and unknown ancestries are entangled and elevated. The shared ancestry of Tibetan-enriched sequences dates back to ∼62,000-38,000 years ago, predating the Last Glacial Maximum (LGM) and representing early colonization of the plateau. Nonetheless, most of the Tibetan gene pool is of modern human origin and diverged from that of Han Chinese ∼15,000 to ∼9,000 years ago, which can be largely attributed to post-LGM arrivals. Analysis of ∼200 contemporary populations showed that Tibetans share ancestry with populations from East Asia (∼82%), Central Asia and Siberia (∼11%), South Asia (∼6%), and western Eurasia and Oceania (∼1%). Our results support that Tibetans arose from a mixture of multiple ancestral gene pools but that their origins are much more complicated and ancient than previously suspected. We provide compelling evidence of the co-existence of Paleolithic and Neolithic ancestries in the Tibetan gene pool, indicating a genetic continuity between pre-historical highland-foragers and present-day Tibetans. In particular, highly differentiated sequences harbored in highlanders' genomes were most likely inherited from pre-LGM settlers of multiple ancestral origins (SUNDer) and maintained in high frequency by natural selection.
|
361
|
Genome-wide scans reveal variants at EDAR predominantly affecting hair straightness in Han Chinese and Uyghur populations. Hum Genet 2016; 135:1279-1286. [PMID: 27487801 DOI: 10.1007/s00439-016-1718-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 06/12/2016] [Accepted: 07/23/2016] [Indexed: 10/21/2022]
Abstract
Hair straightness/curliness is one of the most conspicuous features of human variation and is particularly diverse among populations. A recent genome-wide scan found common variants in the Trichohyalin (TCHH) gene that are associated with hair straightness in Europeans, but different genes might affect this phenotype in other populations. By sampling 2899 Han Chinese, we performed the first genome-wide scan of hair straightness in East Asians, and found EDAR (rs3827760) as the predominant gene (P = 4.67 × 10⁻¹⁶), accounting for 3.66% of the total variance. The candidate gene approach did not find further significant associations, suggesting that hair straightness may be affected by a large number of genes with subtle effects. Notably, genetic variants associated with hair straightness in Europeans are generally low in frequency in Han Chinese, and vice versa. To evaluate the relative contribution of these variants, we performed a second genome-wide scan in 709 samples from the Uyghur, an admixed population with both eastern and western Eurasian ancestries. In Uyghurs, both EDAR (rs3827760: P = 1.92 × 10⁻¹²) and TCHH (rs11803731: P = 1.46 × 10⁻³) are associated with hair straightness, but EDAR (OR 0.415) has a greater effect than TCHH (OR 0.575). We found no significant interaction between EDAR and TCHH (P = 0.645), suggesting that these two genes affect hair straightness through different mechanisms. Furthermore, haplotype analysis indicates that TCHH is not subject to selection. While EDAR is under strong selection in East Asia, it does not appear to be subject to selection after the admixture in Uyghurs. These results suggest that hair straightness is unlikely to be a trait under selection.
|
362
|
Morris DL, Sheng Y, Zhang Y, Wang YF, Zhu Z, Tombleson P, Chen L, Cunninghame Graham DS, Bentham J, Roberts AL, Chen R, Zuo X, Wang T, Wen L, Yang C, Liu L, Yang L, Li F, Huang Y, Yin X, Yang S, Rönnblom L, Fürnrohr BG, Voll RE, Schett G, Costedoat-Chalumeau N, Gaffney PM, Lau YL, Zhang X, Yang W, Cui Y, Vyse TJ. Genome-wide association meta-analysis in Chinese and European individuals identifies ten new loci associated with systemic lupus erythematosus. Nat Genet 2016; 48:940-946. [PMID: 27399966 PMCID: PMC4966635 DOI: 10.1038/ng.3603] [Citation(s) in RCA: 239] [Impact Index Per Article: 26.6] [Received: 12/16/2015] [Accepted: 06/01/2016] [Indexed: 12/14/2022]
Abstract
Systemic lupus erythematosus (SLE; OMIM 152700) is a genetically complex autoimmune disease. Genome-wide association studies (GWASs) have identified more than 50 loci as robustly associated with the disease in single ancestries, but genome-wide transancestral studies have not been conducted. We combined three GWAS data sets from Chinese (1,659 cases and 3,398 controls) and European (4,036 cases and 6,959 controls) populations. A meta-analysis of these studies showed that over half of the published SLE genetic associations are present in both populations. A replication study in Chinese (3,043 cases and 5,074 controls) and European (2,643 cases and 9,032 controls) subjects found ten previously unreported SLE loci. Our study provides further evidence that the majority of genetic risk polymorphisms for SLE are contained within the same regions across both populations. Furthermore, a comparison of risk allele frequencies and genetic risk scores suggested that the increased prevalence of SLE in non-Europeans (including Asians) has a genetic basis.
Affiliation(s)
- David L Morris
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | - Yujun Sheng
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
- Department of Dermatology, China-Japan Friendship Hospital, Beijing, China
| | - Yan Zhang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Yong-Fei Wang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Zhengwei Zhu
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Philip Tombleson
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | - Lingyan Chen
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | | | - James Bentham
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
| | - Amy L Roberts
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | - Ruoyan Chen
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Xianbo Zuo
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Tingyou Wang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Leilei Wen
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Chao Yang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Lu Liu
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Lulu Yang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Feng Li
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Yuanbo Huang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Xianyong Yin
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Sen Yang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Lars Rönnblom
- Department of Medical Sciences, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Barbara G Fürnrohr
- Department of Internal Medicine 3, University of Erlangen-Nuremberg, Erlangen, Germany
- Institute for Clinical Immunology, University of Erlangen-Nuremberg, Erlangen, Germany
- Division of Genetic Epidemiology, Medical University Innsbruck, Innsbruck, Austria
- Division of Biological Chemistry, Medical University Innsbruck, Innsbruck, Austria
| | - Reinhard E Voll
- Department of Internal Medicine 3, University of Erlangen-Nuremberg, Erlangen, Germany
- Institute for Clinical Immunology, University of Erlangen-Nuremberg, Erlangen, Germany
- Department of Rheumatology, University Hospital Freiburg, Freiburg, Germany
- Department of Rheumatology and Clinical Immunology, University Hospital Freiburg, Freiburg, Germany
- Centre for Chronic Immunodeficiency, University Hospital Freiburg, Freiburg, Germany
| | - Georg Schett
- Department of Internal Medicine 3, University of Erlangen-Nuremberg, Erlangen, Germany
- Institute for Clinical Immunology, University of Erlangen-Nuremberg, Erlangen, Germany
| | - Nathalie Costedoat-Chalumeau
- AP-HP, Hôpital Cochin, Centre de référence maladies auto-immunes et systémiques rares, Paris, France
- Université Paris Descartes-Sorbonne Paris Cité, Paris, France
| | - Patrick M Gaffney
- Arthritis and Clinical Immunology Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, USA
| | - Yu Lung Lau
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
- The University of Hong Kong Shenzhen Hospital, Shenzhen, China
| | - Xuejun Zhang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
- Department of Dermatology, Huashan Hospital of Fudan University, Shanghai, China
| | - Wanling Yang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Yong Cui
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
- Department of Dermatology, China-Japan Friendship Hospital, Beijing, China
| | - Timothy J Vyse
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Division of Immunology, Infection and Inflammatory Disease, King's College London, London, UK
| |
|
363
|
A thrifty variant in CREBRF strongly influences body mass index in Samoans. Nat Genet 2016; 48:1049-1054. [PMID: 27455349 PMCID: PMC5069069 DOI: 10.1038/ng.3620] [Citation(s) in RCA: 157] [Impact Index Per Article: 17.4] [Received: 12/07/2015] [Accepted: 06/15/2016] [Indexed: 12/14/2022]
Abstract
Samoans are a unique founder population with a high prevalence of obesity, making them well suited for identifying new genetic contributors to obesity. We conducted a genome-wide association study (GWAS) in 3,072 Samoans, discovered a variant, rs12513649, strongly associated with body mass index (BMI) (P = 5.3 × 10⁻¹⁴), and replicated the association in 2,102 additional Samoans (P = 1.2 × 10⁻⁹). Targeted sequencing identified a strongly associated missense variant, rs373863828 (p.Arg457Gln), in CREBRF (meta P = 1.4 × 10⁻²⁰). Although this variant is extremely rare in other populations, it is common in Samoans (frequency of 0.259), with an effect size much larger than that of any other known common BMI risk variant (1.36-1.45 kg/m² per copy of the risk-associated allele). In comparison to wild-type CREBRF, the Arg457Gln variant when overexpressed selectively decreased energy use and increased fat storage in an adipocyte cell model. These data, in combination with evidence of positive selection of the allele encoding p.Arg457Gln, support a 'thrifty' variant hypothesis as a factor in human obesity.
|
364
|
Schulz CA, Christensson A, Ericson U, Almgren P, Hindy G, Nilsson PM, Struck J, Bergmann A, Melander O, Orho-Melander M. High Level of Fasting Plasma Proenkephalin-A Predicts Deterioration of Kidney Function and Incidence of CKD. J Am Soc Nephrol 2016; 28:291-303. [PMID: 27401687 DOI: 10.1681/asn.2015101177] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 10/28/2015] [Accepted: 05/20/2016] [Indexed: 11/03/2022] Open
Abstract
High levels of proenkephalin-A (pro-ENK) have been associated with decreased eGFR in an acute setting. Here, we examined whether pro-ENK levels predict CKD and decline of renal function in a prospective cohort of 2568 participants without CKD (eGFR >60 ml/min per 1.73 m²) at baseline. During a mean follow-up of 16.6 years, 31.7% of participants developed CKD. Participants with baseline pro-ENK levels in the highest tertile had significantly greater yearly mean decline of eGFR (Ptrend<0.001) and rise of cystatin C (Ptrend=0.01) and creatinine (Ptrend<0.001) levels. Furthermore, compared with participants in the lowest tertile, participants in the highest tertile of baseline pro-ENK concentration had increased CKD incidence (odds ratio, 1.51; 95% confidence interval, 1.18 to 1.94) when adjusted for multiple factors. Adding pro-ENK to a model of conventional risk factors in net reclassification improvement analysis resulted in reclassification of 14.14% of participants. Genome-wide association analysis in 4150 participants of the same cohort revealed the strongest association of pro-ENK levels with rs1012178 near the PENK gene, where the minor T-allele associated with a 0.057 pmol/L higher pro-ENK level per allele (P = 4.67 × 10⁻²¹). Furthermore, the T-allele associated with a 19% increased risk of CKD per allele (P=0.03) and a significant decrease in the instrumental variable estimator for eGFR (P<0.01) in a Mendelian randomization analysis. In conclusion, circulating plasma pro-ENK level predicts incident CKD and may aid in identifying subjects in need of primary preventive regimens. Additionally, the Mendelian randomization analysis suggests a causal relationship between pro-ENK level and deterioration of kidney function over time.
Affiliation(s)
- Christina-Alexandra Schulz
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Anders Christensson
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Ulrika Ericson
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Peter Almgren
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - George Hindy
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Peter M Nilsson
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | | | - Andreas Bergmann
- Sphingotec GmbH, Hennigsdorf, Germany; Waltraut Bergmann Foundation, Hohen Neuendorf, Germany
| | - Olle Melander
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Marju Orho-Melander
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| |
|
365
|
Alston C, Compton A, Formosa L, Strecker V, Oláhová M, Haack T, Smet J, Stouffs K, Diakumis P, Ciara E, Cassiman D, Romain N, Yarham J, He L, De Paepe B, Vanlander A, Seneca S, Feichtinger R, Płoski R, Rokicki D, Pronicka E, Haller R, Van Hove J, Bahlo M, Mayr J, Van Coster R, Prokisch H, Wittig I, Ryan M, Thorburn D, Taylor R. Biallelic Mutations in TMEM126B Cause Severe Complex I Deficiency with a Variable Clinical Phenotype. Am J Hum Genet 2016; 99:217-27. [PMID: 27374774 PMCID: PMC5005451 DOI: 10.1016/j.ajhg.2016.05.021] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 03/04/2016] [Accepted: 05/18/2016] [Indexed: 11/22/2022] Open
Abstract
Complex I deficiency is the most common biochemical phenotype observed in individuals with mitochondrial disease. With 44 structural subunits and over 10 assembly factors, it is unsurprising that complex I deficiency is associated with clinical and genetic heterogeneity. Massively parallel sequencing (MPS) technologies including custom, targeted gene panels or unbiased whole-exome sequencing (WES) are hugely powerful in identifying the underlying genetic defect in a clinical diagnostic setting, yet many individuals remain without a genetic diagnosis. These individuals might harbor mutations in poorly understood or uncharacterized genes, and their diagnosis relies upon characterization of these orphan genes. Complexome profiling recently identified TMEM126B as a component of the mitochondrial complex I assembly complex alongside proteins ACAD9, ECSIT, NDUFAF1, and TIMMDC1. Here, we describe the clinical, biochemical, and molecular findings in six cases of mitochondrial disease from four unrelated families affected by biallelic (c.635G>T [p.Gly212Val] and/or c.401delA [p.Asn134Ilefs∗2]) TMEM126B variants. We provide functional evidence to support the pathogenicity of these TMEM126B variants, including evidence of founder effects for both variants, and establish defects within this gene as a cause of complex I deficiency in association with either pure myopathy in adulthood or, in one individual, a severe multisystem presentation (chronic renal failure and cardiomyopathy) in infancy. Functional experimentation including viral rescue and complexome profiling of subject cell lines has confirmed TMEM126B as the tenth complex I assembly factor associated with human disease and validates the importance of both genome-wide sequencing and proteomic approaches in characterizing disease-associated genes whose physiological roles have been previously undetermined.
|
366
|
Staples J, Witherspoon D, Jorde L, Nickerson D, Below J, Huff C, Huff CD. PADRE: Pedigree-Aware Distant-Relationship Estimation. Am J Hum Genet 2016; 99:154-62. [PMID: 27374771 DOI: 10.1016/j.ajhg.2016.05.020] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/21/2016] [Accepted: 05/16/2016] [Indexed: 10/21/2022] Open
Abstract
Accurate estimation of shared ancestry is an important component of many genetic studies; current prediction tools accurately estimate pairwise genetic relationships up to the ninth degree. Pedigree-aware distant-relationship estimation (PADRE) combines relationship likelihoods generated by estimation of recent shared ancestry (ERSA) with likelihoods from family networks reconstructed by pedigree reconstruction and identification of a maximum unrelated set (PRIMUS), improving the power to detect distant relationships between pedigrees. Using PADRE, we estimated relationships from simulated pedigrees and three extended pedigrees, correctly predicting 20% more fourth- through ninth-degree simulated relationships than when using ERSA alone. By leveraging pedigree information, PADRE can even identify genealogical relationships between individuals who are genetically unrelated. For example, although 95% of 13th-degree relatives are genetically unrelated, in simulations, PADRE correctly predicted 50% of 13th-degree relationships to within one degree of relatedness. The improvement in prediction accuracy was consistent between simulated and actual pedigrees. We also applied PADRE to the HapMap3 CEU samples and report new cryptic relationships and validation of previously described relationships between families. PADRE greatly expands the range of relationships that can be estimated by using genetic data in pedigrees.
Affiliation(s)
| | | | | | | | | | | | - Chad D Huff
- Department of Epidemiology, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA.
| |
|
367
|
Loh PR, Palamara PF, Price AL. Fast and accurate long-range phasing in a UK Biobank cohort. Nat Genet 2016; 48:811-6. [PMID: 27270109 PMCID: PMC4925291 DOI: 10.1038/ng.3571] [Citation(s) in RCA: 221] [Impact Index Per Article: 24.6] [Received: 09/03/2015] [Accepted: 04/22/2016] [Indexed: 01/01/2023]
Abstract
Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4-cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N ≈ 150,000 samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1-2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈ 0.3%, corresponding to perfect phase in a majority of 10-Mb segments). We also observed that, when used within an imputation pipeline, Eagle prephasing improved downstream imputation accuracy in comparison to prephasing in batches using existing methods, as necessary to achieve comparable computational cost.
Affiliation(s)
- Po-Ru Loh
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Pier Francesco Palamara
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
| |
|
368
|
Utsunomiya YT, Milanesi M, Utsunomiya ATH, Ajmone-Marsan P, Garcia JF. GHap: an R package for genome-wide haplotyping. Bioinformatics 2016; 32:2861-2. [DOI: 10.1093/bioinformatics/btw356] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 02/12/2016] [Accepted: 05/31/2016] [Indexed: 11/13/2022] Open
|
369
|
Abstract
The UK Biobank (UKB) has recently released genotypes on 152,328 individuals together with extensive phenotypic and lifestyle information. We present a new phasing method, SHAPEIT3, that can handle such biobank-scale datasets and results in switch error rates as low as ~0.3%. The method exhibits O(N log N) scaling in sample size (N), enabling fast and accurate phasing of even larger cohorts.
|
370
|
Adhikari K, Fuentes-Guajardo M, Quinto-Sánchez M, Mendoza-Revilla J, Camilo Chacón-Duque J, Acuña-Alonzo V, Jaramillo C, Arias W, Lozano RB, Pérez GM, Gómez-Valdés J, Villamil-Ramírez H, Hunemeier T, Ramallo V, Silva de Cerqueira CC, Hurtado M, Villegas V, Granja V, Gallo C, Poletti G, Schuler-Faccini L, Salzano FM, Bortolini MC, Canizales-Quinteros S, Cheeseman M, Rosique J, Bedoya G, Rothhammer F, Headon D, González-José R, Balding D, Ruiz-Linares A. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation. Nat Commun 2016; 7:11616. [PMID: 27193062 PMCID: PMC4874031 DOI: 10.1038/ncomms11616] [Citation(s) in RCA: 126] [Impact Index Per Article: 14.0] [Received: 07/03/2015] [Accepted: 04/14/2016] [Indexed: 12/28/2022]
Abstract
We report a genome-wide association scan for facial features in ∼6,000 Latin Americans. We evaluated 14 traits on an ordinal scale and found significant association (P values < 5 × 10⁻⁸) at single-nucleotide polymorphisms (SNPs) in four genomic regions for three nose-related traits: columella inclination (4q31), nose bridge breadth (6p21) and nose wing breadth (7p13 and 20p11). In a subsample of ∼3,000 individuals we obtained quantitative traits related to 9 of the ordinal phenotypes and, also, a measure of nasion position. Quantitative analyses confirmed the ordinal-based associations, identified SNPs in 2q12 associated with chin protrusion, and replicated the reported association of nasion position with SNPs in PAX3. Strongest association in 2q12, 4q31, 6p21 and 7p13 was observed for SNPs in the EDAR, DCHS2, RUNX2 and GLI3 genes, respectively. Associated SNPs in 20p11 extend to PAX1. Consistent with the effect of EDAR on chin protrusion, we documented alterations of mandible length in mice with modified Edar function. Humans show great diversity in facial appearance and this variation is highly heritable. Here, Andres Ruiz-Linares and colleagues examine facial features in admixed Latin Americans and identify genome-wide associations for 14 facial traits, including four gene loci (RUNX2, GLI3, DCHS2 and PAX1) influencing nose morphology.
Affiliation(s)
- Kaustubh Adhikari
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Macarena Fuentes-Guajardo
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK.,Departamento de Tecnología Médica, Facultad de Ciencias de la Salud, Universidad de Tarapacá, Arica 1000009, Chile
| | - Mirsha Quinto-Sánchez
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina
| | - Javier Mendoza-Revilla
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK.,Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Juan Camilo Chacón-Duque
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Victor Acuña-Alonzo
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK.,Laboratorio de Genética Molecular, Escuela Nacional de Antropologia e Historia, México City 14030, México
| | - Claudia Jaramillo
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - William Arias
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - Rodrigo Barquera Lozano
- Laboratorio de Genética Molecular, Escuela Nacional de Antropologia e Historia, México City 14030, México.,Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Gastón Macín Pérez
- Laboratorio de Genética Molecular, Escuela Nacional de Antropologia e Historia, México City 14030, México.,Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Jorge Gómez-Valdés
- Departamento de Anatomía, Facultad de Medicina, Universidad Nacional Autónoma de México (UNAM), México City 04510, México
| | - Hugo Villamil-Ramírez
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Tábita Hunemeier
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Virginia Ramallo
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina.,Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Caio C Silva de Cerqueira
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina.,Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Malena Hurtado
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Valeria Villegas
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Vanessa Granja
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Carla Gallo
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Giovanni Poletti
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Lavinia Schuler-Faccini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Francisco M Salzano
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Maria-Cátira Bortolini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Samuel Canizales-Quinteros
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Michael Cheeseman
- Division of Developmental Biology, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | - Javier Rosique
- Departamento de Antropología, Universidad de Antioquia, Medellín 5001000, Colombia
| | - Gabriel Bedoya
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | | | - Denis Headon
- Division of Developmental Biology, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | - Rolando González-José
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina
| | - David Balding
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK.,Schools of BioSciences and Mathematics and Statistics, University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Andrés Ruiz-Linares
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
| |
|
371
|
Cook JP, Morris AP. Multi-ethnic genome-wide association study identifies novel locus for type 2 diabetes susceptibility. Eur J Hum Genet 2016; 24:1175-80. [PMID: 27189021 PMCID: PMC4947384 DOI: 10.1038/ejhg.2016.17] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 09/11/2015] [Revised: 12/21/2015] [Accepted: 02/01/2016] [Indexed: 12/16/2022]
Abstract
Genome-wide association studies (GWAS) have traditionally been undertaken in homogeneous populations from the same ancestry group. However, with the increasing availability of GWAS in large-scale multi-ethnic cohorts, we have evaluated a framework for detecting association of genetic variants with complex traits, allowing for population structure, and developed a powerful test of heterogeneity in allelic effects between ancestry groups. We have applied the methodology to identify and characterise loci associated with susceptibility to type 2 diabetes (T2D) using GWAS data from the Resource for Genetic Epidemiology on Adult Health and Aging, a large multi-ethnic population-based cohort, created for investigating the genetic and environmental basis of age-related diseases. We identified a novel locus for T2D susceptibility at genome-wide significance (P < 5 × 10⁻⁸) that maps to TOMM40-APOE, a region previously implicated in lipid metabolism and Alzheimer's disease. We have also confirmed previous reports that single-nucleotide polymorphisms at the TCF7L2 locus demonstrate the greatest extent of heterogeneity in allelic effects between ethnic groups, with the lowest risk observed in populations of East Asian ancestry.
Affiliation(s)
- James P Cook
- Department of Biostatistics, University of Liverpool, Liverpool, UK
| | - Andrew P Morris
- Department of Biostatistics, University of Liverpool, Liverpool, UK.,Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
| |
|
372
|
Multiallelic copy number variation in the complement component 4A (C4A) gene is associated with late-stage age-related macular degeneration (AMD). J Neuroinflammation 2016; 13:81. [PMID: 27090374 PMCID: PMC4835888 DOI: 10.1186/s12974-016-0548-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 01/15/2016] [Accepted: 04/11/2016] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Age-related macular degeneration (AMD), which has a strong genetic component, is the leading cause of vision loss in Western societies. Candidate gene studies as well as genome-wide association studies have strongly implicated genetic variation in complement genes in disease risk. So far, no association of AMD with complement component 4 (C4) has been reported, probably due to the complex nature of the C4 locus on chromosome 6. METHODS: We used multiplex ligation-dependent probe amplification (MLPA) to determine the copy number of the C4 gene as well as of both relevant isoforms, C4A and C4B, and assessed their association with AMD using logistic regression models. RESULTS: We report on the analysis of 2645 individuals (1536 probands and 1109 unaffected controls), across three different centers, for multiallelic copy number variation (CNV) at the C4 locus. We find a statistically significant association of increased C4A copy number with AMD (OR 0.81 (0.73; 0.89); P = 4.4 × 10⁻⁵), with the effect most pronounced in individuals over 78 years (OR 0.67 (0.55; 0.81)) and females (OR 0.77 (0.68; 0.87)). Furthermore, this association is independent of known AMD-associated risk variants in the nearby CFB/C2 locus, particularly in females and in individuals over 78 years. CONCLUSIONS: Our data strengthen the notion that complement dysregulation plays a crucial role in AMD etiology, an important finding for early intervention strategies and future therapeutics. In addition, for the first time, we provide evidence that multiallelic CNVs are associated with AMD pathology.
|
373
|
Transcript Isoform Variation Associated with Cytosine Modification in Human Lymphoblastoid Cell Lines. Genetics 2016; 203:985-95. [PMID: 27029734 DOI: 10.1534/genetics.115.185504] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/04/2015] [Accepted: 03/27/2016] [Indexed: 11/18/2022]
Abstract
Cytosine modification on DNA is variable among individuals, which could correlate with gene expression variation. The effect of cytosine modification on interindividual transcript isoform variation (TIV), however, remains unclear. In this study, we assessed the extent of cytosine modification-specific TIV in lymphoblastoid cell lines (LCLs) derived from unrelated individuals of European and African descent. Our study detected cytosine modification-specific TIVs for 17% of the analyzed genes at a 5% false discovery rate. Forty-five percent of the TIV-associated cytosine modifications correlated with the overall gene expression levels as well, with the corresponding CpG sites overrepresented in transcript initiation sites, transcription factor binding sites, and distinct histone modification peaks, suggesting that alternative isoform transcription underlies the TIVs. Our analysis also revealed 33% of the TIV-associated cytosine modifications that affected specific exons, with the corresponding CpG sites overrepresented in exon/intron junctions, splicing branching points, and transcript termination sites, implying that the TIVs are attributable to alternative splicing or transcription termination. Genetic and epigenetic regulation of TIV shared target preference but exerted independent effects on 61% of the common exon targets. Cytosine modification-specific TIVs detected from LCLs were differentially enriched in those detected from various tissues in The Cancer Genome Atlas, indicating their developmental dependency. Genes containing cytosine modification-specific TIVs were enriched in pathways of cancers and metabolic disorders. Our study demonstrated a prominent effect of cytosine modification variation on the transcript isoform spectrum over gross transcript abundance and revealed epigenetic contributions to diseases that were mediated through cytosine modification-specific TIV.
|
374
|
Bukowicki M, Franssen SU, Schlötterer C. High rates of phasing errors in highly polymorphic species with low levels of linkage disequilibrium. Mol Ecol Resour 2016; 16:874-82. [PMID: 26929272 DOI: 10.1111/1755-0998.12516] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 10/20/2015] [Revised: 01/27/2016] [Accepted: 02/08/2016] [Indexed: 12/01/2022]
Abstract
Short read sequencing of diploid individuals does not permit the direct inference of the sequence on each of the two homologous chromosomes. Although various phasing software packages exist, they were primarily tailored for and tested on human data, which differ from other species in factors that influence phasing, such as SNP density, amounts of linkage disequilibrium (LD) and sample sizes. Despite becoming increasingly popular for other species, the reliability of phasing in non-human data has not been evaluated to a sufficient extent. We scrutinized the phasing accuracy for Drosophila melanogaster, a species with high polymorphism levels and reduced LD relative to humans. We phased two D. melanogaster populations and compared the results to the known haplotypes. The performance increased with size of the reference panel and was highest when the reference panel and phased individuals were from the same population. Full genomic SNP data and inclusion of sequence read information also improved phasing. Despite humans and Drosophila having similar switch error rates between polymorphic sites, the distances between switch errors were much shorter in Drosophila with only fragments <300-1500 bp being correctly phased with ≥95% confidence. This suggests that the higher SNP density cannot compensate for the higher recombination rate in D. melanogaster. Furthermore, we show that populations that have gone through demographic events such as bottlenecks can be phased with higher accuracy. Our results highlight that statistically phased data are particularly error prone in species with large population sizes or populations lacking suitable reference panels.
Affiliation(s)
- Marek Bukowicki
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Veterinärplatz 1, Austria
| | - Susanne U Franssen
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Veterinärplatz 1, Austria
| | - Christian Schlötterer
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Veterinärplatz 1, Austria
| |
|
375
|
Boitard S, Rodríguez W, Jay F, Mona S, Austerlitz F. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach. PLoS Genet 2016; 12:e1005877. [PMID: 26943927 PMCID: PMC4778914 DOI: 10.1371/journal.pgen.1005877] [Citation(s) in RCA: 102] [Impact Index Per Article: 11.3] [Received: 03/31/2015] [Accepted: 01/27/2016] [Indexed: 12/02/2022]
Abstract
Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles. Molecular data sampled from extant individuals contains considerable information about their demographic history. In particular, one classical question in population genetics is to reconstruct past population size changes from such data. 
Relating these changes to various climatic, geological or anthropogenic events allows characterizing the main factors driving genetic diversity and can have major outcomes for conservation. Until recently, mostly very simple histories, including one or two population size changes, could be estimated from genetic data. This has changed with the sequencing of entire genomes in many species, and several methods now allow the inference of complex histories consisting of several tens of population size changes. However, analyzing entire genomes, while accounting for recombination, remains a statistical and numerical challenge. These methods, therefore, can only be applied to small samples with a few diploid genomes. We overcome this limitation by using an approximate estimation approach, where observed genomes are summarized using a small number of statistics related to allele frequencies and linkage disequilibrium. In contrast to previous approaches, we show that our method also allows us to reconstruct the most recent part (the last 100 generations) of the population size history. As an illustration, we apply it to large samples of whole-genome sequences in four cattle breeds.
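One of the two summary statistics named in this abstract, the folded allele frequency spectrum, can be sketched in a few lines. This is not the PopSizeABC implementation; it is a minimal illustration that assumes unphased genotypes coded as 0, 1 or 2 copies of an arbitrary reference-independent allele per diploid individual, with no missing data. The function name `folded_sfs` and the input layout are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def folded_sfs(genotypes: Sequence[Sequence[int]]) -> List[int]:
    """Count SNPs by minor-allele count (1 .. n individuals).

    genotypes: one row per SNP, one 0/1/2 entry per diploid individual
    (an assumed layout; missing data is not handled in this sketch).
    """
    if not genotypes:
        return []
    n_ind = len(genotypes[0])
    n_chrom = 2 * n_ind
    counts: Counter = Counter()
    for snp in genotypes:
        alt = sum(snp)
        # Folding: without knowing the ancestral allele, use the rarer one.
        minor = min(alt, n_chrom - alt)
        if minor > 0:  # skip monomorphic sites
            counts[minor] += 1
    return [counts[k] for k in range(1, n_ind + 1)]


example = [
    [0, 1, 0],  # minor allele count 1
    [2, 2, 1],  # 5 copies of one allele, so minor count 1
    [1, 1, 1],  # minor count 3
    [0, 0, 0],  # monomorphic, ignored
    [2, 1, 0],  # minor count 3
]
spectrum = folded_sfs(example)
```

Taking the minimum of the two allele counts is what makes the spectrum usable on unpolarized data, and summing genotype codes per SNP is why no phasing is required.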
Affiliation(s)
- Simon Boitard
- Institut de Systématique, Évolution, Biodiversité ISYEB - UMR 7205 - CNRS & MNHN & UPMC & EPHE, Ecole Pratique des Hautes Etudes, Sorbonne Universités, Paris, France
- GABI, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| | - Willy Rodríguez
- UMR CNRS 5219, Institut de Mathématiques de Toulouse, Université de Toulouse, Toulouse, France
| | - Flora Jay
- UMR 7206 Eco-anthropologie et Ethnobiologie, Muséum National d’Histoire Naturelle, CNRS, Université Paris Diderot, Paris, France
- LRI, Paris-Sud University, CNRS UMR 8623, Orsay, France
| | - Stefano Mona
- Institut de Systématique, Évolution, Biodiversité ISYEB - UMR 7205 - CNRS & MNHN & UPMC & EPHE, Ecole Pratique des Hautes Etudes, Sorbonne Universités, Paris, France
| | - Frédéric Austerlitz
- UMR 7206 Eco-anthropologie et Ethnobiologie, Muséum National d’Histoire Naturelle, CNRS, Université Paris Diderot, Paris, France
| |
|
376
|
Simonti CN, Vernot B, Bastarache L, Bottinger E, Carrell DS, Chisholm RL, Crosslin DR, Hebbring SJ, Jarvik GP, Kullo IJ, Li R, Pathak J, Ritchie MD, Roden DM, Verma SS, Tromp G, Prato JD, Bush WS, Akey JM, Denny JC, Capra JA. The phenotypic legacy of admixture between modern humans and Neandertals. Science 2016; 351:737-41. [PMID: 26912863 DOI: 10.1126/science.aad2149] [Citation(s) in RCA: 172] [Impact Index Per Article: 19.1] [Indexed: 12/12/2022]
Abstract
Many modern human genomes retain DNA inherited from interbreeding with archaic hominins, such as Neandertals, yet the influence of this admixture on human traits is largely unknown. We analyzed the contribution of common Neandertal variants to over 1000 electronic health record (EHR)-derived phenotypes in ~28,000 adults of European ancestry. We discovered and replicated associations of Neandertal alleles with neurological, psychiatric, immunological, and dermatological phenotypes. Neandertal alleles together explained a significant fraction of the variation in risk for depression and skin lesions resulting from sun exposure (actinic keratosis), and individual Neandertal alleles were significantly associated with specific human phenotypes, including hypercoagulation and tobacco use. Our results establish that archaic admixture influences disease risk in modern humans, provide hypotheses about the effects of hundreds of Neandertal haplotypes, and demonstrate the utility of EHR data in evolutionary analyses.
Affiliation(s)
- Corinne N Simonti
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
| | - Benjamin Vernot
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
| | | | - David S Carrell
- Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
| | - Rex L Chisholm
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - David R Crosslin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA. Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
| | - Scott J Hebbring
- Center for Human Genetics, Marshfield Clinic, Marshfield, WI, USA
| | - Gail P Jarvik
- Department of Genome Sciences, University of Washington, Seattle, WA, USA. Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
| | - Iftikhar J Kullo
- Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA
| | - Rongling Li
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jyotishman Pathak
- Division of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Marylyn D Ritchie
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA. Biomedical and Translational Informatics, Geisinger Health System, Danville, PA, USA
| | - Dan M Roden
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. Department of Medicine, Vanderbilt University, Nashville, TN, USA. Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
| | - Shefali S Verma
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Gerard Tromp
- Weis Center for Research, Geisinger Health System, Danville, PA, USA. Division of Molecular Biology and Human Genetics, Department of Biomedical Sciences, Faculty of Health Science, Stellenbosch University, Tygerberg, South Africa
| | - Jeffrey D Prato
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
| | - William S Bush
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
| | - Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Joshua C Denny
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - John A Capra
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA. Center for Quantitative Sciences, Vanderbilt University, Nashville, TN, USA
| |
|
377
|
Adhikari K, Fontanil T, Cal S, Mendoza-Revilla J, Fuentes-Guajardo M, Chacón-Duque JC, Al-Saadi F, Johansson JA, Quinto-Sanchez M, Acuña-Alonzo V, Jaramillo C, Arias W, Barquera Lozano R, Macín Pérez G, Gómez-Valdés J, Villamil-Ramírez H, Hunemeier T, Ramallo V, Silva de Cerqueira CC, Hurtado M, Villegas V, Granja V, Gallo C, Poletti G, Schuler-Faccini L, Salzano FM, Bortolini MC, Canizales-Quinteros S, Rothhammer F, Bedoya G, Gonzalez-José R, Headon D, López-Otín C, Tobin DJ, Balding D, Ruiz-Linares A. A genome-wide association scan in admixed Latin Americans identifies loci influencing facial and scalp hair features. Nat Commun 2016; 7:10815. [PMID: 26926045 PMCID: PMC4773514 DOI: 10.1038/ncomms10815] [Citation(s) in RCA: 121] [Impact Index Per Article: 13.4] [Received: 07/13/2015] [Accepted: 01/25/2016] [Indexed: 12/20/2022]
Abstract
We report a genome-wide association scan in over 6,000 Latin Americans for features of scalp hair (shape, colour, greying, balding) and facial hair (beard thickness, monobrow, eyebrow thickness). We found 18 signals of association reaching genome-wide significance (P values 5 × 10⁻⁸ to 3 × 10⁻¹¹⁹), including 10 novel associations. These include novel loci for scalp hair shape and balding, and the first reported loci for hair greying, monobrow, eyebrow and beard thickness. A newly identified locus influencing hair shape includes a Q30R substitution in the Protease Serine S1 family member 53 (PRSS53). We demonstrate that this enzyme is highly expressed in the hair follicle, especially the inner root sheath, and that the Q30R substitution affects enzyme processing and secretion. The genome regions associated with hair features are enriched for signals of selection, consistent with proposals regarding the evolution of human hair.
Affiliation(s)
- Kaustubh Adhikari
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Tania Fontanil
- Departamento de Bioquímica y Biología Molecular, IUOPA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Santiago Cal
- Departamento de Bioquímica y Biología Molecular, IUOPA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Javier Mendoza-Revilla
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Macarena Fuentes-Guajardo
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Departamento de Tecnología Médica, Facultad de Ciencias de la Salud, Universidad de Tarapacá, Arica 1000009, Chile
| | - Juan-Camilo Chacón-Duque
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Farah Al-Saadi
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Jeanette A. Johansson
- Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | | | - Victor Acuña-Alonzo
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- National Institute of Anthropology and History, México 4510, México
| | - Claudia Jaramillo
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - William Arias
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - Rodrigo Barquera Lozano
- National Institute of Anthropology and History, México 4510, México
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | - Gastón Macín Pérez
- National Institute of Anthropology and History, México 4510, México
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | | | - Hugo Villamil-Ramírez
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | - Tábita Hunemeier
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Virginia Ramallo
- Centro Nacional Patagónico, CONICET, Puerto Madryn U9129ACD, Argentina
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Caio C. Silva de Cerqueira
- Centro Nacional Patagónico, CONICET, Puerto Madryn U9129ACD, Argentina
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Malena Hurtado
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Valeria Villegas
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Vanessa Granja
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Carla Gallo
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Giovanni Poletti
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Lavinia Schuler-Faccini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Francisco M. Salzano
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Maria-Cátira Bortolini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Samuel Canizales-Quinteros
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | | | - Gabriel Bedoya
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | | | - Denis Headon
- Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | - Carlos López-Otín
- Departamento de Bioquímica y Biología Molecular, IUOPA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Desmond J. Tobin
- Centre for Skin Sciences, Faculty of Life Sciences, University of Bradford, Bradford BD7 1DP, Victoria, UK
| | - David Balding
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Schools of BioSciences and Mathematics and Statistics, University of Melbourne, Melbourne 3010, Australia
| | - Andrés Ruiz-Linares
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| |
|
378
|
DeLorenze GN, Nelson CL, Scott WK, Allen AS, Ray GT, Tsai AL, Quesenberry CP, Fowler VG. Polymorphisms in HLA Class II Genes Are Associated With Susceptibility to Staphylococcus aureus Infection in a White Population. J Infect Dis 2016; 213:816-23. [PMID: 26450422 PMCID: PMC4747615 DOI: 10.1093/infdis/jiv483] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 07/21/2015] [Accepted: 09/30/2015] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Staphylococcus aureus can cause life-threatening infections. Human susceptibility to S. aureus infection may be influenced by host genetic variation. METHODS A genome-wide association study (GWAS) in a large health plan-based cohort included biologic specimens from 4701 culture-confirmed S. aureus cases and 45 344 matched controls; 584 535 single-nucleotide polymorphisms (SNPs) were genotyped on an array specific to individuals of European ancestry. Coverage was increased by imputation of >25 million common SNPs, using the 1000 Genomes Reference panel. Human leukocyte antigen (HLA) serotypes were also imputed. RESULTS Logistic regression analysis, performed under the assumption of an additive genetic model, revealed several imputed SNPs (eg, rs115231074: odds ratio [OR], 1.22 [P = 1.3 × 10⁻¹⁰]; rs35079132: OR, 1.24 [P = 3.8 × 10⁻⁸]) achieving genome-wide significance on chromosome 6 in the HLA class II region. One adjacent genotyped SNP was nearly genome-wide significant (rs4321864: OR, 1.13; P = 8.8 × 10⁻⁸). These polymorphisms are located near the genes encoding HLA-DRA and HLA-DRB1. Results of further logistic regression analysis, in which the most significant GWAS SNPs were conditioned on HLA-DRB1*04 serotype, showed additional support for the strength of association between HLA class II genetic variants and S. aureus infection. CONCLUSIONS Our study results are the first reported evidence of human genetic susceptibility to S. aureus infection.
Affiliation(s)
| | | | - William K Scott
- John P. Hussman Institute for Human Genomics Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Florida
| | - Andrew S Allen
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina
| | - G Thomas Ray
- Division of Research, Kaiser Permanente Northern California, Oakland
| | - Ai-Lin Tsai
- Division of Research, Kaiser Permanente Northern California, Oakland
| | | | - Vance G Fowler
- Duke Clinical Research Institute Division of Infectious Diseases, Duke University Medical Center
| |
|
379
|
Genetic variants near MLST8 and DHX57 affect the epigenetic age of the cerebellum. Nat Commun 2016; 7:10561. [PMID: 26830004 PMCID: PMC4740877 DOI: 10.1038/ncomms10561] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Received: 07/17/2015] [Accepted: 12/29/2015] [Indexed: 12/17/2022] Open
Abstract
DNA methylation (DNAm) levels lend themselves to defining an epigenetic biomarker of aging known as the 'epigenetic clock'. Our genome-wide association study (GWAS) of cerebellar epigenetic age acceleration identifies five significant (P<5.0 × 10⁻⁸) SNPs in two loci: 2p22.1 (inside gene DHX57) and 16p13.3 near gene MLST8 (a subunit of mTOR complex 1 and 2). We find that the SNP in 16p13.3 has a cis-acting effect on the expression levels of MLST8 (P=6.9 × 10⁻¹⁸) in most brain regions. In cerebellar samples, the SNP in 2p22.1 has a cis-effect on DHX57 (P=4.4 × 10⁻⁵). Gene sets found by our GWAS analysis of cerebellar age acceleration exhibit significant overlap with those of Alzheimer's disease (P=4.4 × 10⁻¹⁵), age-related macular degeneration (P=6.4 × 10⁻⁶), and Parkinson's disease (P=2.6 × 10⁻⁴). Overall, our results demonstrate the utility of a new paradigm for understanding aging and age-related diseases: it will be fruitful to use epigenetic tissue age as an endophenotype in GWAS. This genome-wide association study identifies five significant SNPs in two loci which are associated with the epigenetic age of post-mortem cerebellar tissue according to a DNA methylation based biomarker of human aging.
|
380
|
Levine ME, Lu AT, Bennett DA, Horvath S. Epigenetic age of the pre-frontal cortex is associated with neuritic plaques, amyloid load, and Alzheimer's disease related cognitive functioning. Aging (Albany NY) 2015; 7:1198-211. [PMID: 26684672 PMCID: PMC4712342 DOI: 10.18632/aging.100864] [Citation(s) in RCA: 298] [Impact Index Per Article: 29.8] [Indexed: 01/04/2023]
Abstract
There is an urgent need to develop molecular biomarkers of brain age in order to advance our understanding of age-related neurodegeneration. Recently, we developed a highly accurate epigenetic biomarker of tissue age (known as the epigenetic clock) which is based on DNA methylation levels. Here we use n=700 dorsolateral prefrontal cortex (DLPFC) samples from Caucasian subjects of the Religious Order Study and the Rush Memory and Aging Project to examine the association between epigenetic age and Alzheimer's disease (AD)-related cognitive decline, and AD-related neuropathological markers. Epigenetic age acceleration of DLPFC is correlated with several neuropathological measurements including diffuse plaques (r=0.12, p=0.0015), neuritic plaques (r=0.11, p=0.0036), and amyloid load (r=0.091, p=0.016). Further, it is associated with a decline in global cognitive functioning (β=-0.500, p=0.009), episodic memory (β=-0.411, p=0.009) and working memory (β=-0.405, p=0.011) among individuals with AD. The neuropathological markers may mediate the association between epigenetic age and cognitive decline. Genetic complex trait analysis (GCTA) revealed that epigenetic age acceleration is heritable (h²=0.41) and has significant genetic correlations with diffuse plaques (r=0.24, p=0.010) and possibly working memory (r=-0.35, p=0.065). Overall, these results suggest that the epigenetic clock may lend itself as a molecular biomarker of brain age.
Affiliation(s)
- Morgan E. Levine
- 1 Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA; 2 Center for Neurobehavioral Genetics, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Ake T. Lu
- 1 Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - David A. Bennett
- 3 Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL 60612, USA; 4 Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612, USA
| | - Steve Horvath
- 1 Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA; 5 Biostatistics, School of Public Health, University of California Los Angeles, Los Angeles, CA 90095, USA
| |
|
381
|
Traylor M, Anderson CD, Hurford R, Bevan S, Markus HS. Oxidative phosphorylation and lacunar stroke: Genome-wide enrichment analysis of common variants. Neurology 2015; 86:141-5. [PMID: 26674331 PMCID: PMC4731691 DOI: 10.1212/wnl.0000000000002260] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 02/16/2015] [Accepted: 09/08/2015] [Indexed: 11/15/2022] Open
Abstract
OBJECTIVE We investigated whether oxidative phosphorylation (OXPHOS) abnormalities were associated with lacunar stroke, hypothesizing that these would be more strongly associated in patients with multiple lacunar infarcts and leukoaraiosis (LA). METHODS In 1,012 MRI-confirmed lacunar stroke cases and 964 age-matched controls recruited from general practice surgeries, we investigated associations between common genetic variants within the OXPHOS pathway and lacunar stroke using a permutation-based enrichment approach. Cases were phenotyped using MRI into those with multiple infarcts or LA (MLI/LA) and those with isolated lacunar infarcts (ILI) based on the number of subcortical infarcts and degree of LA, using the Fazekas grading. Using gene-level association statistics, we tested for enrichment of genes in the OXPHOS pathway with all lacunar stroke and the 2 subtypes. RESULTS There was a specific association with strong evidence of enrichment in the top 1% of genes in the MLI/LA subtype (p = 0.0017) but not in the ILI subtype (p = 1). Genes in the top percentile for the all lacunar stroke analysis were not significantly enriched (p = 0.07). CONCLUSIONS Our results implicate the OXPHOS pathway in the pathogenesis of lacunar stroke, and show the association is specific to patients with the MLI/LA subtype. They show that MRI-based subtyping of lacunar stroke can provide insights into disease pathophysiology, and imply that different radiologic subtypes of lacunar stroke have distinct underlying pathophysiologic processes.
Affiliation(s)
- Matthew Traylor
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston.
| | - Christopher D Anderson
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| | - Robert Hurford
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| | - Steve Bevan
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| | - Hugh S Markus
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| |
|
382
|
Benton MC, Lea RA, Macartney-Coxson D, Bellis C, Carless MA, Curran JE, Hanna M, Eccles D, Chambers GK, Blangero J, Griffiths LR. Serum bilirubin concentration is modified by UGT1A1 haplotypes and influences risk of type-2 diabetes in the Norfolk Island genetic isolate. BMC Genet 2015; 16:136. [PMID: 26628212 PMCID: PMC4667444 DOI: 10.1186/s12863-015-0291-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 06/02/2015] [Accepted: 11/02/2015] [Indexed: 02/06/2023] Open
Abstract
Background Located in the Pacific Ocean between Australia and New Zealand, the unique population isolate of Norfolk Island has been shown to exhibit increased prevalence of metabolic disorders (type-2 diabetes, cardiovascular disease) compared to mainland Australia. We investigated this well-established genetic isolate, utilising its unique genomic structure to increase the ability to detect related genetic markers. A pedigree-based genome-wide association study of 16 routinely collected blood-based clinical traits in 382 Norfolk Island individuals was performed. Results A striking association peak was located at chromosome 2q37.1 for both total bilirubin and direct bilirubin, with 29 SNPs reaching statistical significance (P < 1.84 × 10⁻⁷). Strong linkage disequilibrium was observed across a 200 kb region spanning the UDP-glucuronosyltransferase family, including UGT1A1, an enzyme known to metabolise bilirubin. Given the epidemiological literature suggesting a negative association between CVD risk and serum bilirubin, we further explored potential associations using stepwise multivariate regression, revealing a significant association between direct bilirubin concentration and type-2 diabetes risk. In the Norfolk Island cohort increased direct bilirubin was associated with a 28 % reduction in type-2 diabetes risk (OR: 0.72, 95 % CI: 0.57-0.91, P = 0.005). When adjusted for genotypic effects the overall model was validated, with the adjusted model predicting a 30 % reduction in type-2 diabetes risk with increasing direct bilirubin concentrations (OR: 0.70, 95 % CI: 0.53-0.89, P = 0.0001). Conclusions In summary, a pedigree-based GWAS of blood-based clinical traits in the Norfolk Island population has identified variants within the UDPGT family directly associated with serum bilirubin levels, which are in turn implicated in a reduced risk of developing type-2 diabetes within this population.
Electronic supplementary material The online version of this article (doi:10.1186/s12863-015-0291-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- M C Benton
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - R A Lea
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - D Macartney-Coxson
- Kenepuru Science Centre, Institute of Environmental Science and Research, Wellington, 5240, New Zealand.
| | - C Bellis
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia; Texas Biomedical Research Institute, San Antonio, TX, 78227-5301, USA.
| | - M A Carless
- Texas Biomedical Research Institute, San Antonio, TX, 78227-5301, USA.
| | - J E Curran
- Texas Biomedical Research Institute, San Antonio, TX, 78227-5301, USA.
| | - M Hanna
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - D Eccles
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - G K Chambers
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand.
| | - J Blangero
- South Texas Diabetes and Obesity Institute, University of Texas, Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA.
| | - L R Griffiths
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| |
|
383
|
Conjunctival fibrosis and the innate barriers to Chlamydia trachomatis intracellular infection: a genome wide association study. Sci Rep 2015; 5:17447. [PMID: 26616738 PMCID: PMC4663496 DOI: 10.1038/srep17447] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/29/2015] [Accepted: 10/29/2015] [Indexed: 01/26/2023] Open
Abstract
Chlamydia trachomatis causes both trachoma and sexually transmitted infections. These diseases have similar pathology and potentially similar genetic predisposing factors. We aimed to identify polymorphisms and pathways associated with pathological sequelae of ocular Chlamydia trachomatis infections in The Gambia. We report a discovery phase genome-wide association study (GWAS) of scarring trachoma (1090 cases, 1531 controls) that identified 27 SNPs with strong, but not genome-wide significant, association with disease (5 × 10⁻⁶ > P > 5 × 10⁻⁸). The most strongly associated SNP (rs111513399, P = 5.38 × 10⁻⁷) fell within a gene (PREX2) with homology to factors known to facilitate chlamydial entry to the host cell. Pathway analysis of GWAS data was significantly enriched for mitotic cell cycle processes (P = 0.001), the immune response (P = 0.00001) and for multiple cell surface receptor signalling pathways. New analyses of published transcriptome data sets from Gambia, Tanzania and Ethiopia also revealed that the same cell cycle and immune response pathways were enriched at the transcriptional level in various disease states. Although unconfirmed, the data suggest that genetic associations with chlamydial scarring disease may be focussed on processes relating to the immune response, the host cell cycle and cell surface receptor signalling.
|
384
|
The role of common genetic variation in educational attainment and income: evidence from the National Child Development Study. Sci Rep 2015; 5:16509. [PMID: 26561353 PMCID: PMC4642349 DOI: 10.1038/srep16509] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/27/2015] [Accepted: 10/14/2015] [Indexed: 11/16/2022] Open
Abstract
We investigated the role of common genetic variation in educational attainment and household income. We used data from 5,458 participants of the National Child Development Study to estimate: 1) the associations of rs9320913, rs11584700 and rs4851266 with socioeconomic position and educational phenotypes; and 2) the univariate chip-heritability of each phenotype, and the genetic correlation between each phenotype and educational attainment at age 16. The three SNPs were associated with most measures of educational attainment. Common genetic variation contributed to 6 of 14 socioeconomic background phenotypes, and 17 of 29 educational phenotypes. We found evidence of genetic correlations between educational attainment at age 16 and 4 of 14 social background and 8 of 28 educational phenotypes. This suggests common genetic variation contributes both to differences in educational attainment and its relationship with other phenotypes. However, we remain cautious that cryptic population structure, assortative mating, and dynastic effects may influence these associations.
|
385
|
Seldin MF, Alkhairy OK, Lee AT, Lamb JA, Sussman J, Pirskanen-Matell R, Piehl F, Verschuuren JJGM, Kostera-Pruszczyk A, Szczudlik P, McKee D, Maniaol AH, Harbo HF, Lie BA, Melms A, Garchon HJ, Willcox N, Gregersen PK, Hammarstrom L. Genome-Wide Association Study of Late-Onset Myasthenia Gravis: Confirmation of TNFRSF11A and Identification of ZBTB10 and Three Distinct HLA Associations. Mol Med 2015; 21:769-781. [PMID: 26562150 DOI: 10.2119/molmed.2015.00232] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 11/04/2015] [Accepted: 11/09/2015] [Indexed: 01/05/2023] Open
Abstract
To investigate the genetics of late-onset myasthenia gravis (LOMG), we conducted a genome-wide association study with imputation of >6 million single nucleotide polymorphisms (SNPs) in 532 LOMG cases (anti-acetylcholine receptor [AChR] antibody positive; onset age ≥50 years) and 2,128 controls matched for sex and population substructure. The data confirm reported TNFRSF11A associations (rs4574025, P = 3.9 × 10⁻⁷, odds ratio [OR] 1.42) and identify a novel candidate gene, ZBTB10, achieving genome-wide significance (rs6998967, P = 8.9 × 10⁻¹⁰, OR 0.53). Several other SNPs showed suggestive significance including rs2476601 (P = 6.5 × 10⁻⁶, OR 1.62) encoding the PTPN22 R620W variant noted in early-onset myasthenia gravis (EOMG) and other autoimmune diseases. In contrast, EOMG-associated SNPs in TNIP1 showed no association in LOMG, nor did other loci suggested for EOMG. Many SNPs within the major histocompatibility complex (MHC) region showed strong associations in LOMG, but with smaller effect sizes than in EOMG (highest OR ~2 versus ~6 in EOMG). Moreover, the strongest associations were in opposite directions from EOMG, including an OR of 0.54 for DQA1*05:01 in LOMG (P = 5.9 × 10⁻¹²) versus 2.82 in EOMG (P = 3.86 × 10⁻⁴⁵). Association and conditioning studies for the MHC region showed three distinct and largely independent association peaks for LOMG corresponding to (a) MHC class II (highest attenuation when conditioning on DQA1), (b) HLA-A and (c) MHC class III SNPs. Conditioning studies of human leukocyte antigen (HLA) amino acid residues also suggest potential functional correlates. Together, these findings emphasize the value of subgrouping myasthenia gravis patients for clinical and basic investigations and imply distinct predisposing mechanisms in LOMG.
Affiliation(s)
- Michael F Seldin
- Department of Biochemistry and Molecular Medicine, and Department of Medicine, University of California, Davis, California, United States of America
| | - Omar K Alkhairy
- Division of Clinical Immunology, Karolinska Institutet at Karolinska University Hospital Huddinge, Stockholm, Sweden
| | - Annette T Lee
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore-LIJ Health System, Manhasset, New York, United States of America
| | - Janine A Lamb
- Centre for Integrated Genomic Medical Research, Manchester Academic Health Science Centre, University of Manchester, Manchester, United Kingdom
| | - Jon Sussman
- Department of Neurology, Greater Manchester Neuroscience Centre, Manchester, United Kingdom
| | | | - Fredrik Piehl
- Department of Neurology, Karolinska University Hospital Solna, Stockholm, Sweden
| | | | | | - Piotr Szczudlik
- Department of Neurology, Medical University of Warsaw, Warsaw, Poland
| | - David McKee
- Department of Neurology, Greater Manchester Neuroscience Centre, Manchester, United Kingdom
| | - Angelina H Maniaol
- Department of Neurology, Oslo University Hospital, Ullevål, Oslo, Norway
| | - Hanne F Harbo
- Department of Neurology, Oslo University Hospital and University of Oslo, Oslo, Norway
| | - Benedicte A Lie
- Department of Medical Genetics, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Arthur Melms
- Department of Neurology, Tübingen University Medical Center, Tübingen, Germany, and Neurologische Klinik, Universitätsklinikum Erlangen, Erlangen, Germany
| | | | - Nicholas Willcox
- Nuffield Department of Clinical Neurosciences, Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, United Kingdom
| | - Peter K Gregersen
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore-LIJ Health System, Manhasset, New York, United States of America
| | - Lennart Hammarstrom
- Division of Clinical Immunology, Karolinska Institutet at Karolinska University Hospital Huddinge, Stockholm, Sweden
| |
|
386
|
Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. A global reference for human genetic variation. Nature 2015; 526:68-74. [PMID: 26432245 PMCID: PMC4750478 DOI: 10.1038/nature15393] [Citation(s) in RCA: 11384] [Impact Index Per Article: 1138.4] [Received: 05/12/2015] [Accepted: 08/20/2015] [Indexed: 12/04/2022]
Abstract
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. Results for the final phase of the 1000 Genomes Project are presented including whole-genome sequencing, targeted exome sequencing, and genotyping on high-density SNP arrays for 2,504 individuals across 26 populations, providing a global reference data set to support biomedical genetics. The 1000 Genomes Project has sought to comprehensively catalogue human genetic variation across populations, providing a valuable public genomic resource. The data obtained so far have found applications ranging from association studies and fine mapping studies to the filtering of likely neutral variants in rare-disease cohorts. The authors now report on the final phase of the project, phase 3, which covers previously uncharacterized areas of human genetic diversity in terms of the populations sampled and categories of characterized variation. The sample now includes more than 2,500 individuals from 26 global populations, with low coverage whole-genome and deep exome sequencing, as well as dense microarray genotyping. 
They find that while most common variants are shared across populations, rarer variants are often restricted to closely related populations. The authors also demonstrate the use of the phase 3 dataset as a reference panel for imputation to improve the resolution in genetic association studies.
|
387
|
Coronary risk in relation to genetic variation in MEOX2 and TCF15 in a Flemish population. BMC Genet 2015; 16:116. [PMID: 26428460 PMCID: PMC4591634 DOI: 10.1186/s12863-015-0272-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 07/10/2015] [Accepted: 09/11/2015] [Indexed: 01/07/2023] Open
Abstract
Background In mice MEOX2/TCF15 heterodimers are highly expressed in heart endothelial cells and are involved in the transcriptional regulation of lipid transport. In a general population, we investigated whether genetic variation in these genes predicted coronary heart disease (CHD). Results In 2027 participants randomly recruited from a Flemish population (51.0 % women; mean age 43.6 years), we genotyped six SNPs in MEOX2 and four in TCF15. Over 15.2 years (median), CHD, myocardial infarction, coronary revascularisation and ischaemic cardiomyopathy occurred in 106, 53, 78 and 22 participants, respectively. For SNPs, we contrasted CHD risk in minor-allele heterozygotes and homozygotes (variant) vs. major-allele homozygotes (reference) and, for haplotypes, in carriers (variant) vs. non-carriers. In multivariable-adjusted analyses with correction for multiple testing, CHD risk was associated with MEOX2 SNPs (P ≤ 0.049), but not with TCF15 SNPs (P ≥ 0.29). The MEOX2 GTCCGC haplotype (frequency 16.5 %) was associated with the sex- and age-standardised CHD incidence (5.26 vs. 3.03 events per 1000 person-years; P = 0.036); the multivariable-adjusted hazard ratio [HR] of CHD was 1.78 (95 % confidence interval, 1.25–2.56; P = 0.0054). For myocardial infarction, coronary revascularisation, and ischaemic cardiomyopathy, the corresponding HRs were 1.96 (1.16–3.31), 1.87 (1.20–2.91) and 3.16 (1.41–7.09), respectively. The MEOX2 GTCCGC haplotype significantly improved the prediction of CHD over and beyond traditional risk factors and was associated with similar population-attributable risk as smoking (18.7 % vs. 16.2 %). Conclusions Genetic variation in MEOX2, but not TCF15, is a strong predictor of CHD. Further experimental studies should elucidate the underlying molecular mechanisms. Electronic supplementary material The online version of this article (doi:10.1186/s12863-015-0272-2) contains supplementary material, which is available to authorized users.
|
388
|
Abstract
Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR*IMP, a method for imputation of KIR copy number. We show that KIR*IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease.
|
389
|
Kanterakis A, Deelen P, van Dijk F, Byelas H, Dijkstra M, Swertz MA. Molgenis-impute: imputation pipeline in a box. BMC Res Notes 2015; 8:359. [PMID: 26286716 PMCID: PMC4541731 DOI: 10.1186/s13104-015-1309-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/07/2014] [Accepted: 07/30/2015] [Indexed: 12/12/2022] Open
Abstract
Background Genotype imputation is an important procedure in current genomic analysis such as genome-wide association studies, meta-analyses and fine mapping. Although high quality tools are available that perform the steps of this process, considerable effort and expertise is required to set up and run a best practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters. Results Here we present MOLGENIS-impute, an ‘imputation in a box’ solution that seamlessly and transparently automates the set up and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute on different locations and imputed over 30,000 samples so far using the 1,000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment. Conclusions MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands. 
It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation. Electronic supplementary material The online version of this article (doi:10.1186/s13104-015-1309-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Alexandros Kanterakis
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Patrick Deelen
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Freerk van Dijk
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Heorhiy Byelas
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Martijn Dijkstra
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Morris A Swertz
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| |
|
390
|
Li W, Fu G, Rao W, Xu W, Ma L, Guo S, Song Q. GenomeLaser: fast and accurate haplotyping from pedigree genotypes. Bioinformatics 2015; 31:3984-7. [PMID: 26286810 DOI: 10.1093/bioinformatics/btv452] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/24/2015] [Accepted: 07/28/2015] [Indexed: 01/12/2023] Open
Abstract
We present a software tool called GenomeLaser that determines the haplotypes of each person from unphased high-throughput genotypes in family pedigrees. This method features high accuracy, chromosome-range phasing distance, linear computing, flexible pedigree types and flexible genetic marker types. Availability and implementation: http://www.4dgenome.com/software/genomelaser.html.
Affiliation(s)
- Wenzhi Li
- Department of Neurosurgery, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061 China, Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA
| | - Guoxing Fu
- 4DGENOME Inc, Atlanta, GA, 30033 USA and
| | | | - Wei Xu
- Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA
| | - Li Ma
- Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA, 4DGENOME Inc, Atlanta, GA, 30033 USA and
| | - Shiwen Guo
- Department of Neurosurgery, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061 China
| | - Qing Song
- Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA, 4DGENOME Inc, Atlanta, GA, 30033 USA and Center of Big Data and Bioinformatics, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061 China
| |
|
391
|
Maclean CA, Chue Hong NP, Prendergast JGD. hapbin: An Efficient Program for Performing Haplotype-Based Scans for Positive Selection in Large Genomic Datasets. Mol Biol Evol 2015; 32:3027-9. [PMID: 26248562 PMCID: PMC4651233 DOI: 10.1093/molbev/msv172] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 08/03/2015] [Accepted: 07/27/2015] [Indexed: 11/19/2022] Open
Abstract
Understanding how the genome is shaped by selective processes forms an integral part of modern biology. However, as genomic datasets continue to grow larger, it is becoming increasingly difficult to apply traditional statistics for detecting signatures of selection to these cohorts. There is therefore a pressing need for the development of the next generation of computational and analytical tools for detecting signatures of selection in large genomic datasets. Here, we present hapbin, an efficient multithreaded implementation of extended haplotype homozygosity-based statistics for detecting selection, which is up to 3,400 times faster than the current fastest implementations of these algorithms.
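For readers unfamiliar with the statistic, extended haplotype homozygosity (EHH) can be sketched in a few lines. The following is a minimal illustration of the textbook definition (the probability that two randomly chosen haplotypes carrying the core allele are identical over the whole interval from the core marker to a given marker). It is not hapbin's implementation; the function name `ehh`, the string encoding of haplotypes, and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, allele, end):
    """EHH of `allele` at marker index `core`, extended to index `end` (inclusive).

    haplotypes: list of equal-length strings, one character per marker
                (hypothetical encoding chosen for this sketch).
    """
    lo, hi = min(core, end), max(core, end)
    # Keep only haplotypes carrying the core allele, cut to the interval.
    carriers = [h[lo:hi + 1] for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return 0.0  # homozygosity is undefined for fewer than two carriers
    # Group identical extended haplotypes and count pairs within each group.
    groups = Counter(carriers)
    return sum(comb(k, 2) for k in groups.values()) / comb(n, 2)


# Toy data: five haplotypes, four biallelic markers.
haps = ["0110", "0110", "0111", "0100", "1010"]
for end in (1, 2, 3):
    print(end, ehh(haps, core=1, allele="1", end=end))
```

EHH starts at 1 at the core marker and decays as recombination and mutation break up the shared haplotype. Statistics such as iHS integrate this decay curve over distance, which is why the cost grows quickly with the number of haplotypes and markers.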
Affiliation(s)
- Colin A Maclean
- EPCC, School of Physics and Astronomy, University of Edinburgh, Edinburgh, Scotland, United Kingdom
| | - Neil P Chue Hong
- EPCC, School of Physics and Astronomy, University of Edinburgh, Edinburgh, Scotland, United Kingdom
| | | |
|
392
|
Multicohort analysis of the maternal age effect on recombination. Nat Commun 2015; 6:7846. [PMID: 26242864 PMCID: PMC4580993 DOI: 10.1038/ncomms8846] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 12/19/2014] [Accepted: 06/18/2015] [Indexed: 11/09/2022] Open
Abstract
Several studies have reported that the number of crossovers increases with maternal age in humans, but others have found the opposite. Resolving the true effect has implications for understanding the maternal age effect on aneuploidies. Here, we revisit this question in the largest sample to date using single nucleotide polymorphism (SNP)-chip data, comprising over 6,000 meioses from nine cohorts. We develop and fit a hierarchical model to allow for differences between cohorts and between mothers. We estimate that over 10 years, the expected number of maternal crossovers increases by 2.1% (95% credible interval (0.98%, 3.3%)). Our results are not consistent with the larger positive and negative effects previously reported in smaller cohorts. We see heterogeneity between cohorts that is likely due to chance effects in smaller samples, or possibly to confounders, emphasizing that care should be taken when interpreting results from any specific cohort about the effect of maternal age on recombination.
|
393
|
VanRaden PM, Sun C, O'Connell JR. Fast imputation using medium or low-coverage sequence data. BMC Genet 2015; 16:82. [PMID: 26168789 PMCID: PMC4501077 DOI: 10.1186/s12863-015-0243-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 03/17/2015] [Accepted: 06/29/2015] [Indexed: 12/23/2022] Open
Abstract
Background Accurate genotype imputation can greatly reduce costs and increase benefits by combining whole-genome sequence data of varying read depth and array genotypes of varying densities. For large populations, an efficient strategy chooses the two haplotypes most likely to form each genotype and updates posterior allele probabilities from prior probabilities within those two haplotypes as each individual’s sequence is processed. Directly using allele read counts can improve imputation accuracy and reduce computation compared with calling or computing genotype probabilities first and then imputing. Results A new algorithm was implemented in findhap (version 4) software and tested using simulated bovine and actual human sequence data with different combinations of reference population size, sequence read depth and error rate. Read depths of ≥8× may be desired for direct investigation of sequenced individuals, but for a given total cost, sequencing more individuals at read depths of 2× to 4× gave more accurate imputation from array genotypes. Imputation accuracy improved further if reference individuals had both low-coverage sequence and high-density (HD) microarray data, and remained high even with a read error rate of 16 %. With read depths of ≤4×, findhap (version 4) had higher accuracy than Beagle (version 4); computing time was up to 400 times faster with findhap than with Beagle. For 10,000 sequenced individuals plus 250 with HD array genotypes to test imputation, findhap used 7 hours, 10 processors and 50 GB of memory for 1 million loci on one chromosome. Computing times increased in proportion to population size but less than proportional to number of variants. Conclusions Simultaneous genotype calling from low-coverage sequence data and imputation from array genotypes of various densities is done very efficiently within findhap by updating allele probabilities within the two haplotypes for each individual. 
Accuracy of genotype calling and imputation were high with both simulated bovine and actual human genomes reduced to low-coverage sequence and HD microarray data. More efficient imputation allows geneticists to locate and test effects of more DNA variants from more individuals and to include those in future prediction and selection.
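The idea of updating prior probabilities directly from allele read counts can be illustrated with a generic Bayesian update at a single locus. This is a simplified sketch and not the findhap algorithm, which works on haplotypes rather than on isolated genotypes. The function name `genotype_posterior`, the binomial read model, and the default error rate are assumptions made for illustration.

```python
def genotype_posterior(prior, n_ref, n_alt, error=0.01):
    """Posterior probabilities of carrying 0, 1 or 2 alternate alleles.

    prior : three prior genotype probabilities (e.g. from haplotype frequencies)
    n_ref : number of reads showing the reference allele
    n_alt : number of reads showing the alternate allele
    error : assumed per-read error rate (must be > 0 so no likelihood is zero)
    """
    unnormalised = []
    for copies, p in enumerate(prior):
        # Probability that a single read shows the alternate allele.
        p_alt = (copies / 2) * (1 - error) + (1 - copies / 2) * error
        likelihood = p_alt ** n_alt * (1 - p_alt) ** n_ref
        unnormalised.append(p * likelihood)
    total = sum(unnormalised)
    return [x / total for x in unnormalised]


prior = [0.25, 0.50, 0.25]
print(genotype_posterior(prior, n_ref=1, n_alt=0))  # one read: prior shifts only slightly
print(genotype_posterior(prior, n_ref=2, n_alt=2))  # mixed reads: heterozygote dominates
```

With no reads the posterior equals the prior, and each additional read moves it only as far as the data justify. That behaviour is what lets low-coverage data (2× to 4×) contribute to imputation without a hard genotype call being made first.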
Affiliation(s)
- Paul M VanRaden
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD, 20705-2350, USA.
| | - Chuanyu Sun
- National Association of Animal Breeders, Columbia, Missouri, 65205, USA.
| | | |
|
394
|
Hartati H, Utsunomiya YT, Sonstegard TS, Garcia JF, Jakaria J, Muladno M. Evidence of Bos javanicus x Bos indicus hybridization and major QTLs for birth weight in Indonesian Peranakan Ongole cattle. BMC Genet 2015; 16:75. [PMID: 26141727 PMCID: PMC4491226 DOI: 10.1186/s12863-015-0229-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/15/2015] [Accepted: 06/10/2015] [Indexed: 11/10/2022] Open
Abstract
Background Peranakan Ongole (PO) is a major Indonesian Bos indicus breed that derives from animals imported from India in the late 19th century. Early imports were followed by hybridization with the Bos javanicus subspecies of cattle. Here, we used genomic data to partition the ancestry components of PO cattle and map loci implicated in birth weight. Results We found that B. javanicus contributes about 6-7 % to the average breed composition of PO cattle. Only two nearly fixed B. javanicus haplotypes were identified, suggesting that most of the B. javanicus variants are segregating under drift or by the action of balancing selection. The zebu component of the PO genome was estimated to derive from at least two distinct ancestral pools. Additionally, well-known loci underlying body size in other beef cattle breeds, such as the PLAG1 region on chromosome 14, were found to also affect birth weight in PO cattle. Conclusions This study is the first attempt to characterize PO at the genome level, and contributes evidence of successful, stabilized B. indicus x B. javanicus hybridization. Additionally, previously described loci implicated in body size in worldwide beef cattle breeds also affect birth weight in PO cattle. Electronic supplementary material The online version of this article (doi:10.1186/s12863-015-0229-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Hartati Hartati
- Beef Cattle Research Station, Indonesian Agency for Agricultural Research and Development, Ministry of Agriculture, Jln. Pahlawan no. 2 Grati, Pasuruan, East Java, 16784, Indonesia.
| | - Yuri Tani Utsunomiya
- Faculdade de Ciências Agrárias e Veterinárias, UNESP - Univ Estadual Paulista, Jaboticabal, São Paulo, 14884-900, Brazil.
| | - Tad Stewart Sonstegard
- ARS-USDA - Agricultural Research Service - United States Department of Agriculture, Animal Genomics and Improvement Laboratory, Beltsville, MD, 20705, USA.
| | - José Fernando Garcia
- Faculdade de Ciências Agrárias e Veterinárias, UNESP - Univ Estadual Paulista, Jaboticabal, São Paulo, 14884-900, Brazil; Faculdade de Medicina Veterinária de Araçatuba, UNESP - Univ Estadual Paulista, Araçatuba, São Paulo, 16050-680, Brazil.
| | - Jakaria Jakaria
- Faculty of Animal Science, Bogor Agriculture University, Jln. Agatis kampus IPB Dramaga, Bogor, 16680, Indonesia.
| | - Muladno Muladno
- Faculty of Animal Science, Bogor Agriculture University, Jln. Agatis kampus IPB Dramaga, Bogor, 16680, Indonesia.
| |
|
395
|
Vockley CM, Guo C, Majoros WH, Nodzenski M, Scholtens DM, Hayes MG, Lowe WL, Reddy TE. Massively parallel quantification of the regulatory effects of noncoding genetic variation in a human cohort. Genome Res 2015; 25:1206-14. [PMID: 26084464 PMCID: PMC4510004 DOI: 10.1101/gr.190090.115] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Received: 01/26/2015] [Accepted: 06/15/2015] [Indexed: 12/30/2022]
Abstract
We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter gene expression assay. As demonstration, we measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment. In agreement with previous reports, we found that most genetic variants have weak effects on distal regulatory element activity. Because haplotypes are typically maintained within but not between assayed regulatory elements, the approach can be used to identify causal regulatory haplotypes that likely contribute to human phenotypes. Finally, we demonstrate the utility of the method to functionally fine map causal regulatory variants in regions of high linkage disequilibrium identified by expression quantitative trait loci (eQTL) analyses.
Affiliation(s)
- Christopher M Vockley
- Department of Cell Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA
| | - Cong Guo
- Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA
| | - William H Majoros
- Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
| | - Michael Nodzenski
- Department of Preventive Medicine, Division of Biostatistics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
| | - Denise M Scholtens
- Department of Preventive Medicine, Division of Biostatistics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
| | - M Geoffrey Hayes
- Division of Endocrinology, Metabolism and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
| | - William L Lowe
- Division of Endocrinology, Metabolism and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
| | - Timothy E Reddy
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA; Department of Biostatistics and Bioinformatics, Duke University Medical School, Durham, North Carolina 27710, USA
| |
|
396
|
Lee D, Bigdeli TB, Williamson VS, Vladimirov VI, Riley BP, Fanous AH, Bacanu SA. DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts. Bioinformatics 2015; 31:3099-104. [PMID: 26059716 PMCID: PMC4576696 DOI: 10.1093/bioinformatics/btv348] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 09/23/2014] [Accepted: 05/29/2015] [Indexed: 01/09/2023] Open
Abstract
Motivation: To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels. However, the ever-increasing size and ethnic diversity of both reference panels and cohorts make genotype imputation computationally challenging for moderately sized computer clusters. Moreover, genotype imputation requires subject-level genetic data, which, unlike the summary statistics provided by virtually all studies, are not publicly available. While there are far less demanding methods that avoid the genotype imputation step by directly imputing SNP statistics, e.g. Directly Imputing summary STatistics (DIST) proposed by our group, their implicit assumptions make them applicable only to ethnically homogeneous cohorts. Results: To decrease computational and access requirements for the analysis of cosmopolitan cohorts, we propose DISTMIX, which extends DIST capabilities to the analysis of mixed ethnicity cohorts. The method uses a relevant reference panel to directly impute unmeasured SNP statistics based only on statistics at measured SNPs and estimated/user-specified ethnic proportions. Simulations show that the proposed method adequately controls the Type I error rates. The 1000 Genomes panel imputation of summary statistics from the ethnically diverse Psychiatric Genetic Consortium Schizophrenia Phase 2 suggests that, when compared to genotype imputation methods, DISTMIX offers comparable imputation accuracy for only a fraction of the computational resources. Availability and implementation: DISTMIX software, its reference population data, and usage examples are publicly available at http://code.google.com/p/distmix. Contact: dlee4@vcu.edu Supplementary information: Supplementary Data are available at Bioinformatics online.
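The general idea behind direct imputation of summary statistics can be sketched as a conditional expectation under a multivariate normal model: the Z-score at an unmeasured SNP is predicted from Z-scores at measured SNPs through their linkage disequilibrium (LD) correlations. The sketch below is an illustration under stated assumptions, not the DISTMIX software. In particular, approximating the cohort LD as a proportion-weighted average of per-population LD matrices is a simplification chosen here, and all function names, population labels and numbers are hypothetical.

```python
def mix_ld(ld_by_pop, weights):
    """Proportion-weighted average of per-population LD correlation matrices."""
    n = len(next(iter(ld_by_pop.values())))
    return [[sum(weights[p] * ld_by_pop[p][i][j] for p in weights)
             for j in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def impute_z(ld, measured, target, z_measured):
    """E[z_target | z_measured] = r_um R_mm^{-1} z_m for a correlation matrix `ld`."""
    r_mm = [[ld[i][j] for j in measured] for i in measured]
    r_um = [ld[target][j] for j in measured]
    w = solve(r_mm, r_um)  # R_mm is symmetric, so these are the imputation weights
    return sum(wi * zi for wi, zi in zip(w, z_measured))


# Hypothetical LD among three SNPs in two reference populations.
eur = [[1.0, 0.2, 0.8], [0.2, 1.0, 0.4], [0.8, 0.4, 1.0]]
afr = [[1.0, 0.0, 0.4], [0.0, 1.0, 0.2], [0.4, 0.2, 1.0]]
mixed = mix_ld({"EUR": eur, "AFR": afr}, {"EUR": 0.5, "AFR": 0.5})
print(impute_z(mixed, measured=[0, 1], target=2, z_measured=[3.0, 1.0]))
```

Only summary statistics and a reference panel enter the calculation, which is why this family of methods needs neither subject-level genotypes nor the compute budget of full genotype imputation.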
Affiliation(s)
- Donghyung Lee
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
| | - T Bernard Bigdeli
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
| | - Vernell S Williamson
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
| | - Vladimir I Vladimirov
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA, Center for Biomarker Research & Personalized Medicine, Virginia Commonwealth University, Richmond, VA 23298, USA and Lieber Institute for Brain Development, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Brien P Riley
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
| | - Ayman H Fanous
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
| | - Silviu-Alin Bacanu
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
| |
|
397
|
Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data. PLoS Genet 2015; 11:e1005271. [PMID: 26043085 PMCID: PMC4456389 DOI: 10.1371/journal.pgen.1005271] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/28/2014] [Accepted: 05/12/2015] [Indexed: 12/23/2022] Open
Abstract
Sequencing family DNA samples provides an attractive alternative to population-based designs for identifying rare variants associated with human disease, owing to the enrichment of causal variants in pedigrees. Previous studies showed that genotype calling accuracy can be improved by modeling family relatedness compared to standard calling algorithms. Current family-based variant calling methods use sequencing data on single variants and ignore the identity-by-descent (IBD) sharing along the genome. In this study we describe a new computational framework to accurately estimate the IBD sharing from the sequencing data, and to utilize the inferred IBD among family members to jointly call genotypes in pedigrees. Through simulations and application to real data, we showed that IBD can be reliably estimated across the genome, even at very low coverage (e.g. 2X), and genotype accuracy can be dramatically improved. Moreover, the improvement is more pronounced for variants with low frequencies, especially at low to intermediate coverage (e.g. 10X to 20X), making our approach effective in studying rare variants in cost-effective whole genome sequencing in pedigrees. We hope that our tool is useful to the research community for identifying rare variants for human disease through family-based sequencing.
To identify disease variants that occur less frequently in the population, sequencing families in which multiple individuals are affected is more powerful due to the enrichment of causal variants. An important step in such studies is to infer individual genotypes from sequencing data. Existing methods do not utilize full familial transmission information and therefore result in reduced accuracy of inferred genotypes. In this study we describe a new method that infers shared genetic material among family members and then incorporates this shared genomic information into a novel algorithm that can accurately infer genotypes. Our method is particularly advantageous when inferring low frequency variants with less sequence data, making it effective in analyzing genome-wide sequence data. We implemented the algorithm in a computationally efficient tool to facilitate cost-effective sequencing in families for identifying disease genetic variants.
|
398
|
The Kalash genetic isolate: ancient divergence, drift, and selection. Am J Hum Genet 2015; 96:775-83. [PMID: 25937445 PMCID: PMC4570283 DOI: 10.1016/j.ajhg.2015.03.012] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 02/09/2015] [Accepted: 03/26/2015] [Indexed: 02/05/2023] Open
Abstract
The Kalash represent an enigmatic isolated population of Indo-European speakers who have been living for centuries in the Hindu Kush mountain ranges of present-day Pakistan. Previous Y chromosome and mitochondrial DNA markers provided no support for their claimed Greek descent following Alexander III of Macedon's invasion of this region, and analysis of autosomal loci provided evidence of a strong genetic bottleneck. To understand their origins and demography further, we genotyped 23 unrelated Kalash samples on the Illumina HumanOmni2.5M-8 BeadChip and sequenced one male individual at high coverage on an Illumina HiSeq 2000. Comparison with published data from ancient hunter-gatherers and European farmers showed that the Kalash share genetic drift with the Paleolithic Siberian hunter-gatherers and might represent an extremely drifted ancient northern Eurasian population that also contributed to European and Near Eastern ancestry. Since the split from other South Asian populations, the Kalash have maintained a low long-term effective population size (2,319-2,603) and experienced no detectable gene flow from their geographic neighbors in Pakistan or from other extant Eurasian populations. The mean time of divergence between the Kalash and other populations currently residing in this region was estimated to be 11,800 (95% confidence interval = 10,600-12,600) years ago, and thus they represent present-day descendants of some of the earliest migrants into the Indian sub-continent from West Asia.
|
399
|
Xue Y, Prado-Martinez J, Sudmant PH, Narasimhan V, Ayub Q, Szpak M, Frandsen P, Chen Y, Yngvadottir B, Cooper DN, de Manuel M, Hernandez-Rodriguez J, Lobon I, Siegismund HR, Pagani L, Quail MA, Hvilsom C, Mudakikwa A, Eichler EE, Cranfield MR, Marques-Bonet T, Tyler-Smith C, Scally A. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding. Science 2015; 348:242-245. [PMID: 25859046 PMCID: PMC4668944 DOI: 10.1126/science.aaa3952] [Citation(s) in RCA: 243] [Impact Index Per Article: 24.3] [Received: 11/30/2014] [Accepted: 03/03/2015] [Indexed: 12/30/2022]
Abstract
Mountain gorillas are an endangered great ape subspecies and a prominent focus for conservation, yet we know little about their genomic diversity and evolutionary past. We sequenced whole genomes from multiple wild individuals and compared the genomes of all four Gorilla subspecies. We found that the two eastern subspecies have experienced a prolonged population decline over the past 100,000 years, resulting in very low genetic diversity and an increased overall burden of deleterious variation. A further recent decline in the mountain gorilla population has led to extensive inbreeding, such that individuals are typically homozygous at 34% of their sequence, leading to the purging of severely deleterious recessive mutations from the population. We discuss the causes of their decline and the consequences for their future survival.
Affiliation(s)
- Yali Xue
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Javier Prado-Martinez
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
| | - Peter H. Sudmant
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Vagheesh Narasimhan
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, UK
| | - Qasim Ayub
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Michal Szpak
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Peter Frandsen
- Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
| | - Yuan Chen
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Bryndis Yngvadottir
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - David N. Cooper
- Institute of Medical Genetics, Cardiff University, Cardiff CF14 4XN, UK
| | - Marc de Manuel
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
| | - Jessica Hernandez-Rodriguez
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
| | - Irene Lobon
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
| | - Hans R. Siegismund
- Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
| | - Luca Pagani
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Department of Biological, Geological and Environmental Sciences, University of Bologna, 40134 Bologna, Italy
| | - Michael A. Quail
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Christina Hvilsom
- Research and Conservation, Copenhagen Zoo, DK-2000 Frederiksberg, Denmark
| | | | - Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 91895, USA
| | - Michael R. Cranfield
- Gorilla Doctors, Karen C. Drayer Wildlife Health Center, University of California, Davis, CA 95616, USA
| | - Tomas Marques-Bonet
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
- Centro Nacional de Análisis Genómico (Parc Cientific de Barcelona), Baldiri Reixac 4, 08028 Barcelona, Spain
| | - Chris Tyler-Smith
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
| | - Aylwyn Scally
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
| |
|
400
|
Haplotype phasing and inheritance of copy number variants in nuclear families. PLoS One 2015; 10:e0122713. [PMID: 25853576 PMCID: PMC4390228 DOI: 10.1371/journal.pone.0122713] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 09/09/2014] [Accepted: 02/12/2015] [Indexed: 11/19/2022] Open
Abstract
DNA copy number variants (CNVs), which alter the copy number of a particular DNA segment in the genome, play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk for a variety of human diseases, highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of the different haplotypes present within CNV regions. We have developed a novel computational method, called PiCNV, which resolves the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm makes it possible to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge, this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified the emergence of several de novo deletions and duplications in the offspring.
|