2
Zheng J, Erzurumluoglu AM, Elsworth BL, Kemp JP, Howe L, Haycock PC, Hemani G, Tansey K, Laurin C, Pourcain BS, Warrington NM, Finucane HK, Price AL, Bulik-Sullivan BK, Anttila V, Paternoster L, Gaunt TR, Evans DM, Neale BM. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics 2017; 33:272-279. [PMID: 27663502 PMCID: PMC5542030 DOI: 10.1093/bioinformatics/btw613] [Citation(s) in RCA: 577] [Impact Index Per Article: 82.4] [Received: 05/05/2016] [Revised: 08/19/2016] [Accepted: 09/20/2016] [Indexed: 12/30/2022]
Abstract
MOTIVATION: LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously.
RESULTS: In this manuscript, we describe LD Hub - a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia and a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub; and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies.
AVAILABILITY AND IMPLEMENTATION: The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/
CONTACT: jie.zheng@bristol.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jie Zheng
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- A Mesut Erzurumluoglu
  - Genetic Epidemiology Group, Department of Health Sciences, University of Leicester, Leicester, UK
- Benjamin L Elsworth
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- John P Kemp
  - University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, QLD, Australia
- Laurence Howe
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Philip C Haycock
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Gibran Hemani
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Katherine Tansey
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Charles Laurin
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Beate St Pourcain
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Nicole M Warrington
  - University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, QLD, Australia
- Hilary K Finucane
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Alkes L Price
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Brendan K Bulik-Sullivan
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytical and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Verneri Anttila
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Lavinia Paternoster
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- Tom R Gaunt
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
- David M Evans
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol, UK
  - University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, QLD, Australia
- Benjamin M Neale
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Analytical and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
3
Zheng J, Rodriguez S, Laurin C, Baird D, Trela-Larsen L, Erzurumluoglu MA, Zheng Y, White J, Giambartolomei C, Zabaneh D, Morris R, Kumari M, Casas JP, Hingorani AD, Evans DM, Gaunt TR, Day INM. HAPRAP: a haplotype-based iterative method for statistical fine mapping using GWAS summary statistics. Bioinformatics 2017; 33:79-86. [PMID: 27591082 PMCID: PMC5544112 DOI: 10.1093/bioinformatics/btw565] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/18/2014] [Revised: 04/29/2016] [Accepted: 08/26/2016] [Indexed: 11/21/2022]
Abstract
MOTIVATION: Fine mapping is a widely used approach for identifying the causal variant(s) at disease-associated loci. Standard methods (e.g. multiple regression) require individual-level genotypes. Recent fine mapping methods using summary-level data require the pairwise correlation coefficients (r²) of the variants. However, haplotypes, rather than pairwise r², are the true biological representation of linkage disequilibrium (LD) among multiple loci. In this article, we present an empirical iterative method, HAPlotype Regional Association analysis Program (HAPRAP), that enables fine mapping using summary statistics and haplotype information from an individual-level reference panel.
RESULTS: Simulations with individual-level genotypes show that the results of HAPRAP and multiple regression are highly consistent. In simulation with summary-level data, we demonstrate that HAPRAP is less sensitive to poor LD estimates. In a parametric simulation using Genetic Investigation of ANthropometric Traits height data, HAPRAP performs well with a small training sample size (N < 2000) while other methods become suboptimal. Moreover, HAPRAP's performance is not affected substantially by single nucleotide polymorphisms (SNPs) with low minor allele frequencies. We applied the method to existing quantitative trait and binary outcome meta-analyses (human height, QTc interval and gallbladder disease); all previously reported association signals were replicated and two additional variants were independently associated with human height. Due to the growing availability of summary-level data, the value of HAPRAP is likely to increase markedly for future analyses (e.g. functional prediction and identification of instruments for Mendelian randomization).
AVAILABILITY AND IMPLEMENTATION: The HAPRAP package and documentation are available at http://apps.biocompute.org.uk/haprap/
CONTACT: jie.zheng@bristol.ac.uk or tom.gaunt@bristol.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jie Zheng
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Santiago Rodriguez
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Charles Laurin
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Denis Baird
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
- Lea Trela-Larsen
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Mesut A Erzurumluoglu
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
  - Department of Health Sciences, Genetic Epidemiology Group, University of Leicester, Leicester, UK
- Yi Zheng
  - Dedman College of Humanities and Sciences, Southern Methodist University, Dallas, TX, USA
- Jon White
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Claudia Giambartolomei
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Delilah Zabaneh
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Richard Morris
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Meena Kumari
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
- Juan P Casas
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
  - Department of Primary Care & Population Health, University College London, Royal Free Campus, London, UK
- Aroon D Hingorani
  - Department of Genetics, Environment and Evolution, University College London Genetics Institute, London, UK
  - Centre for Clinical Pharmacology, Division of Medicine, University College London, London, UK
- David M Evans
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, QLD, Australia
- Tom R Gaunt
  - MRC Integrative Epidemiology Unit, School of Social and Community Medicine, Bristol, UK
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Ian N M Day
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
4
Lipids, obesity and gallbladder disease in women: insights from genetic studies using the cardiovascular gene-centric 50K SNP array. Eur J Hum Genet 2015; 24:106-12. [PMID: 25920552 PMCID: PMC4681116 DOI: 10.1038/ejhg.2015.63] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 04/01/2014] [Revised: 02/18/2015] [Accepted: 02/20/2015] [Indexed: 01/04/2023]
Abstract
Gallbladder disease (GBD) has an overall prevalence of 10–40% depending on factors such as age, gender, population, obesity and diabetes, and represents a major economic burden. Although gallstones are composed of cholesterol by-products and are associated with obesity, presumed causal pathways remain unproven, although BMI reduction is typically recommended. We performed genetic studies to discover candidate genes and define pathways involved in GBD. We genotyped 15 241 women of European ancestry from three cohorts, including 3216 with GBD, using the Human cardiovascular disease (HumanCVD) BeadChip containing up to ~53 000 single-nucleotide polymorphisms (SNPs). Effect sizes with P-values for development of GBD were generated. We identify two new loci associated with GBD, GCKR rs1260326:T>C (P=5.88 × 10⁻⁷, β=−0.146) and TTC39B rs686030:C>A (P=6.95 × 10⁻⁷, β=0.271) and detect four independent SNP effects in ABCG8 rs4953023:G>A (P=7.41 × 10⁻⁴⁷, β=0.734), ABCG8 rs4299376:G>T (P=2.40 × 10⁻¹⁸, β=0.278), ABCG5 rs6544718:T>C (P=2.08 × 10⁻¹⁴, β=0.044) and ABCG5 rs6720173:G>C (P=3.81 × 10⁻¹², β=0.262) in conditional analyses taking genotypes of rs4953023:G>A as a covariate. We also delineate the risk effects among many genotypes known to influence lipids. These data, from the largest GBD genetic study to date, show that specific, mainly hepatocyte-centred, components of lipid metabolism are important to GBD risk in women. We discuss the potential pharmaceutical implications of our findings.