1
Kontou PI, Bagos PG. The goldmine of GWAS summary statistics: a systematic review of methods and tools. BioData Min 2024; 17:31. [PMID: 39238044 PMCID: PMC11375927 DOI: 10.1186/s13040-024-00385-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 08/27/2024] [Indexed: 09/07/2024] Open
Abstract
Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Affiliation(s)
- Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, 35131, Lamia, Greece.
2
Ray D, Loomis SJ, Venkataraghavan S, Zhang J, Tin A, Yu B, Chatterjee N, Selvin E, Duggal P. Characterizing Common and Rare Variations in Nontraditional Glycemic Biomarkers Using Multivariate Approaches on Multiancestry ARIC Study. Diabetes 2024; 73:1537-1550. [PMID: 38869630 PMCID: PMC11333373 DOI: 10.2337/db23-0318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Accepted: 06/05/2024] [Indexed: 06/14/2024]
Abstract
Genetic studies of nontraditional glycemic biomarkers, glycated albumin and fructosamine, can shed light on unknown aspects of type 2 diabetes genetics and biology. We performed a multiphenotype genome-wide association study of glycated albumin and fructosamine from 7,395 White and 2,016 Black participants in the Atherosclerosis Risk in Communities (ARIC) study on common variants from genotyped/imputed data. We discovered two genome-wide significant loci, one mapping to a known type 2 diabetes gene (ARAP1/STARD10) and another mapping to a novel region (UGT1A complex of genes), using multiomics gene-mapping strategies in diabetes-relevant tissues. We identified additional loci that were ancestry- and sex-specific (e.g., PRKCA in African ancestry, FCGRT in European ancestry, TEX29 in males). Further, we implemented multiphenotype gene-burden tests on whole-exome sequence data from 6,590 White and 2,309 Black ARIC participants. Ten variant sets annotated to genes across different variant aggregation strategies were exome-wide significant only in multiancestry analysis, of which CD1D, EGFL7/AGPAT2, and MIR126 had notable enrichment of rare predicted loss of function variants in African ancestry despite smaller sample sizes. Overall, 8 of 14 discovered loci and genes were implicated to influence these biomarkers via glycemic pathways, and most of them were not previously implicated in studies of type 2 diabetes. This study illustrates improved locus discovery and potential effector gene discovery by leveraging joint patterns of related biomarkers across the entire allele frequency spectrum in multiancestry analysis. Future investigation of the loci and genes potentially acting through glycemic pathways may help us better understand the risk of developing type 2 diabetes.
Affiliation(s)
- Debashree Ray
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Sowmya Venkataraghavan
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Jiachen Zhang
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Adrienne Tin
- School of Medicine, University of Mississippi Medical Center, Jackson, MS
- Bing Yu
- Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX
- Nilanjan Chatterjee
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD
- Elizabeth Selvin
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Welch Center for Prevention, Epidemiology, & Clinical Research, Johns Hopkins University, Baltimore, MD
- Priya Duggal
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
3
Morgante F, Carbonetto P, Wang G, Zou Y, Sarkar A, Stephens M. A flexible empirical Bayes approach to multivariate multiple regression, and its improved accuracy in predicting multi-tissue gene expression from genotypes. PLoS Genet 2023; 19:e1010539. [PMID: 37418505 PMCID: PMC10355440 DOI: 10.1371/journal.pgen.1010539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 06/02/2023] [Indexed: 07/09/2023] Open
Abstract
Predicting phenotypes from genotypes is a fundamental task in quantitative genetics. With technological advances, it is now possible to measure multiple phenotypes in large samples. Multiple phenotypes can share their genetic component; therefore, modeling these phenotypes jointly may improve prediction accuracy by leveraging effects that are shared across phenotypes. However, effects can be shared across phenotypes in a variety of ways, so computationally efficient statistical methods are needed that can accurately and flexibly capture patterns of effect sharing. Here, we describe new Bayesian multivariate, multiple regression methods that, by using flexible priors, are able to model and adapt to different patterns of effect sharing and specificity across phenotypes. Simulation results show that these new methods are fast and improve prediction accuracy compared with existing methods in a wide range of settings where effects are shared. Further, in settings where effects are not shared, our methods still perform competitively with state-of-the-art methods. In real data analyses of expression data in the Genotype Tissue Expression (GTEx) project, our methods improve prediction performance on average for all tissues, with the greatest gains in tissues where effects are strongly shared, and in the tissues with smaller sample sizes. While we use gene expression prediction to illustrate our methods, the methods are generally applicable to any multi-phenotype applications, including prediction of polygenic scores and breeding values. Thus, our methods have the potential to provide improvements across fields and organisms.
Affiliation(s)
- Fabio Morgante
- Center for Human Genetics, Clemson University, Greenwood, South Carolina, United States of America
- Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina, United States of America
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
- Peter Carbonetto
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Research Computing Center, University of Chicago, Chicago, Illinois, United States of America
- Gao Wang
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Neurology, Columbia University, New York, New York, United States of America
- Gertrude H. Sergievsky Center, Columbia University, New York, New York, United States of America
- Yuxin Zou
- Department of Statistics, University of Chicago, Chicago, Illinois, United States of America
- Regeneron Genetics Center, Regeneron Pharmaceuticals Inc., Tarrytown, New York, United States of America
- Abhishek Sarkar
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Statistics, University of Chicago, Chicago, Illinois, United States of America
4
Ray D, Loomis SJ, Venkataraghavan S, Tin A, Yu B, Chatterjee N, Selvin E, Duggal P. Characterizing common and rare variations in non-traditional glycemic biomarkers using multivariate approaches on multi-ancestry ARIC study. medRxiv 2023:2023.06.13.23289200. [PMID: 37398180 PMCID: PMC10312851 DOI: 10.1101/2023.06.13.23289200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Glycated hemoglobin, fasting glucose, glycated albumin, and fructosamine are biomarkers that reflect different aspects of the glycemic process. Genetic studies of these glycemic biomarkers can shed light on unknown aspects of type 2 diabetes genetics and biology. While there exist several GWAS of glycated hemoglobin and fasting glucose, very few GWAS have focused on glycated albumin or fructosamine. We performed a multi-phenotype GWAS of glycated albumin and fructosamine from 7,395 White and 2,016 Black participants in the Atherosclerosis Risk in Communities (ARIC) study on the common variants from genotyped/imputed data. We found 2 genome-wide significant loci, one mapping to a known type 2 diabetes gene (ARAP1/STARD10, p = 2.8 × 10⁻⁸) and another mapping to a novel gene (UGT1A, p = 1.4 × 10⁻⁸), using multi-omics gene mapping strategies in diabetes-relevant tissues. We identified additional loci that were ancestry-specific (e.g., PRKCA from African ancestry individuals, p = 1.7 × 10⁻⁸) and sex-specific (TEX29 locus in males only, p = 3.0 × 10⁻⁸). Further, we implemented multi-phenotype gene-burden tests on whole-exome sequence data from 6,590 White and 2,309 Black ARIC participants. Eleven genes across different rare variant aggregation strategies were exome-wide significant only in multi-ancestry analysis. Four out of 11 genes had notable enrichment of rare predicted loss of function variants in African ancestry participants despite smaller sample size. Overall, 8 out of 15 loci/genes were implicated to influence these biomarkers via glycemic pathways. This study illustrates improved locus discovery and potential effector gene discovery by leveraging joint patterns of related biomarkers across the entire allele frequency spectrum in multi-ancestry analyses. Most of the loci/genes we identified have not been previously implicated in studies of type 2 diabetes, and future investigation of the loci/genes potentially acting through glycemic pathways may help us better understand the risk of developing type 2 diabetes.
Affiliation(s)
- Debashree Ray
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Sowmya Venkataraghavan
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Adrienne Tin
- School of Medicine, University of Mississippi Medical Center, Jackson, MS
- Bing Yu
- Department of Epidemiology, UTHealth School of Public Health, Houston, TX
- Nilanjan Chatterjee
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD
- Elizabeth Selvin
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
- Welch Center for Prevention, Epidemiology, & Clinical Research, Johns Hopkins University, Baltimore, MD
- Priya Duggal
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
5
Chu BB, Ko S, Zhou JJ, Jensen A, Zhou H, Sinsheimer JS, Lange K. Multivariate genome-wide association analysis by iterative hard thresholding. Bioinformatics 2023; 39:btad193. [PMID: 37067496 PMCID: PMC10133532 DOI: 10.1093/bioinformatics/btad193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 04/07/2023] [Accepted: 04/13/2023] [Indexed: 04/18/2023] Open
Abstract
MOTIVATION: In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association study operate marker-by-marker and are computationally intensive. RESULTS: We present a sparsity constrained regression algorithm for multivariate genome-wide association study based on iterative hard thresholding and implement it in a convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits similar true positive rates, smaller false positive rates, and faster execution times than GEMMA's linear mixed models and mv-PLINK's canonical correlation analysis. On UK Biobank data with 470 228 variants, MendelIHT completed a three-trait joint analysis (n=185 656) in 20 h and an 18-trait joint analysis (n=104 264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effect of all SNPs and dozens of traits. AVAILABILITY AND IMPLEMENTATION: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.
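As a rough illustration of the idea behind iterative hard thresholding, the sketch below fits a sparsity-constrained least-squares model for a single trait with NumPy. It is not the MendelIHT.jl implementation or its API: the function names `hard_threshold` and `iht`, the fixed step size, the iteration count, and the simulated data are all assumptions made for this example, and the multivariate (multi-trait) case described in the abstract is not covered.

```python
import numpy as np

def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and set the rest to zero."""
    out = np.zeros_like(beta)
    if k > 0:
        keep = np.argsort(np.abs(beta))[-k:]
        out[keep] = beta[keep]
    return out

def iht(X, y, k, n_iter=200):
    """Projected gradient descent for least squares with at most k nonzero effects.

    Hypothetical helper for illustration; uses a fixed step of 1 / ||X||_2^2.
    """
    p = X.shape[1]
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # inverse of the largest eigenvalue of X^T X
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)          # gradient of 0.5 * ||y - X beta||^2
        beta = hard_threshold(beta - step * grad, k)
    return beta

# Simulated data (assumed for the example): 500 samples, 50 predictors, 3 true effects.
rng = np.random.default_rng(0)
n, p, k = 500, 50, 3
X = rng.standard_normal((n, p))
true_beta = np.zeros(p)
true_beta[[4, 17, 31]] = [2.0, -3.0, 2.0]
y = X @ true_beta + 0.1 * rng.standard_normal(n)

est = iht(X, y, k)
support = sorted(np.flatnonzero(est).tolist())
print("selected predictors:", support)
```

The projection step is what distinguishes this from ordinary gradient descent: after every update only the `k` largest coefficients survive, so all predictors are considered jointly in one model rather than marker by marker.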
Affiliation(s)
- Benjamin B Chu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Seyoon Ko
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Jin J Zhou
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Aubrey Jensen
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Hua Zhou
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Janet S Sinsheimer
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Kenneth Lange
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Statistics at UCLA, Los Angeles, CA 90095-1554, United States
6
Hill C, Duffy S, Coulter T, Maxwell AP, McKnight AJ. Harnessing Genomic Analysis to Explore the Role of Telomeres in the Pathogenesis and Progression of Diabetic Kidney Disease. Genes (Basel) 2023; 14:609. [PMID: 36980881 PMCID: PMC10048490 DOI: 10.3390/genes14030609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 02/20/2023] [Accepted: 02/21/2023] [Indexed: 03/06/2023] Open
Abstract
The prevalence of diabetes is increasing globally, and this trend is predicted to continue for future decades. Research is needed to uncover new ways to manage diabetes and its co-morbidities. A significant secondary complication of diabetes is kidney disease, which can ultimately result in the need for renal replacement therapy, via dialysis or transplantation. Diabetic kidney disease presents a substantial burden to patients, their families and global healthcare services. This review highlights studies that have harnessed genomic, epigenomic and functional prediction tools to uncover novel genes and pathways associated with DKD that are useful for the identification of therapeutic targets or novel biomarkers for risk stratification. Telomere length regulation is a specific pathway gaining attention recently because of its association with DKD. Researchers are employing both observational and genetics-based studies to identify telomere-related genes associated with kidney function decline in diabetes. Studies have also uncovered novel functions for telomere-related genes beyond the immediate regulation of telomere length, such as transcriptional regulation and inflammation. This review summarises studies that have revealed the potential to harness therapeutics that modulate telomere length, or the associated epigenetic modifications, for the treatment of DKD, to potentially slow renal function decline and reduce the global burden of this disease.
Affiliation(s)
- Claire Hill
- Centre for Public Health, Queen’s University of Belfast, Belfast BT12 6BA, UK
- Seamus Duffy
- Centre for Public Health, Queen’s University of Belfast, Belfast BT12 6BA, UK
- Tiernan Coulter
- Centre for Public Health, Queen’s University of Belfast, Belfast BT12 6BA, UK
- Alexander Peter Maxwell
- Centre for Public Health, Queen’s University of Belfast, Belfast BT12 6BA, UK
- Regional Nephrology Unit, Belfast City Hospital, Belfast BT9 7AB, UK
- Amy Jayne McKnight
- Centre for Public Health, Queen’s University of Belfast, Belfast BT12 6BA, UK
7
Smith SP, Shahamatdar S, Cheng W, Zhang S, Paik J, Graff M, Haiman C, Matise TC, North KE, Peters U, Kenny E, Gignoux C, Wojcik G, Crawford L, Ramachandran S. Enrichment analyses identify shared associations for 25 quantitative traits in over 600,000 individuals from seven diverse ancestries. Am J Hum Genet 2022; 109:871-884. [PMID: 35349783 PMCID: PMC9118115 DOI: 10.1016/j.ajhg.2022.03.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/01/2021] [Accepted: 03/02/2022] [Indexed: 12/12/2022] Open
Abstract
Since 2005, genome-wide association (GWA) datasets have been largely biased toward sampling European ancestry individuals, and recent studies have shown that GWA results estimated from self-identified European individuals are not transferable to non-European individuals because of various confounding challenges. Here, we demonstrate that enrichment analyses that aggregate SNP-level association statistics at multiple genomic scales (from genes to genomic regions and pathways) have been underutilized in the GWA era and can generate biologically interpretable hypotheses regarding the genetic basis of complex trait architecture. We illustrate examples of the robust associations generated by enrichment analyses while studying 25 continuous traits assayed in 566,786 individuals from seven diverse self-identified human ancestries in the UK Biobank and the Biobank Japan as well as 44,348 admixed individuals from the PAGE consortium including cohorts of African American, Hispanic and Latin American, Native Hawaiian, and American Indian/Alaska Native individuals. We identify 1,000 gene-level associations that are genome-wide significant in at least two ancestry cohorts across these 25 traits as well as highly conserved pathway associations with triglyceride levels in European, East Asian, and Native Hawaiian cohorts.
Affiliation(s)
- Samuel Pattillo Smith
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA
- Sahar Shahamatdar
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA
- Wei Cheng
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA
- Selena Zhang
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Joseph Paik
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA
- Misa Graff
- Department of Epidemiology, University of North Carolina, Chapel Hill, Chapel Hill, NC 27599, USA
- Christopher Haiman
- Department of Preventative Medicine, University of Southern California, Los Angeles, CA 90089, USA
- T C Matise
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
- Kari E North
- Department of Epidemiology, University of North Carolina, Chapel Hill, Chapel Hill, NC 27599, USA
- Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Eimear Kenny
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA; The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA; Department of Medicine, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York City, NY 10029, USA
- Chris Gignoux
- Division of Biomedical Informatics and Personalized Medicine, University of Colorado, Denver, CO 80204, USA
- Genevieve Wojcik
- Department of Epidemiology, Johns Hopkins University, Baltimore, MD 21287, USA
- Lorin Crawford
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Biostatistics, Brown University, Providence, RI 02906, USA; Microsoft Research New England, Cambridge, MA 02142, USA
- Sohini Ramachandran
- Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA; Department of Ecology, Evolution, and Organismal Biology, Brown University, Providence, RI 02912, USA; Data Science Initiative, Brown University, Providence, RI 02912, USA.
8
Traylor M, Persyn E, Tomppo L, Klasson S, Abedi V, Bakker MK, Torres N, Li L, Bell S, Rutten-Jacobs L, Tozer DJ, Griessenauer CJ, Zhang Y, Pedersen A, Sharma P, Jimenez-Conde J, Rundek T, Grewal RP, Lindgren A, Meschia JF, Salomaa V, Havulinna A, Kourkoulis C, Crawford K, Marini S, Mitchell BD, Kittner SJ, Rosand J, Dichgans M, Jern C, Strbian D, Fernandez-Cadenas I, Zand R, Ruigrok Y, Rost N, Lemmens R, Rothwell PM, Anderson CD, Wardlaw J, Lewis CM, Markus HS. Genetic basis of lacunar stroke: a pooled analysis of individual patient data and genome-wide association studies. Lancet Neurol 2021; 20:351-361. [PMID: 33773637 PMCID: PMC8062914 DOI: 10.1016/s1474-4422(21)00031-4] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 06/24/2020] [Revised: 11/06/2020] [Accepted: 01/15/2021] [Indexed: 01/07/2023]
Abstract
BACKGROUND: The genetic basis of lacunar stroke is poorly understood, with a single locus on 16q24 identified to date. We sought to identify novel associations and provide mechanistic insights into the disease. METHODS: We did a pooled analysis of data from newly recruited patients with an MRI-confirmed diagnosis of lacunar stroke and existing genome-wide association studies (GWAS). Patients were recruited from hospitals in the UK as part of the UK DNA Lacunar Stroke studies 1 and 2 and from collaborators within the International Stroke Genetics Consortium. Cases and controls were stratified by ancestry and two meta-analyses were done: a European ancestry analysis, and a transethnic analysis that included all ancestry groups. We also did a multi-trait analysis of GWAS, in a joint analysis with a study of cerebral white matter hyperintensities (an aetiologically related radiological trait), to find additional genetic associations. We did a transcriptome-wide association study (TWAS) to detect genes for which expression is associated with lacunar stroke; identified significantly enriched pathways using multi-marker analysis of genomic annotation; and evaluated cardiovascular risk factors causally associated with the disease using mendelian randomisation. FINDINGS: Our meta-analysis comprised studies from Europe, the USA, and Australia, including 7338 cases and 254 798 controls, of which 2987 cases (matched with 29 540 controls) were confirmed using MRI. Five loci (ICA1L-WDR12-CARF-NBEAL1, ULK4, SPI1-SLC39A13-PSMC3-RAPSN, ZCCHC14, ZBTB14-EPB41L3) were found to be associated with lacunar stroke in the European or transethnic meta-analyses. A further seven loci (SLC25A44-PMF1-BGLAP, LOX-ZNF474-LOC100505841, FOXF2-FOXQ1, VTA1-GPR126, SH3PXD2A, HTRA1-ARMS2, COL4A2) were found to be associated in the multi-trait analysis with cerebral white matter hyperintensities (n=42 310). Two of the identified loci contain genes (COL4A2 and HTRA1) that are involved in monogenic lacunar stroke. The TWAS identified associations between the expression of six genes (SLC25A44, ULK4, CARF, FAM117B, ICA1L, NBEAL1) and lacunar stroke. Pathway analyses implicated disruption of the extracellular matrix, phosphatidylinositol 5 phosphate binding, and roundabout binding (false discovery rate <0·05). Mendelian randomisation analyses identified positive associations of elevated blood pressure, history of smoking, and type 2 diabetes with lacunar stroke. INTERPRETATION: Lacunar stroke has a substantial heritable component, with 12 loci now identified that could represent future treatment targets. These loci provide insights into lacunar stroke pathogenesis, highlighting disruption of the vascular extracellular matrix (COL4A2, LOX, SH3PXD2A, GPR126, HTRA1), pericyte differentiation (FOXF2, GPR126), TGF-β signalling (HTRA1), and myelination (ULK4, GPR126) in disease risk. FUNDING: British Heart Foundation.
Affiliation(s)
- Matthew Traylor
- Clinical Pharmacology and The Barts Heart Centre and NIHR Barts Biomedical Research Centre, Barts Health NHS Trust, William Harvey Research Institute, Queen Mary University of London, London, UK
- Elodie Persyn
- Department of Medical and Molecular Genetics, King's College London, London, UK
- Liisa Tomppo
- Department of Neurology, Helsinki University Hospital, Helsinki, Finland
- Sofia Klasson
- Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden
- Vida Abedi
- Department of Molecular and Functional Genomics, Weis Center for Research, Geisinger Health System, Danville, PA, USA
- Mark K Bakker
- Department of Neurology and Neurosurgery, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, Netherlands
- Nuria Torres
- Stroke Pharmacogenomics and Genetics, Sant Pau Institute of Research, Hospital de la Santa Creu I Sant Pau, Barcelona, Spain
- Linxin Li
- Centre for the Prevention of Stroke and Dementia, Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, UK
- Steven Bell
- Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Loes Rutten-Jacobs
- Product Development Personalized Health Care, F Hoffmann-La Roche, Basel, Switzerland
- Daniel J Tozer
- Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Christoph J Griessenauer
- Neuroscience Institute, Geisinger Health System, Danville, PA, USA; Institute of Neurointervention, Paracelsus Medical University, Salzburg, Austria
- Yanfei Zhang
- Genomic Medicine Institute, Geisinger Health System, Danville, PA, USA
- Annie Pedersen
- Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden
- Pankaj Sharma
- Institute of Cardiovascular Research, Royal Holloway University of London, London, UK
- Jordi Jimenez-Conde
- Neurovascular Research Group, Department of Neurology of Hospital del Mar-IMIM (Institut Hospital del Mar d'Investigacions Mediques), Universitat Autonoma de Barcelona/DCEXS-Universitat Pompeu Fabra, Barcelona, Spain
- Tatjana Rundek
- Evelyn F McKnight Brain Institute, Department of Neurology, University of Miami Miller School of Medicine, Miami, FL, USA
- Raji P Grewal
- Neuroscience Institute, Saint Francis Medical Center, School of Health and Medical Sciences, Seton Hall University, South Orange, NJ, USA
- Arne Lindgren
- Department of Neurology, Skane University Hospital, Lund, Sweden; Department of Clinical Sciences Lund, Neurology, Lund University, Lund, Sweden
- Veikko Salomaa
- Department of Public Health Solutions, Finnish Institute for Health and Welfare, Helsinki, Finland
- Aki Havulinna
- Department of Public Health Solutions, Finnish Institute for Health and Welfare, Helsinki, Finland; Institute for Molecular Medicine Finland (FIMM HiLIFE), Helsinki, Finland
- Christina Kourkoulis
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA; Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Katherine Crawford
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Sandro Marini
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Braxton D Mitchell
- Division of Endocrinology, Diabetes and Nutrition, Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA; Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
| | - Steven J Kittner
- Department of Neurology, University of Maryland School of Medicine, Baltimore, MD, USA; Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD, USA
| | - Jonathan Rosand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Martin Dichgans
- Institute for Stroke and Dementia Research (ISD), LMU Munich, Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
| | - Christina Jern
- Department of Laboratory Medicine, Institute of Biomedicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden
| | - Daniel Strbian
- Department of Neurology, Helsinki University Hospital, Helsinki, Finland; Clinical Neurosciences, University of Helsinki, Helsinki, Finland
| | - Israel Fernandez-Cadenas
- Stroke Pharmacogenomics and Genetics, Sant Pau Institute of Research, Hospital de la Santa Creu I Sant Pau, Barcelona, Spain; Neurovascular Research Laboratory and Neurovascular Unit, Institut de Recerca, Hospital Vall d'Hebron, Universitat Autonoma de Barcelona, Barcelona, Spain
| | - Ramin Zand
- Neuroscience Institute, Geisinger Health System, Danville, PA, USA
| | - Ynte Ruigrok
- Department of Neurology and Neurosurgery, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, Netherlands
| | - Natalia Rost
- J Philip Kistler Stroke Research Center, Massachusetts General Hospital, Boston, MA, USA
| | - Robin Lemmens
- Experimental Neurology, Department of Neurosciences, KU Leuven, Leuven, Belgium; VIB Center for Brain & Disease Research, Department of Neurology, University Hospitals Leuven, Leuven, Belgium
| | - Peter M Rothwell
- Centre for the Prevention of Stroke and Dementia, Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, UK
| | - Christopher D Anderson
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Henry and Allison McCance Center for Brain Health, Massachusetts General Hospital, Boston, MA, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA, USA; Program in Medical & Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Joanna Wardlaw
- Centre for Clinical Brain Sciences, UK Dementia Research Institute and Row Fogo Centre for Research into the Ageing Brain, University of Edinburgh, Edinburgh, UK
| | - Cathryn M Lewis
- Department of Medical and Molecular Genetics, King's College London, London, UK; Social, Genetic, and Developmental Psychiatry Centre, King's College London, London, UK
| | - Hugh S Markus
- Clinical Neurosciences, University of Cambridge, Cambridge, UK.
|
9
|
Hutchinson A, Asimit J, Wallace C. Fine-mapping genetic associations. Hum Mol Genet 2020; 29:R81-R88. [PMID: 32744321 PMCID: PMC7733401 DOI: 10.1093/hmg/ddaa148] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 06/04/2020] [Revised: 06/04/2020] [Accepted: 07/09/2020] [Indexed: 02/07/2023]
Abstract
Whilst thousands of genetic variants have been associated with human traits, identifying the subset of those variants that are causal requires a further 'fine-mapping' step. We review the basic fine-mapping approach, which is computationally fast and requires only summary data, but depends on an assumption of a single causal variant per associated region which is recognized as biologically unrealistic. We discuss different ways that the approach has been built upon to accommodate multiple causal variants in a region and to incorporate additional layers of functional annotation data. We further review methods for simultaneous fine-mapping of multiple datasets, either exploiting different linkage disequilibrium (LD) structures across ancestries or borrowing information between distinct but related traits. Finally, we look to the future and the opportunities that will be offered by increasingly accurate maps of causal variants for a multitude of human traits.
Affiliation(s)
- Anna Hutchinson
- MRC Biostatistics Unit, Cambridge Biomedical Campus, Cambridge Institute of Public Health, Cambridge CB2 0SR, UK
| | - Jennifer Asimit
- MRC Biostatistics Unit, Cambridge Biomedical Campus, Cambridge Institute of Public Health, Cambridge CB2 0SR, UK
| | - Chris Wallace
- MRC Biostatistics Unit, Cambridge Biomedical Campus, Cambridge Institute of Public Health, Cambridge CB2 0SR, UK
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, CB2 0AW, UK
- Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, CB2 2QQ, UK
|
10
|
DeBoever C, Tanigawa Y, Aguirre M, McInnes G, Lavertu A, Rivas MA. Assessing Digital Phenotyping to Enhance Genetic Studies of Human Diseases. Am J Hum Genet 2020; 106:611-622. [PMID: 32275883 PMCID: PMC7212271 DOI: 10.1016/j.ajhg.2020.03.007] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/30/2019] [Accepted: 03/11/2020] [Indexed: 12/17/2022]
Abstract
Population-scale biobanks that combine genetic data and high-dimensional phenotyping for a large number of participants provide an exciting opportunity to perform genome-wide association studies (GWAS) to identify genetic variants associated with diverse quantitative traits and diseases. A major challenge for GWAS in population biobanks is ascertaining disease cases from heterogeneous data sources such as hospital records, digital questionnaire responses, or interviews. In this study, we use genetic parameters, including genetic correlation, to evaluate whether GWAS performed using cases in the UK Biobank ascertained from hospital records, questionnaire responses, and family history of disease implicate similar disease genetics across a range of effect sizes. We find that hospital record and questionnaire GWAS largely identify similar genetic effects for many complex phenotypes and that combining together both phenotyping methods improves power to detect genetic associations. We also show that family history GWAS using cases ascertained on family history of disease agrees with combined hospital record and questionnaire GWAS and that family history GWAS has better power to detect genetic associations for some phenotypes. Overall, this work demonstrates that digital phenotyping and unstructured phenotype data can be combined with structured data such as hospital records to identify cases for GWAS in biobanks and improve the ability of such studies to identify genetic associations.
Affiliation(s)
| | - Yosuke Tanigawa
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Matthew Aguirre
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Greg McInnes
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Adam Lavertu
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Manuel A Rivas
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
|