51
Lu ZH, Zhu H, Knickmeyer RC, Sullivan PF, Williams SN, Zou F. Multiple SNP Set Analysis for Genome-Wide Association Studies Through Bayesian Latent Variable Selection. Genet Epidemiol 2015; 39:664-77. [PMID: 26515609 DOI: 10.1002/gepi.21932] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 03/25/2015] [Revised: 07/23/2015] [Accepted: 08/18/2015] [Indexed: 11/07/2022]
Abstract
The power of genome-wide association studies (GWAS) for mapping complex traits with single-SNP analysis (where SNP is single-nucleotide polymorphism) may be undermined by modest SNP effect sizes, unobserved causal SNPs, correlation among adjacent SNPs, and SNP-SNP interactions. Alternative approaches for testing the association between a single SNP set and individual phenotypes have been shown to be promising for improving the power of GWAS. We propose a Bayesian latent variable selection (BLVS) method to simultaneously model the joint association mapping between a large number of SNP sets and complex traits. Compared with single SNP set analysis, such joint association mapping not only accounts for the correlation among SNP sets but also is capable of detecting causal SNP sets that are marginally uncorrelated with traits. The spike-and-slab prior assigned to the effects of SNP sets can greatly reduce the dimension of effective SNP sets, while speeding up computation. An efficient Markov chain Monte Carlo algorithm is developed. Simulations demonstrate that BLVS outperforms several competing variable selection methods in some important scenarios.
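For orientation, a generic spike-and-slab prior on a set-level effect can be written as below. This is a textbook form given only as a sketch: the symbols $\beta_k$ (effect of SNP set $k$), $\gamma_k$ (inclusion indicator), $\pi$ (prior inclusion probability) and $\tau^2$ (slab variance) are assumptions introduced here, and the abstract does not state the exact specification used in BLVS.

```latex
\beta_k \mid \gamma_k \;\sim\; (1-\gamma_k)\,\delta_0 \;+\; \gamma_k\,\mathcal{N}(0,\tau^2),
\qquad
\gamma_k \;\sim\; \mathrm{Bernoulli}(\pi), \qquad k = 1,\dots,K .
```

Under such a prior, sets with $\gamma_k = 0$ have their effect fixed at zero (the point mass $\delta_0$), which is how the number of effective SNP sets is reduced.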
Affiliation(s)
- Zhao-Hua Lu
- Department of Biostatistics, University of North Carolina at Chapel Hill, North Carolina, United States of America
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, North Carolina, United States of America; Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, North Carolina, United States of America
| | - Rebecca C Knickmeyer
- Department of Psychiatry, University of North Carolina at Chapel Hill, North Carolina, United States of America
| | - Patrick F Sullivan
- Department of Genetics, University of North Carolina at Chapel Hill, North Carolina, United States of America
| | - Stephanie N Williams
- Department of Genetics, University of North Carolina at Chapel Hill, North Carolina, United States of America
| | - Fei Zou
- Department of Biostatistics, University of North Carolina at Chapel Hill, North Carolina, United States of America
52
Rosenberger A, Friedrichs S, Amos CI, Brennan P, Fehringer G, Heinrich J, Hung RJ, Muley T, Müller-Nurasyid M, Risch A, Bickeböller H. META-GSA: Combining Findings from Gene-Set Analyses across Several Genome-Wide Association Studies. PLoS One 2015; 10:e0140179. [PMID: 26501144 PMCID: PMC4621033 DOI: 10.1371/journal.pone.0140179] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/28/2015] [Accepted: 09/21/2015] [Indexed: 01/31/2023] Open
Abstract
INTRODUCTION Gene-set analysis (GSA) methods are used as complementary approaches to genome-wide association studies (GWASs). The single marker association estimates of a predefined set of genes are contrasted either with those of all remaining genes or with a null non-associated background. To pool the p-values from several GSAs, it is important to take into account the concordance of the observed patterns resulting from single marker association point estimates across any given gene set. Here we propose META-GSA, an enhanced version of Fisher's inverse χ² method that weights each study to account for imperfect correlation between association patterns. SIMULATION AND POWER We investigated the performance of META-GSA by simulating GWASs with 500 cases and 500 controls at 100 diallelic markers in 20 different scenarios, simulating different relative risks between 1 and 1.5 in gene sets of 10 genes. Wilcoxon's rank sum test was applied as GSA for each study. We found that META-GSA has greater power to discover truly associated gene sets than simple pooling of the p-values, e.g. 59% versus 37% when the true relative risk for 5 of 10 genes was assumed to be 1.5. Under the null hypothesis of no difference in the true association pattern between the gene set of interest and the set of remaining genes, the results of both approaches are almost uncorrelated. We recommend not relying on p-values alone when combining the results of independent GSAs. APPLICATION We applied META-GSA to pool the results of four case-control GWASs of lung cancer risk (Central European Study and Toronto/Lunenfeld-Tanenbaum Research Institute Study; German Lung Cancer Study and MD Anderson Cancer Center Study), which had already been analyzed separately with four different GSA methods (EASE, SLAT, mSUMSTAT and GenGen). This application revealed the pathway GO0015291 "transmembrane transporter activity" as significantly enriched with associated genes (GSA-method: EASE, p = 0.0315 corrected for multiple testing). Similar results were found for GO0015464 "acetylcholine receptor activity" but only when not corrected for multiple testing (all GSA-methods applied; p ≈ 0.02).
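As a point of reference for the "simple pooling of the p-values" that META-GSA is compared against, the classical unweighted Fisher inverse χ² combination can be sketched as follows. This is not the META-GSA weighting scheme, whose details the abstract does not give; the function name `fisher_combine` is hypothetical, and the sketch assumes independent studies.

```python
import math

def fisher_combine(p_values):
    """Combine independent p-values with Fisher's (unweighted) inverse chi-square method."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); under the null hypothesis it
    # follows a chi-square distribution with 2k degrees of freedom.
    half_x = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even degrees of freedom 2k, the chi-square survival function has the
    # closed form exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_x / j
        total += term
    return math.exp(-half_x) * total

# Hypothetical p-values from four studies of the same gene set.
pooled = fisher_combine([0.04, 0.20, 0.11, 0.50])
```

With a single study the pooled value equals the input p-value, which is a quick sanity check on the implementation.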
Affiliation(s)
- Albert Rosenberger
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Göttingen, Germany
| | - Stefanie Friedrichs
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Göttingen, Germany
| | - Christopher I. Amos
- Geisel School of Medicine, Dartmouth College, Lebanon, NH, United States of America
| | - Paul Brennan
- International Agency for Research on Cancer (IARC), Lyon, France
| | - Gordon Fehringer
- Prosserman Centre for Health Research, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Joachim Heinrich
- Institute of Epidemiology I, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
| | - Rayjean J. Hung
- Prosserman Centre for Health Research, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Thomas Muley
- Translational Lung Research Center Heidelberg (TLRC-H), Member of the German Center for Lung Research (DZL), Heidelberg, Germany
- Thoraxklinik at University of Heidelberg, Heidelberg, Germany
| | - Martina Müller-Nurasyid
- Department of Medicine I, Ludwig-Maximilians-University Munich, Munich, Germany
- Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-University, Munich, Germany
- Institute of Genetic Epidemiology, Helmholtz Zentrum München—German Research Center for Environmental Health, Neuherberg, Germany
- DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
| | - Angela Risch
- Translational Lung Research Center Heidelberg (TLRC-H), Member of the German Center for Lung Research (DZL), Heidelberg, Germany
- Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center, Heidelberg, Germany
- Division of Molecular Biology, University Salzburg, Salzburg, Austria
| | - Heike Bickeböller
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Göttingen, Germany
53
Mooney MA, Wilmot B. Gene set analysis: A step-by-step guide. Am J Med Genet B Neuropsychiatr Genet 2015; 168:517-27. [PMID: 26059482 PMCID: PMC4638147 DOI: 10.1002/ajmg.b.32328] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 03/16/2015] [Accepted: 05/20/2015] [Indexed: 12/21/2022]
Abstract
To maximize the potential of genome-wide association studies, many researchers are performing secondary analyses to identify sets of genes jointly associated with the trait of interest. Although methods for gene-set analyses (GSA), also called pathway analyses, have been around for more than a decade, the field is still evolving. There are numerous algorithms available for testing the cumulative effect of multiple SNPs, yet no real consensus in the field about the best way to perform a GSA. This paper provides an overview of the factors that can affect the results of a GSA, the lessons learned from past studies, and suggestions for how to make analysis choices that are most appropriate for different types of data. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Michael A. Mooney
- Department of Medical Informatics & Clinical Epidemiology, Division of Bioinformatics & Computational Biology, Oregon Health & Science University, Portland, Oregon; OHSU Knight Cancer Institute, Portland, Oregon
| | - Beth Wilmot
- Department of Medical Informatics & Clinical Epidemiology, Division of Bioinformatics & Computational Biology, Oregon Health & Science University, Portland, Oregon; OHSU Knight Cancer Institute, Portland, Oregon; Oregon Clinical and Translational Research Institute, Portland, Oregon. Correspondence to: Beth Wilmot, Department of Medical Informatics & Clinical Epidemiology, Division of Bioinformatics & Computational Biology, Oregon Health & Science University, Portland, OR 97239.
54
Kar SP, Tyrer JP, Li Q, Lawrenson K, Aben KKH, Anton-Culver H, Antonenkova N, Chenevix-Trench G, Baker H, Bandera EV, Bean YT, Beckmann MW, Berchuck A, Bisogna M, Bjørge L, Bogdanova N, Brinton L, Brooks-Wilson A, Butzow R, Campbell I, Carty K, Chang-Claude J, Chen YA, Chen Z, Cook LS, Cramer D, Cunningham JM, Cybulski C, Dansonka-Mieszkowska A, Dennis J, Dicks E, Doherty JA, Dörk T, du Bois A, Dürst M, Eccles D, Easton DF, Edwards RP, Ekici AB, Fasching PA, Fridley BL, Gao YT, Gentry-Maharaj A, Giles GG, Glasspool R, Goode EL, Goodman MT, Grownwald J, Harrington P, Harter P, Hein A, Heitz F, Hildebrandt MAT, Hillemanns P, Hogdall E, Hogdall CK, Hosono S, Iversen ES, Jakubowska A, Paul J, Jensen A, Ji BT, Karlan BY, Kjaer SK, Kelemen LE, Kellar M, Kelley J, Kiemeney LA, Krakstad C, Kupryjanczyk J, Lambrechts D, Lambrechts S, Le ND, Lee AW, Lele S, Leminen A, Lester J, Levine DA, Liang D, Lissowska J, Lu K, Lubinski J, Lundvall L, Massuger L, Matsuo K, McGuire V, McLaughlin JR, McNeish IA, Menon U, Modugno F, Moysich KB, Narod SA, Nedergaard L, Ness RB, Nevanlinna H, Odunsi K, Olson SH, Orlow I, Orsulic S, Weber RP, Pearce CL, Pejovic T, Pelttari LM, Permuth-Wey J, Phelan CM, Pike MC, Poole EM, Ramus SJ, Risch HA, Rosen B, Rossing MA, Rothstein JH, Rudolph A, Runnebaum IB, Rzepecka IK, Salvesen HB, Schildkraut JM, Schwaab I, Shu XO, Shvetsov YB, Siddiqui N, Sieh W, Song H, Southey MC, Sucheston-Campbell LE, Tangen IL, Teo SH, Terry KL, Thompson PJ, Timorek A, Tsai YY, Tworoger SS, van Altena AM, Van Nieuwenhuysen E, Vergote I, Vierkant RA, Wang-Gohrke S, Walsh C, Wentzensen N, Whittemore AS, Wicklund KG, Wilkens LR, Woo YL, Wu X, Wu A, Yang H, Zheng W, Ziogas A, Sellers TA, Monteiro ANA, Freedman ML, Gayther SA, Pharoah PDP. Network-Based Integration of GWAS and Gene Expression Identifies a HOX-Centric Network Associated with Serous Ovarian Cancer Risk. Cancer Epidemiol Biomarkers Prev 2015; 24:1574-84. 
[PMID: 26209509 PMCID: PMC4592449 DOI: 10.1158/1055-9965.epi-14-1270] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 11/24/2014] [Accepted: 06/29/2015] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Genome-wide association studies (GWAS) have so far reported 12 loci associated with serous epithelial ovarian cancer (EOC) risk. We hypothesized that some of these loci function through nearby transcription factor (TF) genes and that putative target genes of these TFs as identified by coexpression may also be enriched for additional EOC risk associations. METHODS We selected TF genes within 1 Mb of the top signal at the 12 genome-wide significant risk loci. Mutual information, a form of correlation, was used to build networks of genes strongly coexpressed with each selected TF gene in the unified microarray dataset of 489 serous EOC tumors from The Cancer Genome Atlas. Genes represented in this dataset were subsequently ranked using a gene-level test based on results for germline SNPs from a serous EOC GWAS meta-analysis (2,196 cases/4,396 controls). RESULTS Gene set enrichment analysis identified six networks centered on TF genes (HOXB2, HOXB5, HOXB6, HOXB7 at 17q21.32 and HOXD1, HOXD3 at 2q31) that were significantly enriched for genes from the risk-associated end of the ranked list (P < 0.05 and FDR < 0.05). These results were replicated (P < 0.05) using an independent association study (7,035 cases/21,693 controls). Genes underlying enrichment in the six networks were pooled into a combined network. CONCLUSION We identified a HOX-centric network associated with serous EOC risk containing several genes with known or emerging roles in serous EOC development. IMPACT Network analysis integrating large, context-specific datasets has the potential to offer mechanistic insights into cancer susceptibility and prioritize genes for experimental characterization.
Affiliation(s)
- Siddhartha P Kar
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom.
| | - Jonathan P Tyrer
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| | - Qiyuan Li
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Kate Lawrenson
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
| | - Katja K H Aben
- Radboud University Medical Centre, Radboud Institute for Health Sciences, Nijmegen, the Netherlands. Comprehensive Cancer Center The Netherlands, Utrecht, the Netherlands
| | - Hoda Anton-Culver
- Department of Epidemiology, Director of Genetic Epidemiology Research Institute, School of Medicine, University of California Irvine, Irvine, California
| | - Natalia Antonenkova
- Byelorussian Institute for Oncology and Medical Radiology Aleksandrov N.N., Minsk, Belarus
| | | | - Helen Baker
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| | - Elisa V Bandera
- Cancer Prevention and Control, Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey
| | - Yukie T Bean
- Department of Obstetrics and Gynecology, Oregon Health and Science University, Portland, Oregon. Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon
| | - Matthias W Beckmann
- University Hospital Erlangen, Department of Gynecology and Obstetrics, Friedrich-Alexander-University Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
| | - Andrew Berchuck
- Department of Obstetrics and Gynecology, Duke University Medical Center, Durham, North Carolina
| | - Maria Bisogna
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Line Bjørge
- Department of Gynecology and Obstetrics, Haukeland University Hospital, Bergen, Norway. Centre for Cancer Biomarkers, Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Natalia Bogdanova
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany
| | - Louise Brinton
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
| | - Angela Brooks-Wilson
- Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada. Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Ralf Butzow
- Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Central Hospital, Helsinki, HUS, Finland. Department of Pathology, Helsinki University Central Hospital, Helsinki, Finland
| | - Ian Campbell
- Cancer Genetics Laboratory, Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia. Department of Pathology, University of Melbourne, Parkville, Victoria, Australia. Sir Peter MacCallum Department of Oncology, University of Melbourne, Parkville, Victoria, Australia
| | - Karen Carty
- Cancer Research UK Clinical Trials Unit, The Beatson West of Scotland Cancer Centre, Glasgow, United Kingdom
| | - Jenny Chang-Claude
- German Cancer Research Center (DKFZ), Division of Cancer Epidemiology, Heidelberg, Germany
| | - Yian Ann Chen
- Department of Biostatistics, Moffitt Cancer Center, Tampa, Florida
| | - Zhihua Chen
- Department of Biostatistics, Moffitt Cancer Center, Tampa, Florida
| | - Linda S Cook
- Division of Epidemiology and Biostatistics, Department of Internal Medicine, University of New Mexico, Albuquerque, New Mexico
| | - Daniel Cramer
- Obstetrics and Gynecology Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts. Harvard School of Public Health, Boston, Massachusetts
| | - Julie M Cunningham
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota
| | - Cezary Cybulski
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
| | - Agnieszka Dansonka-Mieszkowska
- Department of Pathology and Laboratory Diagnostics, Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
| | - Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
| | - Ed Dicks
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| | - Jennifer A Doherty
- Department of Community and Family Medicine, Section of Biostatistics & Epidemiology, The Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire
| | - Thilo Dörk
- Gynaecology Research Unit, Hannover Medical School, Hannover, Germany
| | - Andreas du Bois
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Essen, Germany. Department of Gynecology and Gynecologic Oncology, Dr. Horst Schmidt Kliniken Wiesbaden, Wiesbaden, Germany
| | - Matthias Dürst
- Department of Gynecology, Jena University Hospital, Friedrich Schiller University, Jena, Germany
| | - Diana Eccles
- Wessex Clinical Genetics Service, Princess Anne Hospital, Southampton, United Kingdom
| | - Douglas F Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom. Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| | - Robert P Edwards
- Wessex Clinical Genetics Service, Princess Anne Hospital, Southampton, United Kingdom. Ovarian Cancer Center of Excellence, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Arif B Ekici
- University Hospital Erlangen, Institute of Human Genetics, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
| | - Peter A Fasching
- University Hospital Erlangen, Department of Gynecology and Obstetrics, Friedrich-Alexander-University Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany. University of California at Los Angeles, David Geffen School of Medicine, Department of Medicine, Division of Hematology and Oncology, Los Angeles, California
| | - Brooke L Fridley
- Biostatistics and Informatics Shared Resource, University of Kansas Medical Center, Kansas City, Kansas
| | | | - Aleksandra Gentry-Maharaj
- Women's Cancer, University College London Elizabeth Garrett Anderson Institute for Women's Health, London, United Kingdom
| | - Graham G Giles
- Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, Victoria, Australia
| | - Rosalind Glasspool
- Cancer Research UK Clinical Trials Unit, The Beatson West of Scotland Cancer Centre, Glasgow, United Kingdom
| | - Ellen L Goode
- Department of Health Science Research, Mayo Clinic, Rochester, Minnesota
| | - Marc T Goodman
- Cancer Prevention and Control, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California. Community and Population Health Research Institute, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California
| | - Jacek Grownwald
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
| | - Patricia Harrington
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| | - Philipp Harter
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Essen, Germany. Department of Gynecology and Gynecologic Oncology, Dr. Horst Schmidt Kliniken Wiesbaden, Wiesbaden, Germany
| | - Alexander Hein
- University Hospital Erlangen, Department of Gynecology and Obstetrics, Friedrich-Alexander-University Erlangen-Nuremberg, Comprehensive Cancer Center Erlangen-EMN, Erlangen, Germany
| | - Florian Heitz
- Department of Gynecology and Gynecologic Oncology, Kliniken Essen-Mitte, Essen, Germany. Department of Gynecology and Gynecologic Oncology, Dr. Horst Schmidt Kliniken Wiesbaden, Wiesbaden, Germany
| | | | - Peter Hillemanns
- Departments of Obstetrics and Gynaecology, Hannover Medical School, Hannover, Germany
| | - Estrid Hogdall
- Virus, Lifestyle, and Genes, Danish Cancer Society Research Center, Copenhagen, Denmark. Molecular Unit, Department of Pathology, Herlev Hospital, University of Copenhagen, Copenhagen, Denmark
| | - Claus K Hogdall
- Department of Gynecology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Satoyo Hosono
- Division of Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, Aichi, Japan
| | - Edwin S Iversen
- Department of Statistical Science, Duke University, Durham, North Carolina
| | - Anna Jakubowska
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
| | - James Paul
- Cancer Research UK Clinical Trials Unit, The Beatson West of Scotland Cancer Centre, Glasgow, United Kingdom
| | - Allan Jensen
- Virus, Lifestyle, and Genes, Danish Cancer Society Research Center, Copenhagen, Denmark
| | - Bu-Tian Ji
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
| | - Beth Y Karlan
- Women's Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California
| | - Susanne K Kjaer
- Virus, Lifestyle, and Genes, Danish Cancer Society Research Center, Copenhagen, Denmark. Department of Gynecology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Linda E Kelemen
- Department of Public Health Sciences, Medical University of South Carolina, Charleston, South Carolina
| | - Melissa Kellar
- Department of Obstetrics and Gynecology, Oregon Health and Science University, Portland, Oregon. Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon
| | - Joseph Kelley
- Department of Obstetrics, Gynecology, and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
| | - Lambertus A Kiemeney
- Radboud University Medical Centre, Radboud Institute for Health Sciences, Nijmegen, the Netherlands
| | - Camilla Krakstad
- Department of Gynecology and Obstetrics, Haukeland University Hospital, Bergen, Norway. Centre for Cancer Biomarkers, Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Jolanta Kupryjanczyk
- Department of Pathology and Laboratory Diagnostics, Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
| | - Diether Lambrechts
- Vesalius Research Center, VIB, Leuven, Belgium. Laboratory for Translational Genetics, Department of Oncology, University of Leuven, Leuven, Belgium
| | - Sandrina Lambrechts
- Division of Gynecological Oncology, Department of Oncology, University Hospitals Leuven, Leuven, Belgium
| | - Nhu D Le
- Cancer Control Research, BC Cancer Agency, Vancouver, British Columbia, Canada
| | - Alice W Lee
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
| | - Shashi Lele
- Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, New York
| | - Arto Leminen
- Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Central Hospital, Helsinki, HUS, Finland
| | - Jenny Lester
- Women's Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California
| | - Douglas A Levine
- Gynecology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Dong Liang
- College of Pharmacy and Health Sciences, Texas Southern University, Houston, Texas
| | - Jolanta Lissowska
- Department of Cancer Epidemiology and Prevention, Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
| | - Karen Lu
- Department of Gynecologic Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Jan Lubinski
- Department of Genetics and Pathology, Pomeranian Medical University, Szczecin, Poland
| | - Lene Lundvall
- Department of Gynecology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Leon Massuger
- Radboud University Medical Centre, Radboud Institute for Molecular Life Sciences, Nijmegen, the Netherlands
| | - Keitaro Matsuo
- Department of Preventive Medicine, Kyushu University Faculty of Medical Sciences, Fukuoka, Japan
| | - Valerie McGuire
- Department of Health Research and Policy-Epidemiology, Stanford University School of Medicine, Stanford, California
| | - John R McLaughlin
- Prosserman Centre for Health Research, Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Iain A McNeish
- Institute of Cancer Sciences, University of Glasgow, Wolfson Wohl Cancer Research Centre, Beatson Institute for Cancer Research, Glasgow, United Kingdom
| | - Usha Menon
- Women's Cancer, University College London Elizabeth Garrett Anderson Institute for Women's Health, London, United Kingdom
| | - Francesmary Modugno
- Ovarian Cancer Center of Excellence, University of Pittsburgh, Pittsburgh, Pennsylvania. Women's Cancer Research Program, Magee-Women's Research Institute and University of Pittsburgh Cancer Institute, Pittsburgh, Pennsylvania. Department of Obstetrics, Gynecology, and Reproductive Sciences, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania. Department of Epidemiology, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania
| | - Kirsten B Moysich
- Department of Cancer Prevention and Control, Roswell Park Cancer Institute, Buffalo, New York
| | - Steven A Narod
- Women's College Research Institute, Toronto, Ontario, Canada
| | - Lotte Nedergaard
- Department of Pathology, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark
| | - Roberta B Ness
- The University of Texas School of Public Health, Houston, Texas
| | - Heli Nevanlinna
- Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Central Hospital, Helsinki, HUS, Finland
| | - Kunle Odunsi
- Department of Gynecological Oncology, Roswell Park Cancer Institute, Buffalo, New York
| | - Sara H Olson
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Irene Orlow
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Sandra Orsulic
- Women's Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California
| | - Rachel Palmieri Weber
- Department of Community and Family Medicine, Duke University Medical Center, Durham, North Carolina
| | - Celeste Leigh Pearce
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
| | - Tanja Pejovic
- Department of Obstetrics and Gynecology, Oregon Health and Science University, Portland, Oregon. Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon
| | - Liisa M Pelttari
- Department of Obstetrics and Gynecology, University of Helsinki and Helsinki University Central Hospital, Helsinki, HUS, Finland
| | | | - Catherine M Phelan
- Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, Florida
| | - Malcolm C Pike
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California. Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York
| | - Elizabeth M Poole
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts. Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
| | - Susan J Ramus
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
| | - Harvey A Risch
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, Connecticut
| | - Barry Rosen
- Department of Gynecologic-Oncology, Princess Margaret Hospital, University of Toronto, Toronto, Ontario, Canada. Department of Obstetrics and Gynecology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Mary Anne Rossing
- Program in Epidemiology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington. Department of Epidemiology, University of Washington, Seattle, Washington
| | - Joseph H Rothstein
- Department of Health Research and Policy-Epidemiology, Stanford University School of Medicine, Stanford, California
| | - Anja Rudolph
- German Cancer Research Center (DKFZ), Division of Cancer Epidemiology, Heidelberg, Germany
| | - Ingo B Runnebaum
- Department of Gynecology, Jena University Hospital, Friedrich Schiller University, Jena, Germany
| | - Iwona K Rzepecka
- Department of Pathology and Laboratory Diagnostics, Maria Sklodowska-Curie Memorial Cancer Center and Institute of Oncology, Warsaw, Poland
| | - Helga B Salvesen
- Department of Gynecology and Obstetrics, Haukeland University Hospital, Bergen, Norway. Centre for Cancer Biomarkers, Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Joellen M Schildkraut
- Department of Community and Family Medicine, Duke University Medical Center, Durham, North Carolina. Cancer Control and Population Sciences, Duke Cancer Institute, Durham, North Carolina
| | - Ira Schwaab
- Institut für Humangenetik Wiesbaden, Wiesbaden, Germany
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee
| | - Yurii B Shvetsov
- Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
| | - Nadeem Siddiqui
- Department of Gynaecological Oncology, Glasgow Royal Infirmary, Glasgow, United Kingdom
| | - Weiva Sieh
- Department of Health Research and Policy-Epidemiology, Stanford University School of Medicine, Stanford, California
| | - Honglin Song
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| | - Melissa C Southey
- Department of Pathology, University of Melbourne, Parkville, Victoria, Australia
| | | | - Ingvild L Tangen
- Department of Gynecology and Obstetrics, Haukeland University Hospital, Bergen, Norway. Centre for Cancer Biomarkers, Department of Clinical Science, University of Bergen, Bergen, Norway
| | - Soo-Hwang Teo
- Cancer Research Initiatives Foundation, Sime Darby Medical Centre, Subang Jaya, Malaysia. University Malaya Cancer Research Institute, Faculty of Medicine, University Malaya Medical Centre, University Malaya, Kuala Lumpur, Malaysia
| | - Kathryn L Terry
- Obstetrics and Gynecology Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts. Harvard School of Public Health, Boston, Massachusetts
| | - Pamela J Thompson
- Cancer Prevention and Control, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California. Community and Population Health Research Institute, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, California
| | - Agnieszka Timorek
- Department of Obstetrics, Gynecology, and Oncology, IInd Faculty of Medicine, Warsaw Medical University and Brodnowski Hospital, Warsaw, Poland
| | - Ya-Yu Tsai
- Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, Florida
| | - Shelley S Tworoger
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts. Department of Epidemiology, Harvard School of Public Health, Boston, Massachusetts
| | - Anne M van Altena
- Radboud University Medical Centre, Radboud Institute for Molecular Life Sciences, Nijmegen, the Netherlands
| | - Els Van Nieuwenhuysen
- Division of Gynecological Oncology, Department of Oncology, University Hospitals Leuven, Leuven, Belgium
| | - Ignace Vergote
- Division of Gynecological Oncology, Department of Oncology, University Hospitals Leuven, Leuven, Belgium
| | - Robert A Vierkant
- Department of Health Science Research, Mayo Clinic, Rochester, Minnesota
| | - Shan Wang-Gohrke
- Department of Obstetrics and Gynecology, University of Ulm, Ulm, Germany
| | - Christine Walsh
- Women's Cancer Program at the Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, California
| | - Nicolas Wentzensen
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
| | - Alice S Whittemore
- Department of Health Research and Policy-Epidemiology, Stanford University School of Medicine, Stanford, California
| | - Kristine G Wicklund
- Program in Epidemiology, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Lynne R Wilkens
- Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
| | - Yin-Ling Woo
- University Malaya Cancer Research Institute, Faculty of Medicine, University Malaya Medical Centre, University Malaya, Kuala Lumpur, Malaysia. Department of Obstetrics and Gynaecology, University Malaya Medical Centre, University Malaya, Kuala Lumpur, Malaysia
| | - Xifeng Wu
- Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas
| | - Anna Wu
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
| | - Hannah Yang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, Tennessee
| | - Argyrios Ziogas
- Department of Epidemiology, Director of Genetic Epidemiology Research Institute, School of Medicine, University of California Irvine, Irvine, California
| | - Thomas A Sellers
- Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, Florida
| | | | - Matthew L Freedman
- Department of Medical Oncology, The Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, Massachusetts
| | - Simon A Gayther
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California Norris Comprehensive Cancer Center, Los Angeles, California
| | - Paul D P Pharoah
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom. Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, United Kingdom
| |
|
55
|
Wang W, Mandel J, Bouaziz J, Commenges D, Nabirotchkine S, Chumakov I, Cohen D, Guedj M. A Multi-Marker Genetic Association Test Based on the Rasch Model Applied to Alzheimer's Disease. PLoS One 2015; 10:e0138223. [PMID: 26379234 PMCID: PMC4574966 DOI: 10.1371/journal.pone.0138223] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 03/30/2015] [Accepted: 08/27/2015] [Indexed: 12/28/2022] Open
Abstract
Results from Genome-Wide Association Studies (GWAS) have shown that the genetic basis of complex traits often includes many genetic variants with small to moderate effects whose identification remains a challenging problem. In this context, multi-marker analysis at the gene and pathway level can complement traditional point-wise approaches that treat the genetic markers individually. In this paper we propose a novel statistical approach for multi-marker analysis based on the Rasch model. The method summarizes the categorical genotypes of SNPs by a generalized logistic function into a genetic score that can be used for association analysis. Through different sets of simulations, the false-positive rate and power of the proposed approach are compared to a set of existing methods, and the approach shows good performance. The application of the Rasch model to the Alzheimer's Disease (AD) ADNI GWAS dataset also allows a coherent interpretation of the results. Our analysis supports the idea that APOE is a major susceptibility gene for AD. Among the top genes selected by the proposed method, several could be functionally linked to AD. In particular, a pathway analysis of these genes also highlights the metabolism of cholesterol, which is known to play a key role in AD pathogenesis. Interestingly, many of these top genes can be integrated in a hypothetical signalling network.
Affiliation(s)
- Wenjia Wang
- Pharnext, Issy-les-Moulineaux, Ile de France, France
- Inserm U897, University of Bordeaux, Bordeaux, Aquitaine, France
| | - Jonas Mandel
- Pharnext, Issy-les-Moulineaux, Ile de France, France
| | - Jan Bouaziz
- Pharnext, Issy-les-Moulineaux, Ile de France, France
| | - Daniel Commenges
- Inserm U897, University of Bordeaux, Bordeaux, Aquitaine, France
| | | | - Ilya Chumakov
- Pharnext, Issy-les-Moulineaux, Ile de France, France
| | - Daniel Cohen
- Pharnext, Issy-les-Moulineaux, Ile de France, France
| | - Mickaël Guedj
- Pharnext, Issy-les-Moulineaux, Ile de France, France
| | | |
|
56
|
Yi H, Wo H, Zhao Y, Zhang R, Dai J, Jin G, Ma H, Wu T, Hu Z, Lin D, Shen H, Chen F. Comparison of dimension reduction-based logistic regression models for case-control genome-wide association study: principal components analysis vs. partial least squares. J Biomed Res 2015; 29:298-307. [PMID: 26243516 PMCID: PMC4547378 DOI: 10.7555/jbr.29.20140043] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 03/11/2014] [Revised: 09/29/2014] [Accepted: 01/15/2015] [Indexed: 12/18/2022] Open
Abstract
With recent advances in biotechnology, genome-wide association studies (GWAS) have been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such a single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high-dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and a real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR corrected by the Bonferroni method was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and that both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data.
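The Bonferroni correction mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `bonferroni_hits` and the example p-values are hypothetical, and the sketch only shows why a per-SNP threshold of alpha divided by the number of SNPs in the region becomes strict as the region grows.

```python
def bonferroni_hits(p_values, alpha=0.05):
    """Return the Bonferroni per-test threshold and the indices of SNPs passing it.

    p_values: single-locus p-values for the SNPs in one region (hypothetical input).
    alpha:    family-wise error rate to control across the region.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    threshold = alpha / m  # each test must clear alpha / m
    hits = [i for i, p in enumerate(p_values) if p < threshold]
    return threshold, hits


# Usage with made-up p-values for a five-SNP region.
threshold, hits = bonferroni_hits([0.001, 0.02, 0.3, 0.8, 0.04])
print(threshold, hits)
```

With correlated SNPs (high linkage disequilibrium) the effective number of independent tests is smaller than the SNP count, which is one common explanation for the conservativeness reported above.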
Affiliation(s)
- Honggang Yi
- Department of Epidemiology and Biostatistics, School of Public Health
| | - Hongmei Wo
- Department of Public Service Management, School of KangDa
| | - Yang Zhao
- Department of Epidemiology and Biostatistics, School of Public Health
| | - Ruyang Zhang
- Department of Epidemiology and Biostatistics, School of Public Health
| | - Junchen Dai
- Department of Epidemiology and Biostatistics, School of Public Health
| | - Guangfu Jin
- Department of Epidemiology and Biostatistics, School of Public Health.,Section of Clinical Epidemiology, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Cancer Center
| | - Hongxia Ma
- Department of Epidemiology and Biostatistics, School of Public Health.,Section of Clinical Epidemiology, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Cancer Center
| | - Tangchun Wu
- Institute of Occupational Medicine and Ministry of Education, Key Laboratory for Environment and Health, School of Public Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
| | - Zhibin Hu
- Department of Epidemiology and Biostatistics, School of Public Health.,Section of Clinical Epidemiology, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Cancer Center.,State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu 211166, China
| | - Dongxin Lin
- State Key Laboratory of Molecular Oncology and Department of Etiology and Carcinogenesis, Cancer Institute and Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China
| | - Hongbing Shen
- Department of Epidemiology and Biostatistics, School of Public Health.,Section of Clinical Epidemiology, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Cancer Center.,State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu 211166, China
| | - Feng Chen
- Department of Epidemiology and Biostatistics, School of Public Health.
| |
|
57
|
Edwards SM, Thomsen B, Madsen P, Sørensen P. Partitioning of genomic variance reveals biological pathways associated with udder health and milk production traits in dairy cattle. Genet Sel Evol 2015; 47:60. [PMID: 26169777 PMCID: PMC4499908 DOI: 10.1186/s12711-015-0132-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 09/23/2014] [Accepted: 06/12/2015] [Indexed: 12/20/2022] Open
Abstract
Background We have used a linear mixed model (LMM) approach to examine the joint contribution of genetic markers associated with a biological pathway. However, with these markers being scattered throughout the genome, we are faced with the challenge of modelling the contribution from several, sometimes even all, chromosomes at once. Due to linkage disequilibrium (LD), all markers may be assumed to account for some genomic variance; the question is whether random sets of markers account for the same genomic variance as markers associated with a biological pathway. Results We applied the LMM approach to identify biological pathways associated with udder health and milk production traits in dairy cattle. A random gene sampling procedure was applied to assess the biological pathways in a dataset that has an inherently complex genetic correlation pattern due to the population structure of dairy cattle, and to linkage disequilibrium within the bovine genome and within the genes associated with the biological pathway. Conclusions Several biological pathways that were significantly associated with health and production traits were identified in dairy cattle; i.e. the markers linked to these pathways explained more of the genomic variance and provided a better model fit than 95% of the randomly sampled gene groups. Our results show that immune-related pathways are associated with production traits, and that pathways that include a causal marker for production traits are identified with our procedure. We are confident that the LMM approach provides a general framework to exploit and integrate prior biological information and could potentially lead to improved understanding of the genetic architecture of complex traits and diseases. Electronic supplementary material The online version of this article (doi:10.1186/s12711-015-0132-6) contains supplementary material, which is available to authorized users.
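The comparison against randomly sampled gene groups described in this abstract can be sketched as an empirical exceedance proportion. This is only an illustration under stated assumptions: the function name `empirical_exceedance`, the marker pool, and the toy statistic (a plain sum) are hypothetical stand-ins, whereas the study itself compared genomic variance explained and model fit from a linear mixed model.

```python
import random


def empirical_exceedance(observed_stat, marker_pool, set_size, stat_fn,
                         n_draws=1000, seed=0):
    """Proportion of random marker sets whose statistic is >= the observed one.

    observed_stat: statistic for the pathway's own marker set.
    marker_pool:   list of all candidate markers (hypothetical input).
    set_size:      number of markers to draw per random set.
    stat_fn:       function mapping a list of markers to a number.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(marker_pool, set_size)  # draw without replacement
        if stat_fn(sample) >= observed_stat:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_draws + 1)


# Usage: a pathway "beats 95% of random sets" when the proportion is <= 0.05.
pool = list(range(100))  # toy per-marker scores
proportion = empirical_exceedance(485, pool, 5, sum, n_draws=200, seed=1)
print(proportion <= 0.05)
```

Matching the random sets to the pathway's size, as done here with `set_size`, keeps the comparison from being driven simply by the number of markers.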
Affiliation(s)
- Stefan M Edwards
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Blichers Allé 20, P.O. Box 50, Tjele, DK-8830, Denmark.
| | - Bo Thomsen
- Department of Molecular Biology and Genetics, Aarhus University, Blichers Allé 20, P.O. Box 50, Tjele, DK-8830, Denmark.
| | - Per Madsen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Blichers Allé 20, P.O. Box 50, Tjele, DK-8830, Denmark.
| | - Peter Sørensen
- Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, Blichers Allé 20, P.O. Box 50, Tjele, DK-8830, Denmark.
| |
|
58
|
Pan W, Kwak IY, Wei P. A Powerful Pathway-Based Adaptive Test for Genetic Association with Common or Rare Variants. Am J Hum Genet 2015; 97:86-98. [PMID: 26119817 DOI: 10.1016/j.ajhg.2015.05.018] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 02/14/2015] [Accepted: 05/21/2015] [Indexed: 12/11/2022] Open
Abstract
In spite of the success of genome-wide association studies (GWASs), only a small proportion of heritability for each complex trait has been explained by identified genetic variants, mainly SNPs. Likely reasons include genetic heterogeneity (i.e., multiple causal genetic variants) and small effect sizes of causal variants, for which pathway analysis has been proposed as a promising alternative to the standard single-SNP-based analysis. A pathway contains a set of functionally related genes, each of which includes multiple SNPs. Here we propose a pathway-based test that is adaptive at both the gene and SNP levels, thus maintaining high power across a wide range of situations with varying numbers of the genes and SNPs associated with a trait. The proposed method is applicable to both common variants and rare variants and can incorporate biological knowledge on SNPs and genes to boost statistical power. We use extensively simulated data and a WTCCC GWAS dataset to compare our proposal with several existing pathway-based and SNP-set-based tests, demonstrating its promising performance and its potential use in practice.
|
59
|
Abstract
PURPOSE OF REVIEW Inflammatory bowel disease (IBD) has long been known to have genetic risk factors because of increased prevalence in the relatives of affected individuals. However, genome-wide association studies have only explained limited heritability in IBD. The observed globally rising incidence of IBD has implicated the role of environmental factors. The hidden unexplained heritability remains to be explored. RECENT FINDINGS Recent aggregate evidence has highlighted the extent and nature of host genome-microbiome associations, a key next step in understanding the mechanisms of pathogenesis in IBD. An individual's gut microbiota is shaped not only by genetic but also by environmental factors like diet. Minimizing exposure of the intestinal lumen to selected food items has been shown to prolong the remission state of IBD. In a genetically susceptible host, a shift of the gut microbiota (or 'dysbiosis') can increase the susceptibility to IBD. With the advances in high-throughput large-scale 'omics' technologies in combination with creative data mining and systems biology-based network analyses, the complexity of biological functional networks behind the cause of IBD has become more approachable. Therefore, the hidden heritability in IBD has become more explainable, and can be attributed to changing environmental factors, epigenetic modifications, and gene-host microbial ('in-vironmental') or gene-extrinsic environmental interactions. SUMMARY This review discusses the perspectives of relevance to clinical translation with emphasis on gene-environment interactions. No doubt, the use of system-based approaches will lead to the development of alternative, and hopefully better, diagnostic, prognostic, and monitoring tools in the management of IBD.
|
60
|
Statistical and Computational Methods for Genetic Diseases: An Overview. Computational and Mathematical Methods in Medicine 2015; 2015:954598. [PMID: 26106440 PMCID: PMC4464008 DOI: 10.1155/2015/954598] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/16/2014] [Accepted: 04/23/2015] [Indexed: 12/19/2022]
Abstract
The identification of causes of genetic diseases has been carried out by several approaches of increasing complexity. Innovation in genetic methodologies leads to the production of large amounts of data that need the support of statistical and computational methods to be correctly processed. The aim of the paper is to provide an overview of statistical and computational methods, paying attention to methods for sequence analysis and complex diseases.
|
61
|
Wojcik GL, Kao WHL, Duggal P. Relative performance of gene- and pathway-level methods as secondary analyses for genome-wide association studies. BMC Genet 2015; 16:34. [PMID: 25887572 PMCID: PMC4391470 DOI: 10.1186/s12863-015-0191-2] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 10/28/2014] [Accepted: 03/19/2015] [Indexed: 12/27/2022] Open
Abstract
BACKGROUND Despite the success of genome-wide association studies (GWAS), there still remains "missing heritability" for many traits. One contributing factor may be the result of examining one marker at a time as opposed to a group of markers that are biologically meaningful in aggregate. To address this problem, a variety of gene- and pathway-level methods have been developed to identify putative biologically relevant associations. A simulation was conducted to systematically assess the performance of these methods. Using genetic data from 4,500 individuals in the Wellcome Trust Case Control Consortium (WTCCC), case-control status was simulated based on an additive polygenic model. We evaluated gene-level methods based on their sensitivity, specificity, and proportion of false positives. Pathway-level methods were evaluated on the relationship between proportion of causal genes within the pathway and the strength of association. RESULTS The gene-level methods had low sensitivity (20-63%), high specificity (89-100%), and low proportion of false positives (0.1-6%). The gene-level program VEGAS using only the top 10% of associated single nucleotide polymorphisms (SNPs) within the gene had the highest sensitivity (28.6%) with less than 1% false positives. The performance of the pathway-level methods depended on their reliance upon asymptotic distributions or if significance was estimated in a competitive manner. The pathway-level programs GenGen, GSA-SNP and MAGENTA had the best performance while accounting for potential confounders. CONCLUSIONS Novel genes and pathways can be identified using the gene and pathway-level methods. These methods may provide valuable insight into the "missing heritability" of traits and provide biological interpretations to GWAS findings.
Affiliation(s)
- Genevieve L Wojcik
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA.
| | - W H Linda Kao
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA.
| | - Priya Duggal
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA.
| |
|
62
|
Cantu E, Shah RJ, Lin W, Daye ZJ, Diamond JM, Suzuki Y, Ellis JH, Borders CF, Andah GA, Beduhn B, Meyer NJ, Ruschefski M, Aplenc R, Feng R, Christie JD. Oxidant stress regulatory genetic variation in recipients and donors contributes to risk of primary graft dysfunction after lung transplantation. J Thorac Cardiovasc Surg 2015; 149:596-602. [PMID: 25439478 PMCID: PMC4346512 DOI: 10.1016/j.jtcvs.2014.09.077] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 04/09/2014] [Revised: 08/19/2014] [Accepted: 09/23/2014] [Indexed: 12/14/2022]
Abstract
OBJECTIVE Oxidant stress pathway activation during ischemia reperfusion injury may contribute to the development of primary graft dysfunction (PGD) after lung transplantation. We hypothesized that oxidant stress gene variation in recipients and donors is associated with PGD. METHODS Donors and recipients from the Lung Transplant Outcomes Group (LTOG) cohort were genotyped using the Illumina IBC chip filtered for oxidant stress pathway genes. Single nucleotide polymorphisms (SNPs) grouped into SNP sets based on haplotype blocks within 49 oxidant stress genes selected from gene ontology pathways and literature review were tested for PGD association using a sequencing kernel association test. Analyses were adjusted for clinical confounding variables and population stratification. RESULTS Three hundred ninety-two donors and 1038 recipients met genetic quality control standards. Thirty percent of patients developed grade 3 PGD within 72 hours. Donor NADPH oxidase 3 (NOX3) was associated with PGD (P = .01) with 5 individual significant loci (P values between .006 and .03). In recipients, variation in glutathione peroxidase (GPX1) and NRF-2 (NFE2L2) was significantly associated with PGD (P = .01 for both). The GPX1 association included 3 individual loci (P values between .006 and .049) and the NFE2L2 association included 2 loci (P = .03 and .05). Significant epistatic effects influencing PGD susceptibility were evident between 3 different donor blocks of NOX3 and recipient NFE2L2 (P = .026, P = .017, and P = .031). CONCLUSIONS Our study has prioritized GPX1, NOX3, and NFE2L2 genes for future research in PGD pathogenesis, and highlights a donor-recipient interaction of NOX3 and NFE2L2 that increases the risk of PGD.
Affiliation(s)
- Edward Cantu
- Cardiovascular Surgery Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Rupal J. Shah
- Pulmonary, Allergy, and Critical Care Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA; Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Wei Lin
- Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Zhongyin J. Daye
- Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Joshua M. Diamond
- Pulmonary, Allergy, and Critical Care Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA; Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Yoshikazu Suzuki
- Cardiovascular Surgery Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - John H. Ellis
- Cardiovascular Surgery Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Catherine F. Borders
- Cardiovascular Surgery Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Gerald A. Andah
- Cardiovascular Surgery Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Ben Beduhn
- Cardiovascular Surgery Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Nuala J. Meyer
- Pulmonary, Allergy, and Critical Care Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA; Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Melanie Ruschefski
- Pulmonary, Allergy, and Critical Care Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA; Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Richard Aplenc
- Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA; Children’s Hospital of Philadelphia, Philadelphia, PA
| | - Rui Feng
- Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | - Jason D. Christie
- Pulmonary, Allergy, and Critical Care Division, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA; Center for Clinical Epidemiology and Biostatistics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA
| | | |
|
63
|
Moura-Neto RS, Silva R, Mello IC, Nogueira T, Al-Deib AA, LaRue B, King J, Budowle B. Evaluation of a 49 InDel Marker HID panel in two specific populations of South America and one population of Northern Africa. Int J Legal Med 2014; 129:245-9. [DOI: 10.1007/s00414-014-1137-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/30/2014] [Accepted: 12/08/2014] [Indexed: 01/08/2023]
|
64
|
Identification of possible pathogenic pathways in Behçet's disease using genome-wide association study data from two different populations. Eur J Hum Genet 2014; 23:678-87. [PMID: 25227143 DOI: 10.1038/ejhg.2014.158] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 12/27/2013] [Revised: 07/08/2014] [Accepted: 07/10/2014] [Indexed: 12/25/2022] Open
Abstract
Behçet's disease (BD) is a multi-system inflammatory disorder of unknown etiology. Two recent genome-wide association studies (GWASs) of BD confirmed a strong association with the MHC class I region and identified two non-HLA common genetic variations. In complex diseases, multiple factors may target different sets of genes in the same pathway and thus may cause the same disease phenotype. We therefore hypothesized that identification of disease-associated pathways is critical to elucidate mechanisms underlying BD, and those pathways may be conserved within and across populations. To identify the disease-associated pathways, we developed a novel methodology that combines nominally significant evidence of genetic association with current knowledge of biochemical pathways, protein-protein interaction networks, and functional information of selected SNPs. Using this methodology, we searched for the disease-related pathways in two BD GWASs in Turkish and Japanese case-control groups. We found that 6 of the top 10 identified pathways in both populations were overlapping, even though there were few significantly conserved SNPs/genes within and between populations. The probability of random occurrence of such an event was 2.24E-39. These shared pathways were focal adhesion, MAPK signaling, TGF-β signaling, ECM-receptor interaction, complement and coagulation cascades, and proteasome pathways. Even though each individual has a unique combination of factors involved in their disease development, the targeted pathways are expected to be mostly the same. Hence, the identification of shared pathways between the Turkish and the Japanese patients using GWAS data may help further elucidate the inflammatory mechanisms in BD pathogenesis.
|
65
|
Mooney MA, Nigg JT, McWeeney SK, Wilmot B. Functional and genomic context in pathway analysis of GWAS data. Trends Genet 2014; 30:390-400. [PMID: 25154796 DOI: 10.1016/j.tig.2014.07.004] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.8] [Received: 06/12/2014] [Revised: 07/18/2014] [Accepted: 07/18/2014] [Indexed: 02/07/2023]
Abstract
Gene set analysis (GSA) is a promising tool for uncovering the polygenic effects associated with complex diseases. However, the available techniques reflect a wide variety of hypotheses about how genetic effects interact to contribute to disease susceptibility. The lack of consensus about the best way to perform GSA has led to confusion in the field and has made it difficult to compare results across methods. A clear understanding of the various choices made during GSA - such as how gene sets are defined, how single-nucleotide polymorphisms (SNPs) are assigned to genes, and how individual SNP-level effects are aggregated to produce gene- or pathway-level effects - will improve the interpretability and comparability of results across methods and studies. In this review we provide an overview of the various data sources used to construct gene sets and the statistical methods used to test for gene set association, as well as provide guidelines for ensuring the comparability of results.
Affiliation(s)
- Michael A Mooney
  - Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
- Joel T Nigg
  - Division of Psychology, Department of Psychiatry, Oregon Health & Science University, Portland, OR, USA; Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA
- Shannon K McWeeney
  - Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
- Beth Wilmot
  - Division of Bioinformatics and Computational Biology, Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA; Oregon Clinical and Translational Research Institute, Portland, OR, USA; OHSU Knight Cancer Institute, Portland, OR, USA
|
66
|
Vilor-Tejedor N, Calle ML. Global adaptive rank truncated product method for gene-set analysis in association studies. Biom J 2014; 56:901-11. [PMID: 25082012 DOI: 10.1002/bimj.201300192] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/13/2013] [Revised: 02/18/2014] [Accepted: 04/18/2014] [Indexed: 11/10/2022]
Abstract
Gene set analysis (GSA) aims to assess the overall association of a set of genetic variants with a phenotype and has the potential to detect subtle effects of variants in a gene or a pathway that might be missed when assessed individually. We present a new implementation of the Adaptive Rank Truncated Product method (ARTP) for analyzing the association of a set of Single Nucleotide Polymorphisms (SNPs) in a gene or pathway. The new implementation, referred to as globalARTP, improves the original one by allowing the different SNPs in the set to have different modes of inheritance. We perform a simulation study for exploring the power of the proposed methodology in a set of scenarios with different numbers of causal SNPs with different effect sizes. Moreover, we show the advantage of using the gene set approach in the context of an Alzheimer's disease case-control study where we explore the endocytosis pathway. The new method is implemented in the R function globalARTP of the globalGSA package available at http://cran.r-project.org.
Affiliation(s)
- Natalia Vilor-Tejedor
  - Centre for Research in Environmental Epidemiology (CREAL), C. Doctor Aiguader, 88, 08003-Barcelona, Spain; Department of Experimental and Health Sciences, Pompeu Fabra University (UPF), Barcelona, Spain; CIBER Epidemiologia y Salud Publica (CIBERESP), Barcelona, Spain
- M Luz Calle
  - Department of Systems Biology, Bioinformatics and Medical Statistics Group, Universitat de Vic - Universitat Central de Catalunya, C. Sagrada Familia, 7, 08570-Vic, Spain
|
67
|
Lee YH, Song GG. Genome-wide pathway analysis of a genome-wide association study on Alzheimer's disease. Neurol Sci 2014; 36:53-9. [PMID: 25037741 DOI: 10.1007/s10072-014-1885-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/18/2014] [Accepted: 07/12/2014] [Indexed: 11/30/2022]
Abstract
The aims of this study were to identify candidate single nucleotide polymorphisms (SNPs) and mechanisms of Alzheimer's disease (AD) and to generate SNP to gene to pathway hypotheses. An AD genome-wide association study (GWAS) dataset that included 370,542 SNPs in 1,000 cases and 1,000 controls of European descent was used in this study. Identify Candidate Causal SNPs and Pathway (ICSNPathway) analysis was applied to the GWAS dataset. ICSNPathway analysis identified 3 candidate SNPs and 2 pathways, which provided 3 hypothetical biological mechanisms. The strongest hypothetical biological mechanism was rs8076604 [non-synonymous coding (deleterious)] to MYO18A to negative regulation of programmed cell death [nominal P < 0.001, false discovery rate (FDR) <0.043]. The second was rs2811226 (regulatory region) to ANXA1 to negative regulation of programmed cell death (nominal P < 0.001, FDR 0.043). The third was rs3734166 (non-synonymous coding) to CDC25C to M phase of the mitotic cell cycle (nominal P < 0.001, FDR 0.049). By applying the ICSNPathway analysis to the AD GWAS meta-analysis data, three candidate SNPs, three genes (MYO18A, ANXA1, CDC25C), 2 pathways involving negative regulation of programmed cell death and 1 pathway involving the M phase of the mitotic cell cycle were identified, which may contribute to AD susceptibility.
Affiliation(s)
- Young Ho Lee
  - Division of Rheumatology, Department of Internal Medicine, Korea University Anam Hospital, Korea University College of Medicine, 126-1 5 ga, Anam-dong, Seongbuk-gu, Seoul, 136-705, Korea
|
68
|
Kwon JS, Kim S. Gene-set based genome-wide association analysis for the speed of sound in two skeletal sites of Korean women. BMB Rep 2014; 47:348-53. [PMID: 24286325 PMCID: PMC4163867 DOI: 10.5483/bmbrep.2014.47.6.181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2013] [Indexed: 11/24/2022] Open
Abstract
The speed of sound (SOS) value is an indicator of bone mineral density (BMD). Previous genome-wide association (GWA) studies have identified a number of genes, whose variations may affect BMD levels. However, their biological implications have been elusive. We re-analyzed the GWA study dataset for the SOS values in skeletal sites of 4,659 Korean women, using a gene-set analysis software, GSA-SNP. We identified 10 common representative GO terms, and 17 candidate genes between these two traits (PGS < 0.05). Implication of these GO terms and genes in the bone mechanism is well supported by the literature survey. Interestingly, the significance levels of some member genes were inversely related, in several gene-sets that were shared between two skeletal sites. This implies that biological process, rather than SNP or gene, is the substantial unit of genetic association for SOS in bone. In conclusion, our findings may provide new insights into the biological mechanisms for BMD. [BMB Reports 2014; 47(6): 348-353]
Affiliation(s)
- Sangsoo Kim
  - Corresponding author. Tel: +82-2-820-0457; Fax: +82-2-824-4383; E-mail:
|
69
|
Greco B, Luedtke A, Hainline A, Alvarez C, Beck A, Tintle NL. Application of family-based tests of association for rare variants to pathways. BMC Proc 2014; 8:S105. [PMID: 25519359 PMCID: PMC4143675 DOI: 10.1186/1753-6561-8-s1-s105] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022] Open
Abstract
Pathway analysis approaches for sequence data typically either operate in a single stage (all variants within all genes in the pathway are combined into a single, very large set of variants that can then be analyzed using standard "gene-based" test statistics) or in 2-stages (gene-based p values are computed for all genes in the pathway, and then the gene-based p values are combined into a single pathway p value). To date, little consideration has been given to the performance of gene-based tests (typically designed for a smaller number of single-nucleotide variants [SNVs]) when the number of SNVs in the gene or in the pathway is very large and the genotypes come from sequence data organized in large pedigrees. We consider recently proposed gene-based tests for rare variants from complex pedigrees that test for association between a large set of SNVs and a qualitative phenotype of interest (1-stage analyses) as well as 2-stage approaches. We find that many of these methods show inflated type I errors when the number of SNVs in the gene or the pathway is large (>200 SNVs) and when using standard approaches to estimate the genotype covariance matrix. Alternative methods are needed when testing very large sets of SNVs in 1-stage approaches.
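The 2-stage idea described in this abstract (gene-based p values combined into a single pathway p value) can be illustrated with a minimal sketch. It assumes Fisher's method as the combining rule and assumes the gene-level p values are independent; the abstract does not name a specific combining rule, and `fisher_combine` is a hypothetical helper, not code from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    Illustrative assumption: the paper does not state which combining
    rule its 2-stage approaches use.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    # Fisher statistic: X = -2 * sum(log p_i); chi-square with 2k df under the null.
    x = -2.0 * sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical gene-level p-values for one pathway.
gene_p_values = [0.04, 0.20, 0.01, 0.65]
pathway_p = fisher_combine(gene_p_values)
```

Gene-level p values within a pathway are often correlated, so the independence assumption here is a simplification rather than a recommendation.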
Affiliation(s)
- Brian Greco
  - Department of Mathematics and Statistics, Grinnell College, 1115 8th Ave, Grinnell, IA 50112, USA
- Alexander Luedtke
  - Division of Biostatistics, UC Berkeley, 367 Evans Hall, Berkeley, CA 94720, USA
- Allison Hainline
  - Department of Statistics, Baylor University, 1511 S. 5th St, Waco, TX 76798, USA
- Carolina Alvarez
  - Department of Biostatistics, Florida International University, 11200 SW 8th St., Miami, FL 33199, USA
- Andrew Beck
  - Department of Mathematics, Loyola University Chicago, 1052 W Loyola Ave, Chicago, IL 60660, USA
- Nathan L Tintle
  - Department of Mathematics, Statistics and Computer Science, 498 4th Ave. NE, Dordt College, Sioux Center, IA 51250, USA
|
70
|
Koufariotis L, Chen YPP, Bolormaa S, Hayes BJ. Regulatory and coding genome regions are enriched for trait associated variants in dairy and beef cattle. BMC Genomics 2014; 15:436. [PMID: 24903263 PMCID: PMC4070550 DOI: 10.1186/1471-2164-15-436] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 10/17/2013] [Accepted: 05/22/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In livestock, as in humans, the number of genetic variants that can be tested for association with complex quantitative traits, or used in genomic predictions, is increasing exponentially as whole genome sequencing becomes more common. The power to identify variants associated with traits, particularly those of small effects, could be increased if certain regions of the genome were known a priori to be enriched for associations. Here, we investigate whether twelve genomic annotation classes were enriched or depleted for significant associations in genome wide association studies for complex traits in beef and dairy cattle. We also describe a variance component approach to determine the proportion of genetic variance captured by each annotation class. RESULTS P-values from large GWAS using 700K SNP in both dairy and beef cattle were available for 11 and 10 traits respectively. We found significant enrichment for trait associated variants (SNP significant in the GWAS) in the missense class along with regions 5 kilobases upstream and downstream of coding genes. We found that the non-coding conserved regions (across mammals) were not enriched for trait associated variants. The results from the enrichment or depletion analysis were not in complete agreement with the results from variance component analysis, where the missense and synonymous classes gave the greatest increase in variance explained, while the upstream and downstream classes showed a more modest increase in the variance explained. CONCLUSION Our results indicate that functional annotations could assist in prioritization of variants to a subset more likely to be associated with complex traits; including missense variants, and upstream and downstream regions. 
The differences in two sets of results (GWAS enrichment depletion versus variance component approaches) might be explained by the fact that the variance component approach has greater power to capture the cumulative effect of mutations of small effect, while the enrichment or depletion approach only captures the variants that are significant in GWAS, which is restricted to a limited number of common variants of moderate effects.
Affiliation(s)
- Lambros Koufariotis
  - Faculty of Science, Technology and Engineering, La Trobe University, Melbourne, Victoria 3086, Australia
|
71
|
Rodriguez-Fontenla C, Calaza M, Gonzalez A. Genetic distance as an alternative to physical distance for definition of gene units in association studies. BMC Genomics 2014; 15:408. [PMID: 24884992 PMCID: PMC4048458 DOI: 10.1186/1471-2164-15-408] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/22/2013] [Accepted: 05/20/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Some association studies, such as those implemented in VEGAS, ALIGATOR, i-GSEA4GWAS, GSA-SNP and other software tools, use genes as the unit of analysis. These genes include the coding sequence plus flanking sequences. Polymorphisms in the flanking sequences are of interest because they involve cis-regulatory elements or they inform on untyped genetic variants through linkage disequilibrium. Gene extensions have customarily been defined as ±50 Kb. This approach is not fully satisfactory because genetic relationships between neighbouring sequences are a function of genetic distances, which are only poorly replaced by physical distances. RESULTS Standardized recombination rates (SRR) from the deCODE recombination map were used as units of genetic distance. We searched for an SRR producing flanking sequences near the ±50 Kb offset that has been common in previous studies. An SRR≥2 was selected because it led to gene extensions with median length=45.3 Kb and has the simplicity of an integer value. As expected, boundaries of the genes defined with the ±50 Kb and with the SRR≥2 rules were rarely concordant. The impact of these differences was illustrated with the interpretation of top association signals from two large studies including many hits and their detailed analysis based on different criteria. The definition based on genetic distance was more concordant with the results of these studies than the one based on physical distance. In the analysis of 18 top disease-associated loci from the first study, the SRR≥2 genes led to a fully concordant interpretation in 17 loci; the ±50 Kb genes only in 6. Interpretation of the 43 putative functional genes of the second study based on the SRR≥2 definition only missed 4 of the genes, whereas the one based on the ±50 Kb definition missed 10 genes. CONCLUSIONS A gene definition based on genetic distance led to results more concordant with expert detailed analyses than the commonly used one based on physical distance. 
The genome coordinates for each gene are provided to maintain a simple use of the new definitions.
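The customary ±50 Kb physical-distance rule mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: `Gene`, `genes_for_snp`, and the example coordinates are hypothetical and not taken from the paper. The genetic-distance (SRR≥2) alternative would require a recombination map and is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # first base of the gene, inclusive
    end: int    # last base of the gene, inclusive


def genes_for_snp(chrom, pos, genes, flank_bp=50_000):
    """Return names of genes whose extended window (gene ± flank_bp) contains the SNP."""
    return [
        g.name
        for g in genes
        if g.chrom == chrom and g.start - flank_bp <= pos <= g.end + flank_bp
    ]


# Hypothetical genes and SNP position, for illustration only.
example_genes = [
    Gene("GENE_A", "1", 100_000, 120_000),
    Gene("GENE_B", "1", 200_000, 250_000),
]
hits = genes_for_snp("1", 160_000, example_genes)
```

With a fixed flank, a SNP lying between two nearby genes is assigned to both, which is one reason the choice of gene boundaries affects downstream gene-set results.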
Affiliation(s)
- Antonio Gonzalez
  - Laboratorio de Investigacion 10 and Rheumatology Unit, Instituto de Investigacion Sanitaria - Hospital Clinico Universitario de Santiago, Santiago de Compostela, Spain
|
72
|
Lee YH, Kim JH, Song GG. Genome-wide pathway analysis of breast cancer. Tumour Biol 2014; 35:7699-705. [PMID: 24805830 DOI: 10.1007/s13277-014-2027-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 01/14/2014] [Accepted: 04/28/2014] [Indexed: 12/15/2022] Open
Abstract
The aim of this study was to identify candidate single-nucleotide polymorphisms (SNPs) that might affect susceptibility to breast cancer and then elucidate their potential mechanisms and generate SNP-to-gene-to-pathway hypotheses. A genome-wide association study (GWAS) dataset of breast cancer that included 453,852 SNPs from 1,145 breast cancer patients and 1,142 control subjects of European descent was used in this study. The identify candidate causal SNPs and pathways (ICSNPathway) method was applied to the GWAS dataset. ICSNPathway analysis identified 16 candidate SNPs, 13 genes, and 7 pathways, which together revealed 7 hypothetical biological mechanisms. The strongest hypothetical biological mechanism was that rs3168891 and rs2899849 alter the role of MBIP in the inactivation of mitogen-activated protein kinase (MAPK) (p < 0.001; false discovery rate (FDR) = 0.038). The second strongest mechanism was that rs2229714 modulates RPS6KA1 to affect its role in growth hormone signaling (p = 0.001; FDR = 0.039). The third strongest mechanism was that rs2230394 modulates ITGB1 to regulate the PTEN pathway and hsa04360 (axon guidance pathway) (p < 0.001; FDR = 0.039, 0.041). Use of the ICSNPathway to analyze breast cancer GWAS data identified 16 candidate SNPs, 13 genes (including MBIP, RPS6KA1, and ITGB1), and 7 pathways that might contribute to the susceptibility of patients to breast cancer.
Affiliation(s)
- Young Ho Lee
  - Division of Rheumatology, Department of Internal Medicine, Korea University Anam Hospital, Korea University College of Medicine, 126-1 5 ga, Anam-dong, Seongbuk-gu, Seoul, 136-705, Korea
|
73
|
Ponzoni I, Nueda M, Tarazona S, Götz S, Montaner D, Dussaut J, Dopazo J, Conesa A. Pathway network inference from gene expression data. BMC Syst Biol 2014; 8 Suppl 2:S7. [PMID: 25032889 PMCID: PMC4101702 DOI: 10.1186/1752-0509-8-s2-s7] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 02/07/2023]
|
74
|
Mukherjee S, Kim S, Ramanan VK, Gibbons LE, Nho K, Glymour MM, Ertekin-Taner N, Montine TJ, Saykin AJ, Crane PK. Gene-based GWAS and biological pathway analysis of the resilience of executive functioning. Brain Imaging Behav 2014; 8:110-8. [PMID: 24072271 PMCID: PMC3944472 DOI: 10.1007/s11682-013-9259-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 10/26/2022]
Abstract
Resilience in executive functioning (EF) is characterized by high EF measured by neuropsychological test performance despite structural brain damage from neurodegenerative conditions. We previously reported single nucleotide polymorphism (SNP) genome-wide association study (GWAS) results for EF resilience. Here, we report gene- and pathway-based analyses of the same resilience phenotype, using an optimal SNP-set (Sequence) Kernel Association Test (SKAT) for gene-based analyses (conservative threshold for genome-wide significance = 0.05/18,123 = 2.8 × 10^-6) and the gene-set enrichment package GSA-SNP for biological pathway analyses (false discovery rate (FDR) < 0.05). Gene-based analyses found a genome-wide significant association between RNASE13 and EF resilience (p = 1.33 × 10^-7). Genetic pathways involved with dendritic/neuron spine, presynaptic membrane, postsynaptic density, etc., were enriched with association to EF resilience. Although replication of these results is necessary, our findings indicate the potential value of gene- and pathway-based analyses in research on determinants of cognitive resilience.
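The conservative significance threshold quoted in this abstract is a Bonferroni correction: the family-wise level of 0.05 divided by the 18,123 gene-based tests. A short check of that arithmetic, using only numbers stated in the abstract (the function and variable names are illustrative):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 18_123)  # about 2.76e-06
rnase13_p = 1.33e-7                             # gene-based p value reported for RNASE13

print(f"{threshold:.1e}")     # prints 2.8e-06
print(rnase13_p < threshold)  # prints True
```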
Affiliation(s)
- Shubhabrata Mukherjee
  - Department of Medicine, University of Washington, Box 359780, 325 Ninth Avenue, Seattle, WA, 98104, USA
|
75
|
Azencott CA, Grimm D, Sugiyama M, Kawahara Y, Borgwardt KM. Efficient network-guided multi-locus association mapping with graph cuts. Bioinformatics 2013; 29:i171-9. [PMID: 23812981 PMCID: PMC3694644 DOI: 10.1093/bioinformatics/btt238] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION As an increasing number of genome-wide association studies reveal the limitations of the attempt to explain phenotypic heritability by single genetic loci, there is a recent focus on associating complex phenotypes with sets of genetic loci. Although several methods for multi-locus mapping have been proposed, it is often unclear how to relate the detected loci to the growing knowledge about gene pathways and networks. The few methods that take biological pathways or networks into account are either restricted to investigating a limited number of predetermined sets of loci or do not scale to genome-wide settings. RESULTS We present SConES, a new efficient method to discover sets of genetic loci that are maximally associated with a phenotype while being connected in an underlying network. Our approach is based on a minimum cut reformulation of the problem of selecting features under sparsity and connectivity constraints, which can be solved exactly and rapidly. SConES outperforms state-of-the-art competitors in terms of runtime, scales to hundreds of thousands of genetic loci and exhibits higher power in detecting causal SNPs in simulation studies than other methods. On flowering time phenotypes and genotypes from Arabidopsis thaliana, SConES detects loci that enable accurate phenotype prediction and that are supported by the literature. AVAILABILITY Code is available at http://webdav.tuebingen.mpg.de/u/karsten/Forschung/scones/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chloé-Agathe Azencott
  - Machine Learning and Computational Biology Research Group, Max Planck Institute for Developmental Biology & Max Planck Institute for Intelligent Systems, Spemannstr 38, 72076 Tübingen, Germany
|
76
|
Fernández RM, Bleda M, Luzón-Toro B, García-Alonso L, Arnold S, Sribudiani Y, Besmond C, Lantieri F, Doan B, Ceccherini I, Lyonnet S, Hofstra RMW, Chakravarti A, Antiñolo G, Dopazo J, Borrego S. Pathways systematically associated to Hirschsprung's disease. Orphanet J Rare Dis 2013; 8:187. [PMID: 24289864 PMCID: PMC3879038 DOI: 10.1186/1750-1172-8-187] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/23/2013] [Accepted: 11/19/2013] [Indexed: 02/08/2023] Open
Abstract
Although several loci have been reported to be involved in Hirschsprung's disease (HSCR), the molecular basis of the disease remains essentially unknown. The study of collective properties of modules of functionally related genes provides an efficient and sensitive statistical framework that can overcome sample size limitations in the study of rare diseases. Here, we present the extension of a previous study of a Spanish series of HSCR trios to an international cohort of 162 HSCR trios to validate the generality of the underlying functional basis of the Hirschsprung's disease mechanisms previously found. The Pathway-Based Analysis (PBA) confirms a strong association of gene ontology (GO) modules related to signal transduction and its regulation, enteric nervous system (ENS) formation and other processes related to the disease. In addition, network analysis recovers sub-networks significantly associated with the disease, which contain genes related to the same functionalities, thus providing an independent validation of these findings. The functional profiles of association obtained for patient populations from different countries were compared to each other. While gene associations differed in each series, the main functional associations were identical in all five populations. These observations would also explain the reported low reproducibility of associations of individual disease genes across populations.
Affiliation(s)
- Raquel M Fernández
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Av. Manuel Siurot s/n, Seville, 41013, Spain
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Valencia, Spain
- Marta Bleda
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Valencia, Spain
  - Department of Computational Genomics, Centro de Investigación Príncipe Felipe (CIPF), c/Eduardo Primo Yufera, 3, Valencia, 46012, Spain
- Berta Luzón-Toro
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Av. Manuel Siurot s/n, Seville, 41013, Spain
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Valencia, Spain
- Luz García-Alonso
  - Department of Computational Genomics, Centro de Investigación Príncipe Felipe (CIPF), c/Eduardo Primo Yufera, 3, Valencia, 46012, Spain
- Stacey Arnold
  - Center for Complex Disease Genomics, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Yunia Sribudiani
  - Department of Medical Genetics, University of Groningen, Groningen, The Netherlands
- Claude Besmond
  - INSERM U-781, AP-HP Hôpital Necker-Enfants Malades, Paris, France
- Betty Doan
  - Center for Complex Disease Genomics, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Robert MW Hofstra
  - Department of Medical Genetics, University of Groningen, Groningen, The Netherlands
- Aravinda Chakravarti
  - Center for Complex Disease Genomics, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Guillermo Antiñolo
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Av. Manuel Siurot s/n, Seville, 41013, Spain
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Valencia, Spain
- Joaquín Dopazo
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Valencia, Spain
  - Department of Computational Genomics, Centro de Investigación Príncipe Felipe (CIPF), c/Eduardo Primo Yufera, 3, Valencia, 46012, Spain
  - Functional Genomics Node (INB), CIPF, Valencia, Spain
- Salud Borrego
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Av. Manuel Siurot s/n, Seville, 41013, Spain
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Valencia, Spain
|
77
|
Genome-wide pathway analysis in neuroblastoma. Tumour Biol 2013; 35:3471-85. [PMID: 24293394 DOI: 10.1007/s13277-013-1459-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 10/31/2013] [Accepted: 11/19/2013] [Indexed: 01/16/2023] Open
Abstract
The aim of this study was to identify candidate single-nucleotide polymorphisms (SNPs) that might play a role in susceptibility to neuroblastoma, elucidate their potential mechanisms, and generate SNP-to-gene-to-pathway hypotheses. A genome-wide association study (GWAS) dataset of neuroblastoma that included 442,976 SNPs from 1,627 neuroblastoma patients and 3,254 control subjects of European descent was used in this study. The identify candidate causal SNPs and pathways (ICSNPathway) analysis was applied to the GWAS dataset. ICSNPathway analysis identified 15 candidate SNPs, 10 genes, and 31 pathways, which revealed 10 hypothetical biological mechanisms. The strongest hypothetical biological mechanism was one wherein SNP rs40401 modulates the role of IL3 in several pathways and conditions, including the stem pathway, asthma (hsa05310), the dendritic cell pathway, and development (0.001 < p < 0.004; 0.001 < FDR < 0.033). The second strongest mechanism identified was that in which rs1048108 and rs16852600 alter the function of BARD1, which negatively regulates developmental process and modulates processes including cell development and programmed cell death (0.001 < p < 0.004; 0.001 < FDR < 0.033). The third mechanism identified was one wherein rs1939212 modulated CFL1, resulting in negative regulation of development, cell death, neural crest cell migration, and apoptosis (0.001 < p < 0.004; 0.001 < FDR < 0.033). By using the ICSNPathway to analyze neuroblastoma GWAS data, 15 candidate SNPs, 10 genes including IL3, BARD1, and CFL1, and 31 pathways were identified that might contribute to the susceptibility of patients to neuroblastoma.
|
78
|
Silver M, Chen P, Li R, Cheng CY, Wong TY, Tai ES, Teo YY, Montana G. Pathways-driven sparse regression identifies pathways and genes associated with high-density lipoprotein cholesterol in two Asian cohorts. PLoS Genet 2013; 9:e1003939. [PMID: 24278029 PMCID: PMC3836716 DOI: 10.1371/journal.pgen.1003939] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 03/05/2013] [Accepted: 09/11/2013] [Indexed: 01/11/2023] Open
Abstract
Standard approaches to data analysis in genome-wide association studies (GWAS) ignore any potential functional relationships between gene variants. In contrast gene pathways analysis uses prior information on functional structure within the genome to identify pathways associated with a trait of interest. In a second step, important single nucleotide polymorphisms (SNPs) or genes may be identified within associated pathways. The pathways approach is motivated by the fact that genes do not act alone, but instead have effects that are likely to be mediated through their interaction in gene pathways. Where this is the case, pathways approaches may reveal aspects of a trait's genetic architecture that would otherwise be missed when considering SNPs in isolation. Most pathways methods begin by testing SNPs one at a time, and so fail to capitalise on the potential advantages inherent in a multi-SNP, joint modelling approach. Here, we describe a dual-level, sparse regression model for the simultaneous identification of pathways and genes associated with a quantitative trait. Our method takes account of various factors specific to the joint modelling of pathways with genome-wide data, including widespread correlation between genetic predictors, and the fact that variants may overlap multiple pathways. We use a resampling strategy that exploits finite sample variability to provide robust rankings for pathways and genes. We test our method through simulation, and use it to perform pathways-driven gene selection in a search for pathways and genes associated with variation in serum high-density lipoprotein cholesterol levels in two separate GWAS cohorts of Asian adults. By comparing results from both cohorts we identify a number of candidate pathways including those associated with cardiomyopathy, and T cell receptor and PPAR signalling. 
Highlighted genes include those associated with the L-type calcium channel, adenylate cyclase, integrin, laminin, MAPK signalling and immune function.
Affiliation(s)
- Matt Silver
- Statistics Section, Department of Mathematics, Imperial College, London, United Kingdom
- MRC International Nutrition Group, London School of Hygiene and Tropical Medicine, London, United Kingdom
- * E-mail:
| | - Peng Chen
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore
| | - Ruoying Li
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore
| | - Ching-Yu Cheng
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore
- Department of Ophthalmology, National University of Singapore, Singapore
- Singapore Eye Research Institute, Singapore National Eye Center, Singapore
| | - Tien-Yin Wong
- Department of Ophthalmology, National University of Singapore, Singapore
- Singapore Eye Research Institute, Singapore National Eye Center, Singapore
| | - E-Shyong Tai
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore
| | - Yik-Ying Teo
- Saw Swee Hock School of Public Health, National University of Singapore, Singapore
- NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore
- Life Sciences Institute, National University of Singapore, Singapore
- Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore
- Department of Statistics and Applied Probability, National University of Singapore, Singapore
| | - Giovanni Montana
- Statistics Section, Department of Mathematics, Imperial College, London, United Kingdom
|
79
|
Nelson RM, Pettersson ME, Carlborg Ö. A century after Fisher: time for a new paradigm in quantitative genetics. Trends Genet 2013; 29:669-76. [PMID: 24161664 DOI: 10.1016/j.tig.2013.09.006] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.9] [Received: 06/25/2013] [Revised: 09/17/2013] [Accepted: 09/19/2013] [Indexed: 10/26/2022]
Abstract
Quantitative genetics traces its roots back through more than a century of theory, largely formed in the absence of directly observable genotype data, and has remained essentially unchanged for decades. By contrast, molecular genetics arose from direct observations and is currently undergoing rapid changes, making the amount of available data ever greater. Thus, the two disciplines are disparate both in their origins and their current states, yet they address the same fundamental question: how does the genotype affect the phenotype? The rapidly accumulating genomic data necessitate sophisticated analysis, but many of the current tools are adaptations of methods designed during the early days of quantitative genetics. We argue here that the present analysis paradigm in quantitative genetics is at its limits in regards to unraveling complex traits and it is necessary to re-evaluate the direction that genetic research is taking for the field to realize its full potential.
Affiliation(s)
- Ronald M Nelson
- Swedish University of Agricultural Sciences, Department of Clinical Sciences, Division of Computational Genetics, Box 7078, SE-750 07 Uppsala, Sweden.
|
80
|
Bergen WG, Burnett DD. Topics in transcriptional control of lipid metabolism: from transcription factors to gene-promoter polymorphisms. J Genomics 2013; 1:13-21. [PMID: 25031651 PMCID: PMC4091433 DOI: 10.7150/jgen.3741] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023] Open
Abstract
The central dogma of biology (DNA>>RNA>>Protein) has remained as an extremely useful scaffold to guide the study of molecular regulation of cellular metabolism. Molecular regulation of cellular metabolism has been pursued from an individual enzyme to a global assessment of protein function at the genomic (DNA), transcriptomic (RNA) and translation (Protein) levels. Details of a key role by inhibitory small RNAs and post-translational processing of cellular proteins on a whole cell/global basis are now just emerging. Below we emphasize the role of transcription factors (TF) in regulation of adipogenesis and lipogenesis. Additionally we have also focused on emerging additional TF that may also have hitherto unrecognized roles in adipogenesis and lipogenesis as compared to our present understanding. It is generally recognized that SNPs in structural genes can affect the final structure/function of a given protein. The implications of SNPs located in the non-transcribed promoter region on transcription have not been examined as extensively at this time. Here we have also summarized some emerging results on promoter SNPs for lipid metabolism and related cellular processes.
Affiliation(s)
- Werner G Bergen
- Program in Cellular and Molecular Biosciences, Department of Animal Sciences, Auburn University, Alabama, 36849-5415, USA
| | - Derris D Burnett
- Program in Cellular and Molecular Biosciences, Department of Animal Sciences, Auburn University, Alabama, 36849-5415, USA
|
81
|
Winham SJ, Biernacka JM. Gene-environment interactions in genome-wide association studies: current approaches and new directions. J Child Psychol Psychiatry 2013; 54:1120-34. [PMID: 23808649 PMCID: PMC3829379 DOI: 10.1111/jcpp.12114] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Accepted: 05/03/2013] [Indexed: 01/20/2023]
Abstract
BACKGROUND Complex psychiatric traits have long been thought to be the result of a combination of genetic and environmental factors, and gene-environment interactions are thought to play a crucial role in behavioral phenotypes and the susceptibility and progression of psychiatric disorders. Candidate gene studies to investigate hypothesized gene-environment interactions are now fairly common in human genetic research, and with the shift toward genome-wide association studies, genome-wide gene-environment interaction studies are beginning to emerge. METHODS We summarize the basic ideas behind gene-environment interaction, and provide an overview of possible study designs and traditional analysis methods in the context of genome-wide analysis. We then discuss novel approaches beyond the traditional strategy of analyzing the interaction between the environmental factor and each polymorphism individually. RESULTS Two-step filtering approaches that reduce the number of polymorphisms tested for interactions can substantially increase the power of genome-wide gene-environment studies. New analytical methods including data-mining approaches, and gene-level and pathway-level analyses, also have the capacity to improve our understanding of how complex genetic and environmental factors interact to influence psychologic and psychiatric traits. Such methods, however, have not yet been utilized much in behavioral and mental health research. CONCLUSIONS Although methods to investigate gene-environment interactions are available, there is a need for further development and extension of these methods to identify gene-environment interactions in the context of genome-wide association studies. These novel approaches need to be applied in studies of psychology and psychiatry.
Affiliation(s)
- Stacey J Winham
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester MN 55905
| | - Joanna M. Biernacka
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester MN 55905,Department of Psychiatry and Psychology, Mayo Clinic, Rochester MN 55905
|
82
|
Verschuren JJW, Trompet S, Sampietro ML, Heijmans BT, Koch W, Kastrati A, Houwing-Duistermaat JJ, Slagboom PE, Quax PHA, Jukema JW. Pathway analysis using genome-wide association study data for coronary restenosis--a potential role for the PARVB gene. PLoS One 2013; 8:e70676. [PMID: 23950981 PMCID: PMC3739784 DOI: 10.1371/journal.pone.0070676] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/26/2013] [Accepted: 06/21/2013] [Indexed: 12/20/2022] Open
Abstract
Background Coronary restenosis after percutaneous coronary intervention (PCI) still remains a significant limitation of the procedure. The causative mechanisms of restenosis have not yet been fully identified. The goal of the current study was to perform gene-set analysis of biological pathways related to inflammation, proliferation, vascular function and transcriptional regulation on coronary restenosis to identify novel genes and pathways related to this condition. Methods The GENetic DEterminants of Restenosis (GENDER) databank contains genotypic data of 556,099 SNPs of 295 cases with restenosis and 571 matched controls. Fifty-four pathways, related to known restenosis-related processes, were selected. Gene-set analysis was performed using PLINK, GRASS and ALIGATOR software. Pathways with a p<0.01 were fine-mapped and significantly associated SNPs were analyzed in an independent replication cohort. Results Six pathways (cell-extracellular matrix (ECM) interactions pathway, IL2 signaling pathway, IL6 signaling pathway, platelet derived growth factor pathway, vitamin D receptor pathway and the mitochondria pathway) were significantly associated in one or two of the software packages. Two SNPs in the cell-ECM interactions pathway were replicated in an independent restenosis cohort. No replication was obtained for the other pathways. Conclusion With these results we demonstrate a potential role of the cell-ECM interactions pathway in the development of coronary restenosis. These findings contribute to the increasing knowledge of the genetic etiology of restenosis formation and could serve as a hypothesis-generating effort for further functional studies.
Affiliation(s)
| | - Stella Trompet
- Department of Cardiology, Leiden University Medical Center, Leiden, The Netherlands
- Department of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, The Netherlands
| | - M. Lourdes Sampietro
- Department Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Interuniversity Cardiology Institute of the Netherlands (ICIN), Utrecht, The Netherlands
| | - Bastiaan T. Heijmans
- Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
- Netherlands Consortium for Healthy Ageing, Leiden University Medical Center, Leiden, The Netherlands
| | - Werner Koch
- Deutsches Herzzentrum München, Technische Universität München, Munich, Germany
| | - Adnan Kastrati
- Deutsches Herzzentrum München, Technische Universität München, Munich, Germany
| | | | - P. Eline Slagboom
- Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
- Netherlands Consortium for Healthy Ageing, Leiden University Medical Center, Leiden, The Netherlands
| | - Paul H. A. Quax
- Department of Vascular Surgery, Leiden University Medical Center, Leiden, The Netherlands
| | - J. Wouter Jukema
- Department of Cardiology, Leiden University Medical Center, Leiden, The Netherlands
- Netherlands Consortium for Healthy Ageing, Leiden University Medical Center, Leiden, The Netherlands
- Durrer Center for Cardiogenetic Research, Amsterdam, The Netherlands
- Interuniversity Cardiology Institute of the Netherlands (ICIN), Utrecht, The Netherlands
|
83
|
Genome-wide pathway analysis of memory impairment in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort implicates gene candidates, canonical pathways, and networks. Brain Imaging Behav 2013; 6:634-48. [PMID: 22865056 DOI: 10.1007/s11682-012-9196-x] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 12/13/2022]
Abstract
Memory deficits are prominent features of mild cognitive impairment (MCI) and Alzheimer's disease (AD). The genetic architecture underlying these memory deficits likely involves the combined effects of multiple genetic variants operative within numerous biological pathways. In order to identify functional pathways associated with memory impairment, we performed a pathway enrichment analysis on genome-wide association data from 742 Alzheimer's Disease Neuroimaging Initiative (ADNI) participants. A composite measure of memory was generated as the phenotype for this analysis by applying modern psychometric theory to item-level data from the ADNI neuropsychological test battery. Using the GSA-SNP software tool, we identified 27 canonical, expertly-curated pathways with enrichment (FDR-corrected p-value < 0.05) against this composite memory score. Processes classically understood to be involved in memory consolidation, such as neurotransmitter receptor-mediated calcium signaling and long-term potentiation, were highly represented among the enriched pathways. In addition, pathways related to cell adhesion, neuronal differentiation and guided outgrowth, and glucose- and inflammation-related signaling were also enriched. Among genes that were highly-represented in these enriched pathways, we found indications of coordinated relationships, including one large gene set that is subject to regulation by the SP1 transcription factor, and another set that displays co-localized expression in normal brain tissue along with known AD risk genes. These results 1) demonstrate that psychometrically-derived composite memory scores are an effective phenotype for genetic investigations of memory impairment and 2) highlight the promise of pathway analysis in elucidating key mechanistic targets for future studies and for therapeutic interventions.
|
84
|
Genome-Wide Pathway Analysis in Major Depressive Disorder. J Mol Neurosci 2013; 51:428-36. [DOI: 10.1007/s12031-013-0047-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 02/12/2013] [Accepted: 06/06/2013] [Indexed: 01/23/2023]
|
85
|
Brookes K. The VNTR in complex disorders: The forgotten polymorphisms? A functional way forward? Genomics 2013; 101:273-81. [DOI: 10.1016/j.ygeno.2013.03.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 02/03/2013] [Revised: 03/08/2013] [Accepted: 03/11/2013] [Indexed: 12/16/2022]
|
86
|
Zhang X, Yang X, Yuan Z, Liu Y, Li F, Peng B, Zhu D, Zhao J, Xue F. A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design. PLoS One 2013; 8:e62129. [PMID: 23620809 PMCID: PMC3631168 DOI: 10.1371/journal.pone.0062129] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/22/2012] [Accepted: 03/19/2013] [Indexed: 12/22/2022] Open
Abstract
For genome-wide association data analysis, two genes in any pathway, two SNPs in the two linked gene regions respectively or in the two linked exons respectively within one gene are often correlated with each other. We therefore proposed the concept of gene-gene co-association, which refers to the effects not only due to the traditional interaction under nearly independent condition but the correlation between two genes. Furthermore, we constructed a novel statistic for detecting gene-gene co-association based on Partial Least Squares Path Modeling (PLSPM). Through simulation, the relationship between traditional interaction and co-association was highlighted under three different types of co-association. Both simulation and real data analysis demonstrated that the proposed PLSPM-based statistic has better performance than single SNP-based logistic model, PCA-based logistic model, and other gene-based methods.
Affiliation(s)
- Xiaoshuai Zhang
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Xiaowei Yang
- Hunter College - School of Public Health, City University of New York, New York City, New York, United States of America
- Bayessoft, Inc., Davis, California, United States of America
| | - Zhongshang Yuan
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Yanxun Liu
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Fangyu Li
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
| | - Bin Peng
- School of Public Health, Chongqing Medical University, Chongqing, China
| | - Dianwen Zhu
- Hunter College - School of Public Health, City University of New York, New York City, New York, United States of America
| | - Jinghua Zhao
- MRC Epidemiology Unit and Institute of Metabolic Science, Cambridge, United Kingdom
| | - Fuzhong Xue
- Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan, China
|
87
|
Jones-Davis DM, Yang M, Rider E, Osbun NC, da Gente GJ, Li J, Katz AM, Weber MD, Sen S, Crawley J, Sherr EH. Quantitative trait loci for interhemispheric commissure development and social behaviors in the BTBR T⁺ tf/J mouse model of autism. PLoS One 2013; 8:e61829. [PMID: 23613947 PMCID: PMC3626795 DOI: 10.1371/journal.pone.0061829] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Received: 06/28/2012] [Accepted: 03/18/2013] [Indexed: 12/21/2022] Open
Abstract
Background Autism and Agenesis of the Corpus Callosum (AgCC) are interrelated behavioral and anatomic phenotypes whose genetic etiologies are incompletely understood. We used the BTBR T+tf/J (BTBR) strain, exhibiting fully penetrant AgCC, a diminished hippocampal commissure, and abnormal behaviors that may have face validity to autism, to study the genetic basis of these disorders. Methods We generated 410 progeny from an F2 intercross between the BTBR and C57BL/6J strains. The progeny were phenotyped for social behaviors (as juveniles and adults) and commissural morphology, and genotyped using 458 markers. Quantitative trait loci (QTL) were identified using genome scans; significant loci were fine-mapped, and the BTBR genome was sequenced and analyzed to identify candidate genes. Results Six QTL meeting genome-wide significance for three autism-relevant behaviors in BTBR were identified on chromosomes 1, 3, 9, 10, 12, and X. Four novel QTL for commissural morphology on chromosomes 4, 6, and 12 were also identified. We identified a highly significant QTL (LOD score = 20.2) for callosal morphology on the distal end of chromosome 4. Conclusions We identified several QTL and candidate genes for both autism-relevant traits and commissural morphology in the BTBR mouse. Twenty-nine candidate genes were associated with synaptic activity, axon guidance, and neural development. This is consistent with a role for these processes in modulating white matter tract development and aspects of autism-relevant behaviors in the BTBR mouse. Our findings reveal candidate genes in a mouse model that will inform future human and preclinical studies of autism and AgCC.
Affiliation(s)
- Dorothy M. Jones-Davis
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
| | - Mu Yang
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Eric Rider
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
| | - Nathan C. Osbun
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
| | - Gilberto J. da Gente
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
| | - Jiang Li
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
| | - Adam M. Katz
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Michael D. Weber
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Saunak Sen
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, United States of America
| | - Jacqueline Crawley
- Laboratory of Behavioral Neuroscience, Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Elliott H. Sherr
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
|
88
|
Genome-wide gene-set analysis for identification of pathways associated with alcohol dependence. Int J Neuropsychopharmacol 2013; 16:271-8. [PMID: 22717047 PMCID: PMC3854955 DOI: 10.1017/s1461145712000375] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 11/07/2022] Open
Abstract
It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene-set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol-dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the 'synthesis and degradation of ketone bodies' pathway. Our results also support the potential involvement of the 'neuroactive ligand-receptor interaction' pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence.
|
89
|
Deelen J, Uh HW, Monajemi R, van Heemst D, Thijssen PE, Böhringer S, van den Akker EB, de Craen AJM, Rivadeneira F, Uitterlinden AG, Westendorp RGJ, Goeman JJ, Slagboom PE, Houwing-Duistermaat JJ, Beekman M. Gene set analysis of GWAS data for human longevity highlights the relevance of the insulin/IGF-1 signaling and telomere maintenance pathways. AGE (DORDRECHT, NETHERLANDS) 2013; 35:235-49. [PMID: 22113349 PMCID: PMC3543749 DOI: 10.1007/s11357-011-9340-3] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 07/07/2011] [Accepted: 10/28/2011] [Indexed: 05/22/2023]
Abstract
In genome-wide association studies (GWAS) of complex traits, single SNP analysis is still the most applied approach. However, the identified SNPs have small effects and provide limited biological insight. A more appropriate approach to interpret GWAS data of complex traits is to analyze the combined effect of a SNP set grouped per pathway or gene region. We used this approach to study the joint effect on human longevity of genetic variation in two candidate pathways, the insulin/insulin-like growth factor (IGF-1) signaling (IIS) pathway and the telomere maintenance (TM) pathway. For the analyses, we used genotyped GWAS data of 403 unrelated nonagenarians from long-lived sibships collected in the Leiden Longevity Study and 1,670 younger population controls. We analyzed 1,021 SNPs in 68 IIS pathway genes and 88 SNPs in 13 TM pathway genes using four self-contained pathway tests (PLINK set-based test, Global test, GRASS and SNP ratio test). Although we observed small differences between the results of the different pathway tests, they showed consistent significant association of the IIS and TM pathway SNP sets with longevity. Analysis of gene SNP sets from these pathways indicates that the association of the IIS pathway is scattered over several genes (AKT1, AKT3, FOXO4, IGF2, INS, PIK3CA, SGK, SGK2, and YWHAG), while the association of the TM pathway seems to be mainly determined by one gene (POT1). In conclusion, this study shows that genetic variation in genes involved in the IIS and TM pathways is associated with human longevity.
Affiliation(s)
- Joris Deelen
- Section of Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands.
|
90
|
Chen YC, Carter H, Parla J, Kramer M, Goes FS, Pirooznia M, Zandi PP, McCombie WR, Potash JB, Karchin R. A hybrid likelihood model for sequence-based disease association studies. PLoS Genet 2013; 9:e1003224. [PMID: 23358228 PMCID: PMC3554549 DOI: 10.1371/journal.pgen.1003224] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/11/2012] [Accepted: 11/21/2012] [Indexed: 11/18/2022] Open
Abstract
In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitation of classical single-marker association analysis for rare variants has been a challenge in such studies. A new generation of statistical methods for case-control association studies has been developed to meet this challenge. A common approach to association analysis of rare variants is the burden-style collapsing methods to combine rare variant data within individuals across or within genes. Here, we propose a new hybrid likelihood model that combines a burden test with a test of the position distribution of variants. In extensive simulations and on empirical data from the Dallas Heart Study, the new model demonstrates consistently good power, in particular when applied to a gene set (e.g., multiple candidate genes with shared biological function or pathway), when rare variants cluster in key functional regions of a gene, and when protective variants are present. When applied to data from an ongoing sequencing study of bipolar disorder (191 cases, 107 controls), the model identifies seven gene sets with nominal p-values ≤ 0.05, of which one MAPK signaling pathway (KEGG) reaches trend-level significance after correcting for multiple testing.
Inexpensive, high-throughput sequencing has transformed the field of case-control association studies. For the first time, it may be possible to identify the genetic underpinnings of complex diseases, by sequencing the DNA of hundreds (even thousands) of cases and controls and comparing patterns of DNA sequence variation. However, complex diseases are likely to be caused by many variants, some of which are very rare. Taken one at a time, the association between variant and disease phenotype may not be detectable by current statistical methods. One strategy is to identify regions where important variants occur by “collapsing” variants into groups. Here, we present a new collapsing approach, capable of detecting subtle genetic differences between cases and controls. We show, in extensive simulations and using a benchmark set of genes involved in human triglyceride levels, that the approach is potentially more powerful than existing methods. We apply the new method to an ongoing sequencing study of bipolar cases and controls and identify a set of genes found in neuronal synapses, which may be implicated in bipolar disorder.
Affiliation(s)
- Yun-Ching Chen
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Hannah Carter
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Jennifer Parla
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Melissa Kramer
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Fernando S. Goes
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - Mehdi Pirooznia
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - Peter P. Zandi
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
| | - W. Richard McCombie
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - James B. Potash
- Department of Psychiatry, University of Iowa, Iowa City, Iowa, United States of America
| | - Rachel Karchin
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
|
91
|
Cicek MS, Maurer MJ, Goode EL. Survival prediction based on inherited gene variation analysis. Methods Mol Biol 2013; 1049:53-64. [PMID: 23913208 DOI: 10.1007/978-1-62703-547-7_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/05/2022]
Abstract
There is a significant variation of outcome among ovarian cases. Clinical features such as age, stage, comorbidities, or degree of debulking are known prognostic factors for the disease. However, additional variation remains unexplained, some of which may be due to inherited factors. Here, we describe identification of survival-associated inherited variants in ovarian cancer that can enhance our current prognostic capabilities.
Affiliation(s)
- Mine S Cicek
- Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN, USA
|
92
|
Genome-wide pathway analysis of a genome-wide association study on multiple sclerosis. Mol Biol Rep 2012; 40:2557-64. [DOI: 10.1007/s11033-012-2341-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 01/19/2012] [Accepted: 12/09/2012] [Indexed: 11/27/2022]
|
93
|
Granfors M, Karypidis H, Hosseini F, Skjöldebrand-Sparre L, Stavreus-Evers A, Bremme K, Landgren BM, Sundström-Poromaa I, Wikström AK, Åkerud H. Phosphodiesterase 8B gene polymorphism in women with recurrent miscarriage: a retrospective case control study. BMC MEDICAL GENETICS 2012; 13:121. [PMID: 23237535 PMCID: PMC3556309 DOI: 10.1186/1471-2350-13-121] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/19/2012] [Accepted: 12/10/2012] [Indexed: 11/10/2022]
Abstract
Background: Recurrent miscarriage affects approximately 1% of all couples. There is a known relation between hypothyroidism and recurrent miscarriage. Phosphodiesterase 8B (PDE8B) is a regulator of cyclic adenosine monophosphate (cAMP) with an important influence on human thyroid metabolism. The single nucleotide polymorphism (SNP) rs4704397 in the PDE8B gene has been shown to be associated with variations in serum thyroid-stimulating hormone (TSH) and thyroxine (T4) levels. The aim of this study was to investigate whether there is an association between SNP rs4704397 in the PDE8B gene and recurrent miscarriage. Methods: The study was designed as a retrospective case control study. A total of 188 cases with recurrent miscarriage were included and compared with 391 controls who had delivered at least once and had no history of miscarriage or assisted reproduction. Results: No difference in age was found between cases and controls. Bivariate associations between homozygous A/A (OR 1.57, 95% CI 0.98-2.52) as well as G/G carriers (OR 1.52, 95% CI 1.02-2.25) of SNP rs4704397 in PDE8B and recurrent miscarriage were verified (test for trend across all 3 genotypes, p = 0.059). After adjustment for known confounders such as age, BMI and smoking, the association between homozygous A/A (AOR 1.63, 95% CI 1.01-2.64, p = 0.045) and G/G (AOR 1.52, 95% CI 1.02-2.27, p = 0.039) carriers of SNP rs4704397 in PDE8B and recurrent miscarriage remained. Conclusions: Our findings suggest that there is an association between homozygous A/A as well as homozygous G/G carriers of SNP rs4704397 in PDE8B and recurrent miscarriage.
Affiliation(s)
- Michaela Granfors
- Department of Women's and Children's Health, Uppsala University, Uppsala, Sweden.
|
94
|
Abstract
OBJECTIVE: The aims of this study were to identify candidate causal single nucleotide polymorphisms (SNPs) and candidate causal mechanisms of asthma and to generate SNP-to-gene-to-pathway hypotheses. METHODS: SNPs that met a threshold of p ≤ 0.001 in a genome-wide association study (GWAS) dataset of asthma, which included 292,443 SNPs in 473 asthma cases and 1892 controls, were used in the present study. Identify candidate causal SNPs and pathway (ICSNPathway) analysis was applied to this dataset. RESULTS: ICSNPathway analysis identified four candidate causal SNPs, four genes, and 21 candidate causal pathways, which in total provided four hypothetical biologic mechanisms: (1) rs7192 (nonsynonymous coding) to HLA-DRA to 21 pathways, such as the role of eosinophils in the chemokine network of allergy, Th1/Th2 differentiation, and asthma (nominal p ≤ 0.001, FDR p ≤ 0.01); (2) rs20541 (nonsynonymous coding) to IL13 to asthma and cytokines and inflammatory response (nominal p < 0.001, FDR p ≤ 0.008); (3) rs1058808 (frameshift coding) to ERBB2 to transmembrane receptor activity (nominal p = 0.001, FDR p = 0.01); (4) rs17350764 (nonsynonymous coding (deleterious)) to OR52J3 to transmembrane receptor activity (nominal p = 0.001, FDR p = 0.01). CONCLUSION: By applying ICSNPathway analysis to asthma GWAS data, we found four candidate causal SNPs, four genes involving HLA-DRA and IL13, and four hypotheses, which may contribute to asthma susceptibility.
|
95
|
Silver M, Janousova E, Hua X, Thompson PM, Montana G. Identification of gene pathways implicated in Alzheimer's disease using longitudinal imaging phenotypes with sparse regression. Neuroimage 2012; 63:1681-94. [PMID: 22982105 PMCID: PMC3549495 DOI: 10.1016/j.neuroimage.2012.08.002] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.5] [Received: 04/10/2012] [Revised: 08/01/2012] [Accepted: 08/03/2012] [Indexed: 02/04/2023] Open
Abstract
We present a new method for the detection of gene pathways associated with a multivariate quantitative trait, and use it to identify causal pathways associated with an imaging endophenotype characteristic of longitudinal structural change in the brains of patients with Alzheimer's disease (AD). Our method, known as pathways sparse reduced-rank regression (PsRRR), uses group lasso penalised regression to jointly model the effects of genome-wide single nucleotide polymorphisms (SNPs), grouped into functional pathways using prior knowledge of gene-gene interactions. Pathways are ranked in order of importance using a resampling strategy that exploits finite sample variability. Our application study uses whole genome scans and MR images from 99 probable AD patients and 164 healthy elderly controls in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. 66,182 SNPs are mapped to 185 gene pathways from the KEGG pathway database. Voxel-wise imaging signatures characteristic of AD are obtained by analysing 3D patterns of structural change at 6, 12 and 24 months relative to baseline. High-ranking, AD endophenotype-associated pathways in our study include those describing insulin signalling, vascular smooth muscle contraction and focal adhesion. All of these have been previously implicated in AD biology. In a secondary analysis, we investigate SNPs and genes that may be driving pathway selection. High-ranking genes include a number previously linked in gene expression studies to β-amyloid plaque formation in the AD brain (PIK3R3, PIK3CG, PRKCA and PRKCB), and to AD-related changes in hippocampal gene expression (ADCY2, ACTN1, ACACA, and GNAI1). Other high-ranking, previously validated AD endophenotype-related genes include CR1, TOMM40 and APOE.
Affiliation(s)
- Matt Silver
- Statistics Section, Department of Mathematics, Imperial College London, UK
| | - Eva Janousova
- Statistics Section, Department of Mathematics, Imperial College London, UK
- Institute of Biostatistics and Analyses, Masaryk University, Brno, Czech Republic
| | - Xue Hua
- Laboratory of Neuro Imaging, Department of Neurology, UCLA School of Medicine, Los Angeles, CA, USA
| | - Paul M. Thompson
- Laboratory of Neuro Imaging, Department of Neurology, UCLA School of Medicine, Los Angeles, CA, USA
| | - Giovanni Montana
- Statistics Section, Department of Mathematics, Imperial College London, UK
- Corresponding author.
|
96
|
Wu C, Li S, Cui Y. Genetic association studies: an information content perspective. Curr Genomics 2012; 13:566-73. [PMID: 23633916 PMCID: PMC3468889 DOI: 10.2174/138920212803251382] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 05/14/2012] [Revised: 06/04/2012] [Accepted: 06/18/2012] [Indexed: 01/02/2023] Open
Abstract
The availability of high-density single nucleotide polymorphism (SNP) data has made it possible for human genetic association studies to identify common and rare variants underlying complex diseases on a genome-wide scale. A handful of novel genetic variants have been identified, which gives much hope for the future of genetic association studies. In this process, statistical and computational methods play key roles, among which information-based association tests have gained wide popularity. This paper is intended to give a comprehensive review of the current literature on genetic association analysis cast in the framework of information theory. We focus our review on the following topics: (1) information-theoretic approaches in genetic linkage and association studies; (2) entropy-based strategies for optimal SNP subset selection; and (3) the use of information-theoretic criteria in gene clustering and gene regulatory network construction.
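As a rough illustration of what an information-based association measure can look like, the sketch below computes the mutual information between a SNP genotype vector and a case/control label from Shannon entropies. It is not taken from the review; the function names `entropy` and `mutual_information` and the genotype coding (0, 1, 2 copies of the minor allele) are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(genotypes, phenotypes):
    """I(G; Y) = H(G) + H(Y) - H(G, Y), estimated from paired observations."""
    joint = list(zip(genotypes, phenotypes))
    return entropy(genotypes) + entropy(phenotypes) - entropy(joint)
```

A value near zero suggests that the genotype carries little information about the phenotype in the sample; in practice, significance would still need to be assessed, for example by permuting the phenotype labels.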
Affiliation(s)
- Cen Wu
- Department of Statistics and Probability, Michigan State University, East Lansing, Michigan 48824
| | - Shaoyu Li
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN 38105
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, Michigan 48824
- Center for Computational Biology, Beijing Forestry University, Beijing, China 100083
|
97
|
Uncovering networks from genome-wide association studies via circular genomic permutation. G3-GENES GENOMES GENETICS 2012; 2:1067-75. [PMID: 22973544 PMCID: PMC3429921 DOI: 10.1534/g3.112.002618] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Received: 03/15/2012] [Accepted: 06/29/2012] [Indexed: 11/24/2022]
Abstract
Genome-wide association studies (GWAS) aim to detect single nucleotide polymorphisms (SNPs) associated with trait variation. However, due to the large number of tests, standard analysis techniques impose highly stringent significance thresholds, leaving potentially associated SNPs undetected, and much of the trait genetic variation unexplained. Pathway- and network-based methodologies applied to GWAS aim to detect associations missed by standard single-marker approaches. The complex and non-random architecture of the genome makes it a challenge to derive an appropriate testing framework for such methodologies. We developed a rapid and simple permutation approach that uses GWAS SNP association results to establish the significance of pathway associations while accounting for the linkage disequilibrium structure of SNPs and the clustering of functionally related elements in the genome. All SNPs used in the GWAS are placed in a “circular genome” according to their location. Then the complete set of SNP association P values is permuted by rotation with respect to the genomic locations of the SNPs. Once these “simulated” P values are assigned, the joint gene P values are calculated using Fisher’s combination test, and the association of pathways is tested using the hypergeometric test. The circular genomic permutation approach was applied to a human genome-wide association dataset. The data consist of 719 individuals from the ORCADES study genotyped for ∼300,000 SNPs and measured for 51 traits ranging from physical to biochemical measurements. KEGG pathways (n = 225) were used as the sets of pathways to be tested. Our results demonstrate that the circular genomic permutations provide robust association P values. The non-permuted hypergeometric analysis generates ∼1400 pathway-trait combination results with an association P value ≤ 0.05, whereas applying circular genomic permutation reduces the number of significant results to a more credible 40% of that value. The circular permutation software (“genomicper”) is available as an R package at http://cran.r-project.org/.
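The steps described in this abstract (rotate the P values around a circular genome, combine SNP P values per gene with Fisher's test, then test pathway enrichment with the hypergeometric test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the genomicper implementation: all function names, the gene significance threshold `alpha = 0.05`, and the empirical P value formula are choices made for this example, and every pathway gene is assumed to appear in `gene_to_snps`.

```python
import math
import random


def rotate(p_values, shift):
    """Rotate SNP P values by `shift` positions around the circular genome."""
    shift %= len(p_values)
    return p_values[-shift:] + p_values[:-shift]


def fisher_combined_p(p_values):
    """Fisher's combination test: -2 * sum(ln p) ~ chi-square with 2k df."""
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric count X."""
    upper = min(successes, draws)
    hits = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / math.comb(population, draws)


def pathway_p(p_values, gene_to_snps, pathway, alpha=0.05):
    """Hypergeometric P value for enrichment of significant genes in a pathway."""
    gene_p = {
        gene: fisher_combined_p([p_values[i] for i in snps])
        for gene, snps in gene_to_snps.items()
    }
    significant = {gene for gene, p in gene_p.items() if p <= alpha}
    overlap = len(significant & pathway)
    return hypergeom_sf(overlap, len(gene_p), len(pathway), len(significant))


def circular_permutation_p(p_values, gene_to_snps, pathway, n_perm=200, seed=0):
    """Empirical pathway P value from random rotations of the SNP P values."""
    rng = random.Random(seed)
    observed = pathway_p(p_values, gene_to_snps, pathway)
    as_extreme = 0
    for _ in range(n_perm):
        shift = rng.randrange(1, len(p_values))
        if pathway_p(rotate(p_values, shift), gene_to_snps, pathway) <= observed:
            as_extreme += 1
    return (as_extreme + 1) / (n_perm + 1)
```

Here `p_values` is a list ordered by genomic position, `gene_to_snps` maps each gene name to the list indices of its SNPs, and `pathway` is a set of gene names. Because rotation keeps neighbouring P values together, local correlation between adjacent SNPs is preserved in each permuted dataset, which is the property the abstract highlights.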
|
98
|
Abo R, Jenkins GD, Wang L, Fridley BL. Identifying the genetic variation of gene expression using gene sets: application of novel gene Set eQTL approach to PharmGKB and KEGG. PLoS One 2012; 7:e43301. [PMID: 22905253 PMCID: PMC3419168 DOI: 10.1371/journal.pone.0043301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/05/2012] [Accepted: 07/19/2012] [Indexed: 11/18/2022] Open
Abstract
Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available performing all pair wise comparisons is computationally burdensome and may not provide optimal power to detect associations. Consideration of different approaches to account for the high dimensionality and multiple testing issues may provide increased efficiency and statistical power. Here we present a novel approach to model and test the association between genetic variation and mRNA gene expression levels in the context of gene sets (GSs) and pathways, referred to as gene set - expression quantitative trait loci analysis (GS-eQTL). The method uses GSs to initially group SNPs and mRNA expression, followed by the application of principal components analysis (PCA) to collapse the variation and reduce the dimensionality within the GSs. We applied GS-eQTL to assess the association between SNP and mRNA expression level data collected from a cell-based model system using PharmGKB and KEGG defined GSs. We observed a large number of significant GS-eQTL associations, in which the most significant associations arose between genetic variation and mRNA expression from the same GS. However, a number of associations involving genetic variation and mRNA expression from different GSs were also identified. Our proposed GS-eQTL method effectively addresses the multiple testing limitations in eQTL studies and provides biological context for SNP-expression associations.
Affiliation(s)
- Ryan Abo
- Division of Clinical Pharmacology, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, Minnesota, United States of America
| | - Gregory D. Jenkins
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
| | - Liewei Wang
- Division of Clinical Pharmacology, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, Minnesota, United States of America
| | - Brooke L. Fridley
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
|
99
|
Systematic testing of literature reported genetic variation associated with coronary restenosis: results of the GENDER Study. PLoS One 2012; 7:e42401. [PMID: 22879966 PMCID: PMC3411750 DOI: 10.1371/journal.pone.0042401] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 03/19/2012] [Accepted: 07/05/2012] [Indexed: 12/13/2022] Open
Abstract
Background: Coronary restenosis after percutaneous coronary intervention remains a significant problem, despite all medical advances. Unraveling the mechanisms leading to restenosis development remains challenging. Many studies have identified genetic markers associated with restenosis, but consistent replication of the reported markers is scarce. The aim of the current study was to analyze the joint effect of candidate genes for restenosis previously reported in the literature in the GENetic DEterminants of Restenosis (GENDER) databank. Methodology/Principal Findings: Candidate genes were selected using a MEDLINE search including the terms ‘genetic polymorphism’ and ‘coronary restenosis’. The final set included 36 genes. Subsequently, all single nucleotide polymorphisms (SNPs) in the genomic region of these genes were analyzed in GENDER using set-based analysis in PLINK. The GENDER databank contains genotypic data of 2,571,586 SNPs of 295 cases with restenosis and 571 matched controls. The set, including all 36 literature-reported genes, was indeed significantly associated with restenosis (p = 0.024) in the GENDER study. Subsequent analyses of the individual genes demonstrated that the observed association of the complete set was determined by 6 of the 36 genes. Conclusion: Despite overt inconsistencies in the literature with regard to individual candidate gene studies, this is the first study demonstrating that the joint effect of all these genes together is indeed associated with restenosis.
|
100
|
Curjuric I, Imboden M, Nadif R, Kumar A, Schindler C, Haun M, Kronenberg F, Künzli N, Phuleria H, Postma DS, Russi EW, Rochat T, Demenais F, Probst-Hensch NM. Different genes interact with particulate matter and tobacco smoke exposure in affecting lung function decline in the general population. PLoS One 2012; 7:e40175. [PMID: 22792237 PMCID: PMC3391223 DOI: 10.1371/journal.pone.0040175] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 10/28/2011] [Accepted: 06/06/2012] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND: Oxidative stress related genes modify the effects of ambient air pollution or tobacco smoking on lung function decline. The impact of interactions might be substantial, but previous studies mostly focused on main effects of single genes. OBJECTIVES: We studied the interaction of both exposures with a broad set of oxidative-stress related candidate genes and pathways on lung function decline and contrasted interactions between exposures. METHODS: For 12679 single nucleotide polymorphisms (SNPs), change in forced expiratory volume in one second (FEV1), FEV1 over forced vital capacity (FEV1/FVC), and mean forced expiratory flow between 25 and 75% of the FVC (FEF25-75) was regressed on interval exposure to particulate matter <10 µm in diameter (PM10) or packyears smoked (a), additive SNP effects (b), and interaction terms between (a) and (b) in 669 adults with GWAS data. Interaction p-values for 152 genes and 14 pathways were calculated by the adaptive rank truncation product (ARTP) method, and compared between exposures. Interaction effect sizes were contrasted for the strongest SNPs of nominally significant genes (p(interaction) < 0.05). Replication was attempted for SNPs with MAF > 10% in 3320 SAPALDIA participants without GWAS. RESULTS: On the SNP level, rs2035268 in gene SNCA accelerated FEV1/FVC decline by 3.8% (p(interaction) = 2.5×10⁻⁶), and rs12190800 in PARK2 attenuated FEV1 decline by 95.1 ml (p(interaction) = 9.7×10⁻⁸) over 11 years, while interacting with PM10. Genes and pathways nominally interacting with PM10 and packyears exposure differed substantially. Gene CRISP2 presented a significant interaction with PM10 (p(interaction) = 3.0×10⁻⁴) on FEV1/FVC decline. Pathway interactions were weak. Replications for the strongest SNPs in PARK2 and CRISP2 were not successful. CONCLUSIONS: Consistent with a stratified response to increasing oxidative stress, different genes and pathways potentially mediate PM10 and tobacco smoke effects on lung function decline. Ignoring environmental exposures would miss these patterns, but achieving sufficient sample size and comparability across study samples is challenging.
Affiliation(s)
- Ivan Curjuric
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
| | - Medea Imboden
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
| | - Rachel Nadif
- INSERM, U1018, CESP Centre for research in Epidemiology and Population Health, Respiratory and Environmental Epidemiology Team, Villejuif, France
- Université Paris-Sud 11, UMRS 1018, Villejuif, France
| | - Ashish Kumar
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
| | - Christian Schindler
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
| | - Margot Haun
- Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Innsbruck Medical University, Innsbruck, Austria
| | - Florian Kronenberg
- Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Innsbruck Medical University, Innsbruck, Austria
| | - Nino Künzli
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
| | - Harish Phuleria
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
| | - Dirkje S. Postma
- Department of Pulmonary Medicine and Tuberculosis, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Erich W. Russi
- Division of Pulmonary Medicine, University Hospital Zürich, Zürich, Switzerland
| | - Thierry Rochat
- Division of Pulmonary Medicine, Geneva University Hospitals, Geneva, Switzerland
| | - Florence Demenais
- INSERM, U946, Genetic Variation and Human Diseases Unit, Paris, France
- Fondation Jean Dausset - Centre d’Etude du Polymorphisme Humain (CEPH), Paris, France
- Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d’Hématologie, Paris, France
| | - Nicole M. Probst-Hensch
- Department of Epidemiology and Public Health, Swiss Tropical and Public Health Institute SwissTPH, Basel, Switzerland
- University of Basel, Basel, Switzerland
|