Hall MA, Wallace J, Lucas AM, Bradford Y, Verma SS, Müller-Myhsok B, Passero K, Zhou J, McGuigan J, Jiang B, Pendergrass SA, Zhang Y, Peissig P, Brilliant M, Sleiman P, Hakonarson H, Harley JB, Kiryluk K, Van Steen K, Moore JH, Ritchie MD. Novel EDGE encoding method enhances ability to identify genetic interactions. PLoS Genet 2021; 17:e1009534. [PMID: 34086673 PMCID: PMC8208534 DOI: 10.1371/journal.pgen.1009534] [Received: 05/07/2020] [Revised: 06/16/2021] [Accepted: 04/06/2021]
Abstract
Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)–rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.
Author summary
Although traditional genetic encodings are widely implemented in genetics research, including in genome-wide association studies (GWAS) and epistasis, each method makes assumptions that may not reflect the underlying etiology. Here, we introduce a novel encoding method that estimates and assigns an individualized data-driven encoding for each single nucleotide polymorphism (SNP): the elastic data-driven genetic encoding (EDGE). With simulations, we demonstrate that this novel method is more accurate and robust than traditional encoding methods in estimating heterozygous genotype values, reducing the type I error, and detecting SNP-SNP interactions. We further applied the traditional encodings and EDGE to biomedical data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes, and EDGE identified a novel interaction for age-related cataract not detected by traditional methods, which replicated in data from the UK Biobank. EDGE provides an alternative approach to understanding and modeling diverse SNP models and is recommended for studying complex genetics in common human phenotypes.
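The contrast between the fixed traditional encodings and a data-driven heterozygous value can be illustrated with a minimal Python sketch. The text above does not give the estimator, so the sketch rests on stated assumptions: genotypes are coded as alternate-allele counts (0, 1, 2), the phenotype is quantitative, and the heterozygous value is taken as the ratio of the heterozygous effect to the homozygous-alternate effect from a codominant least-squares fit. The function names (`encode`, `estimate_het_value`, `edge_style_encode`) are hypothetical and are not taken from the authors' software.

```python
import numpy as np

# Traditional encodings: a fixed numeric value per genotype,
# where the genotype is the count of alternate alleles (0, 1, or 2).
TRADITIONAL = {
    "additive":  {0: 0.0, 1: 1.0, 2: 2.0},
    "dominant":  {0: 0.0, 1: 1.0, 2: 1.0},
    "recessive": {0: 0.0, 1: 0.0, 2: 1.0},
}


def encode(genotypes, mapping):
    """Map each genotype (0/1/2) to its encoded value."""
    return np.array([mapping[int(g)] for g in genotypes], dtype=float)


def estimate_het_value(genotypes, phenotype):
    """Estimate a data-driven heterozygous value (assumed estimator).

    Fits phenotype ~ intercept + I(het) + I(hom_alt) by least squares and
    returns beta_het / beta_hom_alt. This ratio is an assumption made for
    illustration; the source text does not spell out the formula.
    """
    g = np.asarray(genotypes)
    y = np.asarray(phenotype, dtype=float)
    design = np.column_stack([np.ones(len(g)), g == 1, g == 2]).astype(float)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    if np.isclose(beta[2], 0.0):
        raise ValueError("homozygous-alternate effect is ~0; ratio undefined")
    return float(beta[1] / beta[2])


def edge_style_encode(genotypes, het_value):
    """Encode with hom-ref = 0, het = estimated value, hom-alt = 1."""
    return encode(genotypes, {0: 0.0, 1: het_value, 2: 1.0})


if __name__ == "__main__":
    # Toy data in which heterozygotes behave like homozygous-alternate carriers.
    g = np.array([0, 0, 1, 1, 2, 2])
    y = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
    alpha = estimate_het_value(g, y)
    print("estimated heterozygous value:", alpha)
    print("data-driven encoding:", edge_style_encode(g, alpha))
    print("additive encoding:   ", encode(g, TRADITIONAL["additive"]))
```

On this 0-to-1 scale, a heterozygous value near 0 corresponds to a recessive-like SNP, near 0.5 to an additive-like SNP, and near 1 to a dominant-like SNP, so a single estimated value covers the three traditional choices without testing each one separately. For case-control phenotypes the same idea would presumably use logistic rather than linear regression; that substitution is also an assumption here.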
Affiliation(s)
- Molly A. Hall
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Penn State Cancer Institute, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - * E-mail:
- John Wallace
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Anastasia M. Lucas
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Yuki Bradford
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Shefali S. Verma
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Bertram Müller-Myhsok
  - Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Kristin Passero
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jiayan Zhou
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John McGuigan
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Beibei Jiang
  - Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Yanfei Zhang
  - Genomic Medicine Institute, Geisinger Health System, Danville, Pennsylvania, United States of America
- Peggy Peissig
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Murray Brilliant
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Patrick Sleiman
  - Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Hakon Hakonarson
  - Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- John B. Harley
  - Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
  - United States Department of Veterans Affairs Medical Center, Cincinnati, Ohio, United States of America
- Krzysztof Kiryluk
  - Division of Nephrology, Department of Medicine, College of Physicians and Surgeons, Columbia University, New York, New York, United States of America
- Kristel Van Steen
  - WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liège, Belgium
  - Department of Human Genetics, University of Leuven, Leuven, Belgium
- Jason H. Moore
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Marylyn D. Ritchie
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America