1
Missing Causality and Heritability of Autoimmune Hepatitis. Dig Dis Sci 2022; 68:1585-1604. [PMID: 36261672 DOI: 10.1007/s10620-022-07728-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2022] [Accepted: 10/10/2022] [Indexed: 12/09/2022]
Abstract
BACKGROUND Autoimmune hepatitis has an unknown cause and genetic associations that are not disease-specific or always present. Clarification of its missing causality and heritability could improve prevention and management strategies. AIMS Describe the key epigenetic and genetic mechanisms that could account for missing causality and heritability in autoimmune hepatitis; indicate the prospects of these mechanisms as pivotal factors; and encourage investigations of their pathogenic role and therapeutic potential. METHODS English abstracts were identified in PubMed using multiple key search phrases. Several hundred abstracts and 210 full-length articles were reviewed. RESULTS Environmental induction of epigenetic changes is the prime candidate for explaining the missing causality of autoimmune hepatitis. Environmental factors (diet, toxic exposures) can alter chromatin structure and the production of micro-ribonucleic acids that affect gene expression. Epistatic interaction between unsuspected genes is the prime candidate for explaining the missing heritability. The non-additive, interactive effects of multiple genes could enhance their impact on the propensity and phenotype of autoimmune hepatitis. Transgenerational inheritance of acquired epigenetic marks constitutes another mechanism of transmitting parental adaptations that could affect susceptibility. Management strategies could range from lifestyle adjustments and nutritional supplements to precision editing of the epigenetic landscape. CONCLUSIONS Autoimmune hepatitis has a missing causality that might be explained by epigenetic changes induced by environmental factors and a missing heritability that might reflect epistatic gene interactions or transgenerational transmission of acquired epigenetic marks. These unassessed or under-evaluated areas warrant investigation.
2
Duroux D, Climente-González H, Azencott CA, Van Steen K. Interpretable network-guided epistasis detection. Gigascience 2022; 11:6521880. [PMID: 35134928 PMCID: PMC8848319 DOI: 10.1093/gigascience/giab093] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/04/2021] [Revised: 10/12/2021] [Accepted: 12/13/2021] [Indexed: 11/15/2022]
Abstract
Background Detecting epistatic interactions at the gene level is essential to understanding the biological mechanisms of complex diseases. Unfortunately, genome-wide interaction association studies involve many statistical challenges that make such detection hard. We propose a multi-step protocol for epistasis detection along the edges of a gene-gene co-function network. Such an approach reduces the number of tests performed and provides interpretable interactions while keeping type I error controlled. Yet, mapping gene interactions into testable single-nucleotide polymorphism (SNP)-interaction hypotheses, as well as computing gene pair association scores from SNP pair ones, is not trivial. Results Here we compare 3 SNP-gene mappings (positional overlap, expression quantitative trait loci, and proximity in 3D structure) and use the adaptive truncated product method to compute gene pair scores. This method is non-parametric, does not require a known null distribution, and is fast to compute. We apply multiple variants of this protocol to a genome-wide association study dataset on inflammatory bowel disease. Different configurations produced different results, highlighting that various mechanisms are implicated in inflammatory bowel disease, while at the same time, results overlapped with known disease characteristics. Importantly, the proposed pipeline also differs from a conventional approach where no network is used, showing the potential for additional discoveries when prior biological knowledge is incorporated into epistasis detection.
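As an illustration of how an adaptive truncated product statistic can turn SNP-pair p-values into one gene-pair p-value, the following is a simplified sketch, not the authors' pipeline. It assumes that the SNP-pair p-values for one gene pair have already been recomputed on permuted phenotypes (`null_sets`), that all p-values are strictly positive, and that the truncation points in `taus` are arbitrary example values; the function names are hypothetical.

```python
import math
from typing import List, Sequence


def truncated_log_product(pvalues: Sequence[float], tau: float) -> float:
    """Sum of log(p) over the p-values at or below tau (0.0 if none qualify)."""
    return sum(math.log(p) for p in pvalues if p <= tau)


def atpm_pvalue(
    observed: Sequence[float],
    null_sets: List[Sequence[float]],
    taus: Sequence[float] = (0.01, 0.05, 0.1),  # example truncation points (assumption)
) -> float:
    """Permutation-calibrated p-value of the best truncation point.

    observed  : SNP-pair p-values mapped to one gene pair
    null_sets : the same p-values recomputed on each phenotype permutation
    """
    all_sets = [observed] + list(null_sets)  # index 0 is the observed data
    n = len(all_sets)
    min_p = [1.0] * n
    for tau in taus:
        stats = [truncated_log_product(s, tau) for s in all_sets]
        for b in range(n):
            # A smaller (more negative) log-product means stronger evidence.
            p_b = sum(w <= stats[b] for w in stats) / n
            min_p[b] = min(min_p[b], p_b)
    # Calibrate the minimum over truncation points against the same permutations.
    return sum(m <= min_p[0] for m in min_p) / n


if __name__ == "__main__":
    obs = [0.001, 0.002, 0.5]
    nulls = [[0.3, 0.6, 0.9], [0.2, 0.04, 0.7], [0.5, 0.8, 0.4]]
    print(atpm_pvalue(obs, nulls))
```

Because both the per-truncation p-values and the final p-value are ranks within the permutation set, no parametric null distribution is needed, which matches the non-parametric property mentioned in the abstract. In practice far more than three permutations would be required.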
Affiliation(s)
- Diane Duroux
  - BIO3 - Systems Genetics, GIGA-R Medical Genomics, University of Liège, 4000 Liège, Belgium, 11 Liège 4000, Belgium
- Héctor Climente-González
  - Institut Curie, PSL Research University, F-75005 Paris, France
  - INSERM, U900, F-75005 Paris, France
  - CBIO-Centre for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
  - High-Dimensional Statistical Modeling Team, RIKEN Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan
- Chloé-Agathe Azencott
  - Institut Curie, PSL Research University, F-75005 Paris, France
  - INSERM, U900, F-75005 Paris, France
  - CBIO-Centre for Computational Biology, Mines ParisTech, PSL Research University, 75006 Paris, France
- Kristel Van Steen
  - BIO3 - Systems Genetics, GIGA-R Medical Genomics, University of Liège, 4000 Liège, Belgium, 11 Liège 4000, Belgium
  - BIO3 - Systems Medicine, Department of Human Genetics, KU Leuven, 3000 Leuven, Belgium, 49 3000 Leuven, Belgium
3
Hall MA, Wallace J, Lucas AM, Bradford Y, Verma SS, Müller-Myhsok B, Passero K, Zhou J, McGuigan J, Jiang B, Pendergrass SA, Zhang Y, Peissig P, Brilliant M, Sleiman P, Hakonarson H, Harley JB, Kiryluk K, Van Steen K, Moore JH, Ritchie MD. Novel EDGE encoding method enhances ability to identify genetic interactions. PLoS Genet 2021; 17:e1009534. [PMID: 34086673 PMCID: PMC8208534 DOI: 10.1371/journal.pgen.1009534] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/07/2020] [Revised: 06/16/2021] [Accepted: 04/06/2021] [Indexed: 11/26/2022]
Abstract
Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)–rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.
Although traditional genetic encodings are widely implemented in genetics research, including in genome-wide association studies (GWAS) and epistasis, each method makes assumptions that may not reflect the underlying etiology. Here, we introduce a novel encoding method that estimates and assigns an individualized data-driven encoding for each single nucleotide polymorphism (SNP): the elastic data-driven genetic encoding (EDGE). With simulations, we demonstrate that this novel method is more accurate and robust than traditional encoding methods in estimating heterozygous genotype values, reducing the type I error, and detecting SNP-SNP interactions. We further applied the traditional encodings and EDGE to biomedical data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes, and EDGE identified a novel interaction for age-related cataract not detected by traditional methods, which replicated in data from the UK Biobank. EDGE provides an alternative approach to understanding and modeling diverse SNP models and is recommended for studying complex genetics in common human phenotypes.
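To make the difference between the encodings concrete, the sketch below maps minor-allele counts to numeric codes, with the heterozygous code as the only thing that changes between models. It assumes a 0-to-1 scale (so the usual additive 0/1/2 coding appears rescaled as 0/0.5/1) and treats the data-driven heterozygous value as an input that was estimated elsewhere; the estimation step is not shown, and the function name and the example value 0.3 are hypothetical.

```python
from typing import Dict, List

# Heterozygous codes for the traditional models on a 0-to-1 scale (assumption).
RECESSIVE_HET = 0.0   # heterozygotes behave like homozygous reference
ADDITIVE_HET = 0.5    # heterozygotes sit halfway between the homozygotes
DOMINANT_HET = 1.0    # heterozygotes behave like homozygous alternate


def encode(genotypes: List[int], het_value: float) -> List[float]:
    """Map minor-allele counts (0, 1, 2) to numeric codes.

    Homozygous reference -> 0.0, heterozygous -> het_value,
    homozygous alternate -> 1.0.
    """
    codes: Dict[int, float] = {0: 0.0, 1: het_value, 2: 1.0}
    return [codes[g] for g in genotypes]


if __name__ == "__main__":
    genotypes = [0, 1, 2, 1, 0]
    print(encode(genotypes, RECESSIVE_HET))
    print(encode(genotypes, ADDITIVE_HET))
    print(encode(genotypes, DOMINANT_HET))
    # A data-driven encoding would pass a per-SNP value estimated from the
    # dataset; 0.3 here is purely illustrative.
    print(encode(genotypes, 0.3))
```

Under this view, a single per-SNP heterozygous value replaces the need to test every combination of fixed encodings, which is the multiple-testing concern raised in the abstract.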
Affiliation(s)
- Molly A. Hall
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
  - Penn State Cancer Institute, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John Wallace
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Anastasia M. Lucas
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Yuki Bradford
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Shefali S. Verma
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Bertram Müller-Myhsok
  - Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Kristin Passero
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jiayan Zhou
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- John McGuigan
  - Department of Veterinary and Biomedical Sciences, College of Agricultural Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Beibei Jiang
  - Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, Germany
  - Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
  - Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
- Yanfei Zhang
  - Genomic Medicine Institute, Geisinger Health System, Danville, Pennsylvania, United States of America
- Peggy Peissig
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Murray Brilliant
  - Center for Precision Medicine Research, Marshfield Clinic Research Institute, Marshfield, Wisconsin, United States of America
- Patrick Sleiman
  - Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Hakon Hakonarson
  - Department of Pediatrics, Center for Applied Genomics, Children’s Hospital of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- John B. Harley
  - Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
  - United States Department of Veterans Affairs Medical Center, Cincinnati, Ohio, United States of America
- Krzysztof Kiryluk
  - Division of Nephrology, Department of Medicine, College of Physicians and Surgeons, Columbia University, New York, New York, United States of America
- Kristel Van Steen
  - WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liège, Belgium
  - Department of Human Genetics, University of Leuven, Leuven, Belgium
- Jason H. Moore
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Marylyn D. Ritchie
  - Department of Genetics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
4
Gola D, König IR. Empowering individual trait prediction using interactions for precision medicine. BMC Bioinformatics 2021; 22:74. [PMID: 33602124 PMCID: PMC7890638 DOI: 10.1186/s12859-021-04011-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/10/2019] [Accepted: 02/08/2021] [Indexed: 11/11/2022]
Abstract
Background One component of precision medicine is to construct prediction models with predictive ability as high as possible, e.g. to enable individual risk prediction. In genetic epidemiology, complex diseases like coronary artery disease, rheumatoid arthritis, and type 2 diabetes have a polygenic basis, and a common assumption is that biological and genetic features affect the outcome under consideration via interactions. In the case of omics data, the use of standard approaches such as generalized linear models may be suboptimal and machine learning methods are appealing to make individual predictions. However, most of these algorithms focus on main or marginal effects of the single features in a dataset. On the other hand, the detection of interacting features is an active area of research in the realm of genetic epidemiology. One large class of algorithms to detect interacting features is based on the multifactor dimensionality reduction (MDR). Here, we further develop the model-based MDR (MB-MDR), a powerful extension of the original MDR algorithm, to enable interaction-empowered individual prediction. Results Using a comprehensive simulation study we show that our new algorithm (median AUC: 0.66) can use information hidden in interactions and outperforms two other state-of-the-art algorithms, namely the Random Forest (median AUC: 0.54) and Elastic Net (median AUC: 0.50), if interactions are present in a scenario of two pairs of two features having small effects. The performance of these algorithms is comparable if no interactions are present. Further, we show that our new algorithm is applicable to real data by comparing the performance of the three algorithms on a dataset of rheumatoid arthritis cases and healthy controls. As our new algorithm is not only applicable to biological/genetic data but to all datasets with discrete features, it may have practical implications in other research fields where interactions between features have to be considered as well, and we made our method available as an R package (https://github.com/imbs-hl/MBMDRClassifieR). Conclusions The explicit use of interactions between features can improve the prediction performance and thus should be included in further attempts to move precision medicine forward.
Affiliation(s)
- Damian Gola
  - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Lübeck, Germany
- Inke R König
  - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Lübeck, Germany
5
Ponomarenko I, Reshetnikov E, Polonikov A, Verzilina I, Sorokina I, Yermachenko A, Dvornyk V, Churnosov M. Candidate Genes for Age at Menarche Are Associated With Uterine Leiomyoma. Front Genet 2021; 11:512940. [PMID: 33552117 PMCID: PMC7863975 DOI: 10.3389/fgene.2020.512940] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 11/18/2019] [Accepted: 10/14/2020] [Indexed: 12/23/2022]
Abstract
Age at menarche (AAM) is an important marker of pubertal development and of the function of the hypothalamic-pituitary-ovarian system. It has been reported as a possible risk factor for uterine leiomyoma (UL). However, while more than 350 loci for AAM have been determined by genome-wide association studies (GWASs) to date, no studies of these loci for their association with UL have been conducted so far. In this study, we analyzed 52 candidate loci for AAM for possible association with UL in a sample of 569 patients and 981 controls. The results of the study suggested that 23 out of the 52 studied polymorphisms were associated with UL. Locus rs7759938 LIN28B was individually associated with the disease according to the dominant model. Twenty loci were associated with UL within the 11 most significant models of intergenic interactions. Nine loci involved in the 16 most significant models of interactions between single-nucleotide polymorphism (SNP), induced abortions, and chronic endometritis were associated with UL. Among the 23 loci associated with UL, 16 were also associated with either AAM (7 SNPs) or height and/or body mass index (BMI) (13 SNPs). The above 23 SNPs and 514 SNPs linked to them have non-synonymous, regulatory, and expression quantitative trait locus (eQTL) significance for 35 genes, which play roles in the pathways related to development of the female reproductive organs and hormone-mediated signaling [false discovery rate (FDR) ≤ 0.05]. This is the first study reporting associations of candidate genes for AAM with UL.
Affiliation(s)
- Irina Ponomarenko
  - Department of Medical Biological Disciplines, Belgorod State University, Belgorod, Russia
- Evgeny Reshetnikov
  - Department of Medical Biological Disciplines, Belgorod State University, Belgorod, Russia
- Alexey Polonikov
  - Department of Biology, Medical Genetics and Ecology, Kursk State Medical University, Kursk, Russia
- Irina Verzilina
  - Department of Medical Biological Disciplines, Belgorod State University, Belgorod, Russia
- Inna Sorokina
  - Department of Social Epidemiology, Pierre Louis Institute of Epidemiology and Public Health, Sorbonne Universités, Paris, France
- Anna Yermachenko
  - Department of Social Epidemiology, Pierre Louis Institute of Epidemiology and Public Health, Sorbonne Universités, Paris, France
- Volodymyr Dvornyk
  - Department of Life Sciences, College of Science and General Studies, Alfaisal University, Riyadh, Saudi Arabia
- Mikhail Churnosov
  - Department of Medical Biological Disciplines, Belgorod State University, Belgorod, Russia
6
Candidate genes for age at menarche are associated with endometriosis. Reprod Biomed Online 2020; 41:943-956. [DOI: 10.1016/j.rbmo.2020.04.016] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/19/2019] [Revised: 02/25/2020] [Accepted: 04/21/2020] [Indexed: 01/08/2023]
7
Joiret M, Mahachie John JM, Gusareva ES, Van Steen K. Confounding of linkage disequilibrium patterns in large scale DNA based gene-gene interaction studies. BioData Min 2019; 12:11. [PMID: 31198442 PMCID: PMC6558841 DOI: 10.1186/s13040-019-0199-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/17/2019] [Accepted: 05/09/2019] [Indexed: 01/07/2023]
Abstract
Background In Genome-Wide Association Studies (GWAS), the concept of linkage disequilibrium is important as it allows identifying genetic markers that tag the actual causal variants. In Genome-Wide Association Interaction Studies (GWAIS), similar principles hold for pairs of causal variants. However, Linkage Disequilibrium (LD) may also interfere with the detection of genuine epistasis signals in that there may be complete confounding between Gametic Phase Disequilibrium (GPD) and interaction. GPD may involve unlinked genetic markers, even residing on different chromosomes. Often GPD is eliminated in GWAIS, via feature selection schemes or so-called pruning algorithms, to obtain unconfounded epistasis results. However, little is known about the optimal degree of GPD/LD-pruning that gives a balance between false positive control and sufficient power of epistasis detection statistics. Here, we focus on Model-Based Multifactor Dimensionality Reduction as one large-scale epistasis detection tool. Its performance has been thoroughly investigated in terms of false positive control and power, under a variety of scenarios involving different trait types and study designs, as well as error-free and noisy data, but never with respect to multicollinear SNPs. Results Using real-life human LD patterns from a homogeneous subpopulation of British ancestry, we investigated the impact of LD-pruning on the statistical sensitivity of MB-MDR. We considered three different non-fully penetrant epistasis models with varying effect sizes. There is a clear advantage in pre-analysis pruning using sliding windows at r² of 0.75 or lower, but using a threshold of 0.20 has a detrimental effect on the power to detect a functional interactive SNP pair (power < 25%). Signal sensitivity, directly using LD-block information to determine whether an epistasis signal is present or not, benefits from LD-pruning as well (average power across scenarios: 87%), but is largely hampered by functional loci residing at the boundaries of an LD-block. Conclusions Our results confirm that LD patterns and the position of causal variants in LD blocks do have an impact on epistasis detection, and that pruning strategies and LD-block definitions combined need careful attention, if we wish to maximize the power of large-scale epistasis screenings.
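The sliding-window r² pruning mentioned in the abstract can be illustrated with a simplified greedy sketch. It assumes genotypes coded as minor-allele counts, uses the squared Pearson correlation between genotype vectors as the r² estimate, and measures the window in SNP indices rather than base pairs. The function names are hypothetical, and this is not the pruning routine used in the study.

```python
from typing import List


def r_squared(x: List[float], y: List[float]) -> float:
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    return (sxy * sxy) / (sxx * syy)


def ld_prune(snps: List[List[int]], window: int, threshold: float) -> List[int]:
    """Return indices of SNPs kept after greedy pruning.

    A SNP is dropped if its r² with any already-kept SNP located fewer than
    `window` positions upstream exceeds `threshold`.
    """
    kept: List[int] = []
    for i, genotype in enumerate(snps):
        in_window = [j for j in kept if i - j < window]
        if all(r_squared(genotype, snps[j]) <= threshold for j in in_window):
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Four SNPs genotyped in six individuals; SNP 1 duplicates SNP 0.
    snps = [
        [0, 1, 2, 1, 0, 2],
        [0, 1, 2, 1, 0, 2],
        [2, 0, 1, 0, 2, 1],
        [0, 0, 1, 2, 1, 0],
    ]
    print(ld_prune(snps, window=10, threshold=0.75))
    print(ld_prune(snps, window=10, threshold=0.20))
```

Lowering the threshold removes more SNPs, which reduces redundancy but, as the abstract reports for a threshold of 0.20, can also remove markers needed to tag a functional interacting pair.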
Affiliation(s)
- Marc Joiret
  - BIO3, GIGA-R Medical Genomics, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
  - Biomechanics Research Unit, GIGA-R in-silico medicine, Liège, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
- Elena S Gusareva
  - BIO3, GIGA-R Medical Genomics, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
- Kristel Van Steen
  - BIO3, GIGA-R Medical Genomics, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
  - WELBIO researcher, Avenue de l'Hôpital 1-B34-CHU, Liège, 4000 Belgium
8
Ritchie MD, Van Steen K. The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation. Ann Transl Med 2018; 6:157. [PMID: 29862246 DOI: 10.21037/atm.2018.04.05] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 12/18/2022]
Abstract
One of the primary goals in this era of precision medicine is to understand the biology of human diseases and their treatment, such that each individual patient receives the best possible treatment for their disease based on their genetic and environmental exposures. One way to work towards achieving this goal is to identify the environmental exposures and genetic variants that are relevant to each disease in question, as well as the complex interplay between genes and environment. Genome-wide association studies (GWAS) have allowed for a greater understanding of the genetic component of many complex traits. However, these genetic effects are largely small and thus, our ability to use these GWAS findings for precision medicine is limited. As more and more GWAS have been performed, rather than focusing only on common single nucleotide polymorphisms (SNPs) and additive genetic models, many researchers have begun to explore alternative heritable components of complex traits including rare variants, structural variants, epigenetics, and genetic interactions. While genetic interactions are a plausible reality that could explain some of the heritability that has not yet been identified, especially when one considers the identification of genetic interactions in model organisms as well as our understanding of biological complexity, still there are significant challenges and considerations in identifying these genetic interactions. Broadly, these can be summarized in three categories: abundance of methods, practical considerations, and biological interpretation. In this review, we will discuss these important elements in the search for genetic interactions along with some potential solutions. While genetic interactions are theoretically understood to be important for complex human disease, the body of evidence is still building to support this component of the underlying genetic architecture of complex human traits. Our hope is that more sophisticated modeling approaches and more robust computational techniques will enable the community to identify these important genetic interactions and improve our ability to implement precision medicine in the future.
Affiliation(s)
- Marylyn D Ritchie
  - Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Kristel Van Steen
  - WELBIO, GIGA-R Medical Genomics Unit - BIO3, University of Liège, Liège, Belgium
  - Department of Human Genetics, University of Leuven, Leuven, Belgium
9
Gola D, König IR. Identification of interactions using model-based multifactor dimensionality reduction. BMC Proc 2016; 10:135-139. [PMID: 27980625 PMCID: PMC5133504 DOI: 10.1186/s12919-016-0019-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
BACKGROUND Common complex traits may involve multiple genetic and environmental factors and their interactions. Many methods have been proposed to identify these interaction effects, among them several machine learning and data mining methods. These are attractive for identifying interactions because they do not rely on specific genetic model assumptions. To handle the computational burden arising from an exhaustive search, including all possible combinations of factors, filter methods try to select promising factors in advance. METHODS Model-based multifactor dimensionality reduction (MB-MDR), a semiparametric machine learning method allowing adjustment for confounding variables and lower level effects, is applied to Genetic Analysis Workshop 19 (GAW19) data to identify interaction effects on different traits. Several filtering methods based on the nearest neighbor algorithm are assessed in terms of compatibility with MB-MDR. RESULTS Single nucleotide polymorphism (SNP) rs859400 shows a significant interaction effect (corrected p value <0.05) with age on systolic blood pressure (SBP). We identified 23 SNP-SNP interaction effects on hypertension status (HS), 42 interaction effects on SBP, and 26 interaction effects on diastolic blood pressure (DBP). Several of these SNPs are in strong linkage disequilibrium (LD). Three of the interaction effects on HS are identified in filtered subsets. CONCLUSIONS The considered filtering methods do not seem appropriate for use with MB-MDR. LD pruning is a further quality control step to be incorporated, which can reduce the combinatorial burden by removing redundant SNPs.
Affiliation(s)
- Damian Gola
  - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Ratzeburger Allee 160, Lübeck, 23562 Germany
- Inke R. König
  - Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Ratzeburger Allee 160, Lübeck, 23562 Germany
10
Van Lishout F, Gadaleta F, Moore JH, Wehenkel L, Van Steen K. gammaMAXT: a fast multiple-testing correction algorithm. BioData Min 2015; 8:36. [PMID: 26594243 PMCID: PMC4654922 DOI: 10.1186/s13040-015-0069-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/07/2015] [Accepted: 11/08/2015] [Indexed: 02/07/2023]
Abstract
BACKGROUND The purpose of the MaxT algorithm is to provide a significance test algorithm that controls the family-wise error rate (FWER) during simultaneous hypothesis testing. However, the requirements in terms of computing time and memory of this procedure are proportional to the number of investigated hypotheses. The memory issue was solved in 2013 by Van Lishout's implementation of MaxT, which makes the memory usage independent of the size of the dataset. This algorithm is implemented in MBMDR-3.0.3, software that can effectively identify genetic interactions for a variety of SNP-SNP based epistasis models. On the other hand, that implementation turned out to be less suitable for genome-wide interaction analysis studies, due to the prohibitive computational burden. RESULTS In this work we introduce gammaMAXT, a novel implementation of the maxT algorithm for multiple testing correction. The algorithm was implemented in software MBMDR-4.2.2, as part of the MB-MDR framework to screen for SNP-SNP, SNP-environment or SNP-SNP-environment interactions at a genome-wide level. We show that, in the absence of interaction effects, test-statistics produced by the MB-MDR methodology follow a mixture distribution with a point mass at zero and a shifted gamma distribution for the top 10% of the strictly positive values. We show that the gammaMAXT algorithm has a power comparable to MaxT and maintains FWER, but requires fewer computational resources and less time. We analyze a dataset composed of 10^6 SNPs and 1000 individuals within one day on a 256-core computer cluster. The same analysis would take about 10^4 times longer with MBMDR-3.0.3. CONCLUSIONS These results are promising for future GWAIs. However, the proposed gammaMAXT algorithm offers a general significance assessment and multiple testing approach, applicable to any context that requires performing hundreds of thousands of tests. It offers new perspectives for fast and efficient permutation-based significance assessment in large-scale (integrated) omics studies.
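For readers unfamiliar with permutation-based FWER control, the following minimal sketch shows the single-step maxT idea: each observed statistic is compared with the distribution of the maximum statistic over all hypotheses across permutations. It assumes the permuted statistics are already available as a full matrix, which is exactly the memory and time cost that the MaxT and gammaMAXT implementations described above are designed to avoid. The function name is hypothetical, and this is not the MB-MDR code.

```python
from typing import List


def maxt_adjusted_pvalues(
    observed: List[float], permuted: List[List[float]]
) -> List[float]:
    """Single-step maxT adjusted p-values.

    observed[i]    : test statistic for hypothesis i (larger = more extreme)
    permuted[b][i] : the same statistic recomputed on the b-th permutation
    """
    # One maximum per permutation summarizes the whole family of tests.
    max_null = [max(row) for row in permuted]
    n_perm = len(max_null)
    # The +1 terms count the observed data as one of the permutations,
    # so an adjusted p-value can never be exactly zero.
    return [
        (1 + sum(m >= t for m in max_null)) / (n_perm + 1) for t in observed
    ]


if __name__ == "__main__":
    observed = [5.0, 1.0]
    permuted = [[1.0, 2.0], [0.5, 6.0], [3.0, 0.2]]
    print(maxt_adjusted_pvalues(observed, permuted))
```

Because only the per-permutation maxima enter the adjustment, an approach that models the upper tail of the null statistics (for example with a fitted distribution) can in principle replace most of the permutation work, which is the direction the abstract describes.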
Affiliation(s)
- François Van Lishout
  - Systems and Modeling Unit, Montefiore Institute, University of Liège, Allée de la découverte 10, Liège, 4000 Belgium
  - Bioinformatics and Modeling, GIGA-R, Avenue de l'Hôpital 1, Sart-Tilman, 4000 Belgium
- Francesco Gadaleta
  - Systems and Modeling Unit, Montefiore Institute, University of Liège, Allée de la découverte 10, Liège, 4000 Belgium
  - Bioinformatics and Modeling, GIGA-R, Avenue de l'Hôpital 1, Sart-Tilman, 4000 Belgium
- Jason H Moore
  - Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104-6021 PA USA
- Louis Wehenkel
  - Systems and Modeling Unit, Montefiore Institute, University of Liège, Allée de la découverte 10, Liège, 4000 Belgium
  - Bioinformatics and Modeling, GIGA-R, Avenue de l'Hôpital 1, Sart-Tilman, 4000 Belgium
- Kristel Van Steen
  - Systems and Modeling Unit, Montefiore Institute, University of Liège, Allée de la découverte 10, Liège, 4000 Belgium
  - Bioinformatics and Modeling, GIGA-R, Avenue de l'Hôpital 1, Sart-Tilman, 4000 Belgium
|
11
|
Fouladi R, Bessonov K, Van Lishout F, Van Steen K. Model-Based Multifactor Dimensionality Reduction for Rare Variant Association Analysis. Hum Hered 2015. [PMID: 26201701 DOI: 10.1159/000381286] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022] Open
Abstract
Genome-wide association studies have revealed a vast number of common loci associated with human complex diseases. Still, a large proportion of heritability remains unexplained. The extent to which rare genetic variants (RVs) are able to explain a relevant portion of the genetic heritability for complex traits leaves room for debate and paves the way for the collection of RV databases and the development of novel analytic tools to analyze them. To date, several statistical methods have been proposed to uncover the association of RVs with complex diseases, but none of them is the clear winner in all possible scenarios of study design and assumed underlying disease model. The latter may involve differences in the distributions of effect sizes, proportions of causal variants, and ratios of protective to deleterious variants at distinct regions throughout the genome. Therefore, there is a need for robust, scalable methods with acceptable overall performance in terms of power and type I error under various realistic scenarios. In this paper, we propose a novel RV association analysis strategy, which satisfies several of the desired properties that an RV analysis tool should exhibit.
Affiliation(s)
- Ramouna Fouladi
- Systems and Modeling Unit, Montefiore Institute, and Bioinformatics and Modeling, GIGA-R, University of Liège, Liège, Belgium
|
12
|
Bessonov K, Gusareva ES, Van Steen K. A cautionary note on the impact of protocol changes for genome-wide association SNP × SNP interaction studies: an example on ankylosing spondylitis. Hum Genet 2015; 134:761-73. [DOI: 10.1007/s00439-015-1560-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/19/2015] [Accepted: 04/26/2015] [Indexed: 12/11/2022]
|
13
|
Gusareva ES, Carrasquillo MM, Bellenguez C, Cuyvers E, Colon S, Graff-Radford NR, Petersen RC, Dickson DW, Mahachie John JM, Bessonov K, Van Broeckhoven C, Harold D, Williams J, Amouyel P, Sleegers K, Ertekin-Taner N, Lambert JC, Van Steen K. Genome-wide association interaction analysis for Alzheimer's disease. Neurobiol Aging 2014; 35:2436-2443. [PMID: 24958192 PMCID: PMC4370231 DOI: 10.1016/j.neurobiolaging.2014.05.014] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 10/04/2013] [Revised: 05/19/2014] [Accepted: 05/21/2014] [Indexed: 12/23/2022]
Abstract
We propose a minimal protocol for exhaustive genome-wide association interaction analysis that involves screening for epistasis over large-scale genomic data, combining the strengths of different methods and statistical tools. The different steps of this protocol are illustrated on a real-life data application for Alzheimer's disease (AD) (2259 patients and 6017 controls from France). In particular, in the exhaustive genome-wide epistasis screening we identified an AD-associated interacting SNP pair from chromosome 6q11.1 (rs6455128, the KHDRBS2 gene) and 13q12.11 (rs7989332, the CRYL1 gene) (p = 0.006, corrected for multiple testing). A replication analysis in the independent AD cohort from Germany (555 patients and 824 controls) confirmed the discovered epistasis signal (p = 0.036). This signal was also supported by a meta-analysis approach in 5 independent AD cohorts that was applied in the context of epistasis for the first time. Transcriptome analysis revealed a negative correlation between expression levels of KHDRBS2 and CRYL1 in both the temporal cortex (β = -0.19, p = 0.0006) and cerebellum (β = -0.23, p < 0.0001) brain regions. This is the first time a replicable epistasis associated with AD was identified using a hypothesis-free screening approach.
Affiliation(s)
- Elena S Gusareva
- Systems and Modeling Unit, Montefiore Institute, University of Liege, Belgium; Bioinformatics and Modeling, GIGA-R, University of Liege, Belgium.
- Céline Bellenguez
- INSERM U744, Lille, France; Department of Public Health and Molecular Epidemiology of Aging Related Diseases, Institut Pasteur de Lille, Lille, France; Universite de Lille Nord de France, Lille, France
- Elise Cuyvers
- Department of Molecular Genetics, VIB, Antwerp, Belgium; Department of Neurology, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Samuel Colon
- Department of Neuroscience, Mayo Clinic Florida, Jacksonville, FL, USA
- Dennis W Dickson
- Department of Neuroscience, Mayo Clinic Florida, Jacksonville, FL, USA
- Jestinah M Mahachie John
- Systems and Modeling Unit, Montefiore Institute, University of Liege, Belgium; Bioinformatics and Modeling, GIGA-R, University of Liege, Belgium
- Kyrylo Bessonov
- Systems and Modeling Unit, Montefiore Institute, University of Liege, Belgium; Bioinformatics and Modeling, GIGA-R, University of Liege, Belgium
- Christine Van Broeckhoven
- Department of Molecular Genetics, VIB, Antwerp, Belgium; Department of Neurology, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Denise Harold
- Medical Research Council Centre for Neuropsychiatric Genetics and Genomics, Institute of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, UK
- Julie Williams
- Medical Research Council Centre for Neuropsychiatric Genetics and Genomics, Institute of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, UK
- Philippe Amouyel
- INSERM U744, Lille, France; Department of Public Health and Molecular Epidemiology of Aging Related Diseases, Institut Pasteur de Lille, Lille, France; Universite de Lille Nord de France, Lille, France
- Kristel Sleegers
- Department of Molecular Genetics, VIB, Antwerp, Belgium; Department of Neurology, Institute Born-Bunge, University of Antwerp, Antwerp, Belgium
- Nilüfer Ertekin-Taner
- Department of Neuroscience, Mayo Clinic Florida, Jacksonville, FL, USA; Department of Neurology, Mayo Clinic Florida, Jacksonville, FL, USA
- Jean-Charles Lambert
- INSERM U744, Lille, France; Department of Public Health and Molecular Epidemiology of Aging Related Diseases, Institut Pasteur de Lille, Lille, France; Universite de Lille Nord de France, Lille, France
- Kristel Van Steen
- Systems and Modeling Unit, Montefiore Institute, University of Liege, Belgium; Bioinformatics and Modeling, GIGA-R, University of Liege, Belgium
|
14
|
Wang GN, Zhang JS, Cao WJ, Sun H, Zhang J, Wang Y, Xiao H. Association of ALOX5, LTA4H and LTC4S gene polymorphisms with ischemic stroke risk in a cohort of Chinese in east China. World J Emerg Med 2014; 4:32-7. [PMID: 25215090 DOI: 10.5847/wjem.j.issn.1920-8642.2013.01.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/16/2012] [Accepted: 12/03/2012] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Genetic variations of the 5-lipoxygenase activating protein and leukotriene A4 hydrolase genes that confer an increased risk of ischemic stroke have implicated the family of leukotrienes as potential mediators of ischemic stroke. This study aimed to explore the association of ALOX5, LTA4H and LTC4S gene polymorphisms with ischemic stroke risk in a cohort of Chinese in east China. METHODS This case-control study consisted of 690 patients with ischemic stroke and 690 controls. Polymorphisms of ALOX5 rs2029253 A/G, LTA4H rs6538697 T/C, and LTC4S rs730012 A/C were genotyped by polymerase chain reaction and restriction fragment length polymorphism (PCR-RFLP) analysis. A multivariate logistic regression model was used to exclude the effects of conventional risk factors on ischemic stroke. RESULTS Carriers of the C allele in rs730012 were more susceptible to ischemic stroke (OR: 1.37; 95%CI: 1.08-1.73; P=0.009). The rs2029253 GG genotype showed a risk-reducing effect on ischemic stroke (OR: 0.72; 95%CI: 0.55-0.93; P=0.013), while the rs6538697 CC genotype was associated with an increased risk of ischemic stroke (OR: 1.77; 95%CI: 1.09-2.89; P=0.022). The rs730012 variant was not associated with ischemic stroke risk after adjusting for confounding factors (P>0.05). CONCLUSION The present study suggested that gene polymorphisms in the leukotriene pathway may exert influences, with independent genetic effects, on ischemic stroke susceptibility in a cohort of Chinese in east China.
Affiliation(s)
- Gan-Nan Wang
- Emergency Department, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
- Jin-Song Zhang
- Emergency Department, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
- Wei-Juan Cao
- Emergency Department, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
- Hao Sun
- Emergency Department, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
- Jing Zhang
- Emergency Department, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
- Yao Wang
- Emergency Department, First Affiliated Hospital of Nanjing Medical University, Nanjing 210029, China
- Hang Xiao
- Laboratory of Neurotoxicology, School of Public Health, Nanjing Medical University, Nanjing 210029, China
|
15
|
Gusareva ES, Van Steen K. Practical aspects of genome-wide association interaction analysis. Hum Genet 2014; 133:1343-58. [DOI: 10.1007/s00439-014-1480-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 05/21/2014] [Accepted: 08/18/2014] [Indexed: 12/31/2022]
|
16
|
Hoefkens E, Nys K, John JM, Van Steen K, Arijs I, Van der Goten J, Van Assche G, Agostinis P, Rutgeerts P, Vermeire S, Cleynen I. Genetic association and functional role of Crohn disease risk alleles involved in microbial sensing, autophagy, and endoplasmic reticulum (ER) stress. Autophagy 2013; 9:2046-55. [PMID: 24247223 DOI: 10.4161/auto.26337] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/13/2022] Open
Abstract
Genome-wide association studies have identified several genes implicated in autophagy (ATG16L1, IRGM, ULK1, LRRK2, and MTMR3), intracellular bacterial sensing (NOD2), and endoplasmic reticulum (ER) stress (XBP1 and ORMDL3) to be associated with Crohn disease (CD). We studied the known CD-associated variants in these genes in a large cohort of 3451 individuals (1744 CD patients, 793 ulcerative colitis (UC) patients and 914 healthy controls). We also investigated the functional phenotype linked to these genetic variants. Association with CD was confirmed for NOD2, ATG16L1, IRGM, MTMR3, and ORMDL3. The risk for developing CD increased with an increasing number of risk alleles for these genes (P<0.001, OR 1.26 [1.20 to 1.32]). Three times as many (34.8%) CD patients carried a risk allele in all three pathways, in contrast to 13.3% of the controls (P<0.0001, OR = 3.46 [2.77 to 4.32]). For UC, no significant association for one single nucleotide polymorphism (SNP) was found, but the risk for development of UC increased with an increasing total number of risk alleles (P = 0.001, OR = 1.10 [1.04 to 1.17]). We found a genetic interaction between reference SNP (rs)2241880 (ATG16L1) and rs10065172 (IRGM) in CD. Functional experiments hinted toward an association between an increased genetic risk and an augmented inflammatory status, highlighting the relevance of the genetic findings.
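As a rough check on the reported effect size, a crude odds ratio can be computed from the two carrier percentages given in the abstract (34.8% of CD patients versus 13.3% of controls). The sketch below is illustrative only: the function name `odds_ratio` is hypothetical, and working from rounded percentages rather than the study's underlying counts (which the abstract does not give) is an assumption.

```python
def odds_ratio(p_cases, p_controls):
    """Crude odds ratio from the proportion of carriers in each group."""
    odds_cases = p_cases / (1 - p_cases)           # carriers : non-carriers among cases
    odds_controls = p_controls / (1 - p_controls)  # same ratio among controls
    return odds_cases / odds_controls


if __name__ == "__main__":
    # Percentages taken from the abstract; the result is approximate.
    print(round(odds_ratio(0.348, 0.133), 2))
```

The crude value comes out near 3.5, in line with the reported OR of 3.46; the small difference presumably reflects rounding of the percentages or the exact counts used in the original analysis.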
Affiliation(s)
- Eveline Hoefkens
- Department of Clinical and Experimental Medicine; Translational Research Center for Gastrointestinal Disorders (TARGID); KU Leuven; Leuven, Belgium
|
17
|
Mahachie John JM, Van Lishout F, Gusareva ES, Van Steen K. A robustness study of parametric and non-parametric tests in model-based multifactor dimensionality reduction for epistasis detection. BioData Min 2013; 6:9. [PMID: 23618370 PMCID: PMC3668290 DOI: 10.1186/1756-0381-6-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 07/11/2012] [Accepted: 04/20/2013] [Indexed: 11/10/2022] Open
Abstract
Background Applying a statistical method implies identifying underlying (model) assumptions and checking their validity in the particular context. One of these contexts is association modeling for epistasis detection. Here, depending on the technique used, violation of model assumptions may result in increased type I error, power loss, or biased parameter estimates. Remedial measures for violated underlying conditions or assumptions include data transformation or selecting a more relaxed modeling or testing strategy. Model-Based Multifactor Dimensionality Reduction (MB-MDR) for epistasis detection relies on association testing between a trait and a factor consisting of multilocus genotype information. For quantitative traits, the framework is essentially Analysis of Variance (ANOVA) that decomposes the variability in the trait amongst the different factors. In this study, we assess through simulations, the cumulative effect of deviations from normality and homoscedasticity on the overall performance of quantitative Model-Based Multifactor Dimensionality Reduction (MB-MDR) to detect 2-locus epistasis signals in the absence of main effects. Methodology Our simulation study focuses on pure epistasis models with varying degrees of genetic influence on a quantitative trait. Conditional on a multilocus genotype, we consider quantitative trait distributions that are normal, chi-square or Student’s t with constant or non-constant phenotypic variances. All data are analyzed with MB-MDR using the built-in Student’s t-test for association, as well as a novel MB-MDR implementation based on Welch’s t-test. Traits are either left untransformed or are transformed into new traits via logarithmic, standardization or rank-based transformations, prior to MB-MDR modeling. Results Our simulation results show that MB-MDR controls type I error and false positive rates irrespective of the association test considered. 
Empirically based MB-MDR power estimates for MB-MDR with Welch's t-tests are generally lower than those for MB-MDR with Student's t-tests. Trait transformations involving ranks tend to lead to increased power compared to the other considered data transformations. Conclusions When performing MB-MDR screening for gene-gene interactions with quantitative traits, we recommend first rank-transforming traits to normality and then applying MB-MDR modeling with Student's t-tests as internal tests for association.
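The recommended rank transformation to normality can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: the function name `rank_to_normal`, the use of average ranks for ties, and the offset `(rank - 0.5) / n` are all assumptions (several offset conventions are in common use).

```python
from statistics import NormalDist


def rank_to_normal(values):
    """Rank-based inverse normal transformation (hypothetical helper)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])

    # Assign 1-based ranks, giving tied values the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Map each rank to a standard normal quantile.
    # The 0.5 offset keeps every probability strictly between 0 and 1.
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]


if __name__ == "__main__":
    skewed_trait = [0.1, 0.4, 0.2, 15.0, 0.3]
    print(rank_to_normal(skewed_trait))
```

The transformed trait depends only on the ordering of the original values, so an extreme observation such as 15.0 in the usage example no longer dominates the variance that the t-test relies on.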
|
18
|
Van Lishout F, Mahachie John JM, Gusareva ES, Urrea V, Cleynen I, Théâtre E, Charloteaux B, Calle ML, Wehenkel L, Van Steen K. An efficient algorithm to perform multiple testing in epistasis screening. BMC Bioinformatics 2013; 14:138. [PMID: 23617239 PMCID: PMC3648350 DOI: 10.1186/1471-2105-14-138] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 05/10/2012] [Accepted: 04/12/2013] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved efforts to translate statistical epistasis into biological epistasis, and attempts to integrate different omics information sources into epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies will require memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn's disease. RESULTS In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions in a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, each containing four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn's disease (CD) data.
CONCLUSIONS Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn's disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
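The scale of the memory problem described above can be made concrete with a short back-of-the-envelope calculation. The sketch assumes a naive implementation that stores one 8-byte statistic per SNP pair per permutation; that storage layout and the helper names are assumptions made for illustration, not a description of any MB-MDR version.

```python
import math


def pairwise_tests(n_snps):
    """Number of unordered SNP pairs, i.e. two-locus tests to perform."""
    return math.comb(n_snps, 2)


def naive_storage_bytes(n_snps, n_permutations, bytes_per_stat=8):
    """Bytes needed if every permuted statistic were kept in memory
    (assumed naive layout, one value per pair per permutation)."""
    return pairwise_tests(n_snps) * n_permutations * bytes_per_stat


if __name__ == "__main__":
    n_snps, n_perm = 100_000, 999  # figures quoted in the abstract
    print(pairwise_tests(n_snps))                      # 4999950000 pairs
    print(naive_storage_bytes(n_snps, n_perm) / 1e12)  # size in terabytes
```

Under these assumptions the 100,000-SNP analysis from the abstract involves roughly 5 billion pairwise tests and would need on the order of 40 TB if all permuted statistics were stored, which is why an approach whose memory does not grow with the number of tests matters.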
Affiliation(s)
- François Van Lishout
- Systems and Modeling Unit, Montefiore Institute, University of Liège, 4000 Liège, Belgium.
|
19
|
Braem MGM, Voorhuis M, van der Schouw YT, Peeters PHM, Schouten LJ, Eijkemans MJC, Broekmans FJ, Onland-Moret NC. Interactions between genetic variants in AMH and AMHR2 may modify age at natural menopause. PLoS One 2013; 8:e59819. [PMID: 23544102 PMCID: PMC3609726 DOI: 10.1371/journal.pone.0059819] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 11/24/2012] [Accepted: 02/19/2013] [Indexed: 01/10/2023] Open
Abstract
The onset of menopause has important implications on women’s fertility and health. We previously identified genetic variants in genes involved in initial follicle recruitment as potential modifiers of age at natural menopause. The objective of this study was to extend our previous study, by searching for pairwise interactions between tagging single nucleotide polymorphisms (tSNPs) in the 5 genes previously selected (AMH, AMHR2, BMP15, FOXL2, GDF9). We performed a cross-sectional study among 3445 women with a natural menopause participating in the Prospect-EPIC study, a population-based prospective cohort study, initiated between 1993 and 1997. Based on the model-based multifactor dimensionality reduction (MB-MDR) test with a permutation-based maxT correction for multiple testing, we found a statistically significant interaction between rs10407022 in AMH and rs11170547 in AMHR2 (p = 0.019) associated with age at natural menopause. Rs10407022 did not have a statistically significant main effect. However, rs10407022 is an eQTL SNP that has been shown to influence mRNA expression levels in lymphoblastoid cell lines. This study provides additional insights into the genetic background of age at natural menopause and suggests a role of the AMH signaling pathway in the onset of natural menopause. However, these results remain suggestive and replication by independent studies is necessary.
|
20
|
De Lobel L, Thijs L, Kouznetsova T, Staessen JA, Van Steen K. A family-based association test to detect gene-gene interactions in the presence of linkage. Eur J Hum Genet 2012; 20:973-80. [PMID: 22419171 DOI: 10.1038/ejhg.2012.45] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022] Open
Abstract
For many complex diseases, quantitative traits contain more information than dichotomous traits. One of the approaches used to analyse these traits in family-based association studies is the quantitative transmission disequilibrium test (QTDT). The QTDT is a regression-based approach that simultaneously models linkage and association. It splits the association effect into a between-family and a within-family genetic component to adjust and test for population stratification, and includes a variance components method to model linkage. We extend this approach to detect gene-gene interactions between two unlinked QTLs by adjusting the definition of the between- and within-family components and the variance components included in the model. We simulate data to investigate the influence of the epistasis model, linkage disequilibrium patterns between the markers and the QTLs, and allele frequencies on the power and type I error rates of the approach. Results show that for some of the investigated settings, power gains are obtained in comparison with FAM-MDR. We conclude that our approach shows promising results for candidate-gene studies where too few markers are available to correct for population stratification using standard methods (for example EIGENSTRAT). The proposed method is applied to real-life data on hypertension from the FLEMENGHO study.
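The between- and within-family split mentioned above can be illustrated for a single marker. The sketch follows the decomposition as it is commonly described for the QTDT (family expectation from parental genotypes when available, otherwise the sibship mean); it does not reproduce the interaction extension proposed in this paper, and the function name and genotype coding (0, 1 or 2 copies of an allele) are assumptions.

```python
def between_within(offspring_genotypes, parental_genotypes=None):
    """Split offspring genotype scores into a shared between-family
    component and individual within-family deviations (illustrative)."""
    if parental_genotypes is not None:
        # Between-family component: average of the parental genotypes.
        between = sum(parental_genotypes) / len(parental_genotypes)
    else:
        # Without parents, fall back on the mean genotype of the sibship.
        between = sum(offspring_genotypes) / len(offspring_genotypes)

    # Within-family component: each offspring's deviation from it.
    within = [g - between for g in offspring_genotypes]
    return between, within


if __name__ == "__main__":
    # Two siblings with genotypes 0 and 2, both parents heterozygous (1, 1).
    print(between_within([0, 2], parental_genotypes=(1, 1)))
```

Only the within-family deviations are free of differences in allele frequency between families, which is why a test based on them is robust to population stratification.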
Affiliation(s)
- Lizzy De Lobel
- Department of Applied Mathematics and Computer Science, Ghent University, Ghent, Belgium.
|