1
Abegaz F, Van Lishout F, Mahachie John JM, Chiachoompu K, Bhardwaj A, Gusareva ES, Wei Z, Hakonarson H, Van Steen K. Epistasis Detection in Genome-Wide Screening for Complex Human Diseases in Structured Populations. SYSTEMS MEDICINE 2019. [DOI: 10.1089/sysm.2019.0003] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 01/08/2023] Open
Affiliation(s)
- Fentaw Abegaz
- GIGA-R, Medical Genomics—BIO3, University of Liege, Liege, Belgium
- Archana Bhardwaj
- GIGA-R, Medical Genomics—BIO3, University of Liege, Liege, Belgium
- Zhi Wei
- Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey
- Hakon Hakonarson
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, Pennsylvania
- Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Kristel Van Steen
- GIGA-R, Medical Genomics—BIO3, University of Liege, Liege, Belgium
- WELBIO (Walloon Excellence in Lifesciences and Biotechnology), University of Liege, Liege, Belgium
2
Van Steen K, Moore JH. How to increase our belief in discovered statistical interactions via large-scale association studies? Hum Genet 2019; 138:293-305. [PMID: 30840129 PMCID: PMC6483943 DOI: 10.1007/s00439-019-01987-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/26/2018] [Accepted: 02/20/2019] [Indexed: 12/31/2022]
Abstract
The understanding that differences in biological epistasis may impact disease risk, diagnosis, or disease management stands in wide contrast to the unavailability of widely accepted large-scale epistasis analysis protocols. Several choices in the analysis workflow will impact false-positive and false-negative rates. One of these choices relates to the exploitation of particular modelling or testing strategies. The strengths and limitations of these need to be well understood, as well as the contexts in which these hold. This will contribute to determining the potentially complementary value of epistasis detection workflows and is expected to increase replication success with biological relevance. In this contribution, we take a recently introduced regression-based epistasis detection tool as a leading example to review the key elements that need to be considered to fully appreciate the value of analytical epistasis detection performance assessments. We point out unresolved hurdles and give our perspectives towards overcoming these.
Affiliation(s)
- K Van Steen
- WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liege, Belgium.
- Department of Human Genetics, University of Leuven, Leuven, Belgium.
- J H Moore
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, USA
3
Ritchie MD, Van Steen K. The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation. ANNALS OF TRANSLATIONAL MEDICINE 2018; 6:157. [PMID: 29862246 DOI: 10.21037/atm.2018.04.05] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 12/18/2022]
Abstract
One of the primary goals in this era of precision medicine is to understand the biology of human diseases and their treatment, such that each individual patient receives the best possible treatment for their disease based on their genetic and environmental exposures. One way to work towards achieving this goal is to identify the environmental exposures and genetic variants that are relevant to each disease in question, as well as the complex interplay between genes and environment. Genome-wide association studies (GWAS) have allowed for a greater understanding of the genetic component of many complex traits. However, these genetic effects are largely small and thus, our ability to use these GWAS findings for precision medicine is limited. As more and more GWAS have been performed, rather than focusing only on common single nucleotide polymorphisms (SNPs) and additive genetic models, many researchers have begun to explore alternative heritable components of complex traits including rare variants, structural variants, epigenetics, and genetic interactions. While genetic interactions are a plausible reality that could explain some of the heritability that has not yet been identified, especially when one considers the identification of genetic interactions in model organisms as well as our understanding of biological complexity, still there are significant challenges and considerations in identifying these genetic interactions. Broadly, these can be summarized in three categories: abundance of methods, practical considerations, and biological interpretation. In this review, we will discuss these important elements in the search for genetic interactions along with some potential solutions. While genetic interactions are theoretically understood to be important for complex human disease, the body of evidence is still building to support this component of the underlying genetic architecture of complex human traits. Our hope is that more sophisticated modeling approaches and more robust computational techniques will enable the community to identify these important genetic interactions and improve our ability to implement precision medicine in the future.
Affiliation(s)
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Kristel Van Steen
- WELBIO, GIGA-R Medical Genomics Unit - BIO3, University of Liège, Liège, Belgium; Department of Human Genetics, University of Leuven, Leuven, Belgium
4
Sarbakhsh P, Mehrabi Y, Daneshpour MS, Zayeri F, Zarkesh M. Logic regression analysis of association of gene polymorphisms with low HDL: Tehran Lipid and Glucose Study. Gene 2013; 513:278-81. [DOI: 10.1016/j.gene.2012.10.084] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/29/2012] [Revised: 10/02/2012] [Accepted: 10/21/2012] [Indexed: 10/27/2022]
5
Abstract
Over the last few years, main effect genetic association analysis has proven to be a successful tool to unravel genetic risk components to a variety of complex diseases. In the quest for disease susceptibility factors and the search for the 'missing heritability', supplementary and complementary efforts have been undertaken. These include the inclusion of several genetic inheritance assumptions in model development, the consideration of different sources of information, and the acknowledgement of disease underlying pathways of networks. The search for epistasis or gene-gene interaction effects on traits of interest is marked by an exponential growth, not only in terms of methodological development, but also in terms of practical applications, translation of statistical epistasis to biological epistasis and integration of omics information sources. The current popularity of the field, as well as its attraction to interdisciplinary teams, each making valuable contributions with sometimes rather unique viewpoints, renders it impossible to give an exhaustive review of to-date available approaches for epistasis screening. The purpose of this work is to give a perspective view on a selection of currently active analysis strategies and concerns in the context of epistasis detection, and to provide an eye to the future of gene-gene interaction analysis.
Affiliation(s)
- Kristel Van Steen
- Department of Electrical Engineering and Computer Science (Montefiore Institute), Grande Traverse, Bioinformatique 4000 Liège 1, Belgium.
6
Black MH, Watanabe RM. A principal components-based clustering method to identify variants associated with complex traits. Hum Hered 2011; 71:50-8. [PMID: 21389731 DOI: 10.1159/000323567] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/18/2010] [Accepted: 12/13/2010] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Multivariate methods ranging from joint SNP to principal components analysis (PCA) have been developed for testing multiple markers in a region for association with disease and disease-related traits. However, these methods suffer from low power and/or the inability to identify the subset of markers contributing to evidence for association under various scenarios. METHODS: We introduce orthoblique principal components-based clustering (OPCC) as an alternative approach to identify specific subsets of markers showing association with a quantitative outcome of interest. We demonstrate the utility of OPCC using simulation studies and an example from the literature on type 2 diabetes. RESULTS: Compared to traditional methods, OPCC has similar or improved power under various scenarios of linkage disequilibrium structure and genotype availability. Most importantly, our simulations show how OPCC accurately parses large numbers of markers to a subset containing the causal variant or its proxy. CONCLUSION: OPCC is a powerful and efficient data reduction method for detecting associations between gene variants and disease-related traits. Unlike alternative methodologies, OPCC has the ability to isolate the effect of causal SNP(s) from among large sets of markers in a candidate region. Therefore, OPCC is an improvement over PCA for testing multiple SNP associations with phenotypes of interest.
Affiliation(s)
- Mary Helen Black
- Department of Preventive Medicine, Keck School of Medicine of USC, Los Angeles, CA 90089-9011, USA
7
Wolf BJ, Hill EG, Slate EH. Logic Forest: an ensemble classifier for discovering logical combinations of binary markers. Bioinformatics 2010; 26:2183-9. [PMID: 20628070 DOI: 10.1093/bioinformatics/btq354] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Highly sensitive and specific screening tools may reduce disease-related mortality by enabling physicians to diagnose diseases in asymptomatic patients or at-risk individuals. Diagnostic tests based on multiple biomarkers may achieve the needed sensitivity and specificity to realize this clinical gain. RESULTS: Logic regression, a multivariable regression method predicting an outcome using logical combinations of binary predictors, yields interpretable models of the complex interactions in biologic systems. However, its performance degrades in noisy data. We extend logic regression for classification to an ensemble of logic trees (Logic Forest, LF). We conduct simulation studies comparing the ability of logic regression and LF to identify variable interactions predictive of disease status. Our findings indicate LF is superior to logic regression for identifying important predictors. We apply our method to single nucleotide polymorphism data to determine associations of genetic and health factors with periodontal disease. AVAILABILITY: LF code is publicly available on CRAN, http://cran.r-project.org/.
Affiliation(s)
- Bethany J Wolf
- Division of Biostatistics and Epidemiology, Medical University of South Carolina, 135 Cannon St., Charleston, SC, USA.
8
Black MH, Watanabe RM. A principal-components-based clustering method to identify multiple variants associated with rheumatoid arthritis and arthritis-related autoantibodies. BMC Proc 2009; 3 Suppl 7:S129. [PMID: 20017995 PMCID: PMC2795902 DOI: 10.1186/1753-6561-3-s7-s129] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
Multivariate techniques are an important area of investigation for studying contributions of multiple genetic variants to disease onset and pathology. We analyzed the Genetic Analysis Workshop 16 North American Rheumatoid Arthritis Consortium (NARAC) data using a principal-components analysis (PCA) with an orthoblique rotation to identify specific subsets of single-nucleotide polymorphisms (SNP) in the major histocompatibility complex (MHC) region associated with rheumatoid arthritis (RA) and rheumatoid factor IgM (RFUW), and compared this method with a traditional PC approach. Using the orthoblique PC-based clustering method, we identified new clusters of SNPs across the MHC region associated with RA and RFUW, and replicated known SNP cluster associations with RA, such as those in the HLA-DRB region.
Affiliation(s)
- Mary Helen Black
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, 1540 Alcazar Street, CHP 222-V, Los Angeles, California 90089, USA.
9
A conditional synergy index to assess biological interaction. Eur J Epidemiol 2009; 24:485-94. [PMID: 19669411 DOI: 10.1007/s10654-009-9378-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/20/2009] [Accepted: 07/21/2009] [Indexed: 10/20/2022]
Abstract
In genetic studies of complex diseases, a crucial task is to identify and quantify gene-gene interactions which are often defined as deviance from genetic additive effects. This statistical definition, however, does not need to reflect the biological interactions of genes. We propose a new method to detect gene-gene interactions. This new approach exploits the concept of synergy and antagonism that is appropriate to capture biological relationships. The conditional synergy index (CSI) describes the extent of interaction on the penetrance scale. We develop the CSI for two-locus disease models and cohort data. The index assumes genotypes to be dichotomized into risk-genotypes (exposed) and non-risk-genotypes (unexposed) but it does not assume the loci to be in linkage equilibrium. We investigate the performance of the CSI and compare it to classical epidemiological interaction measures like Rothman's synergy index (S) and the attributable proportion due to interaction (AP). In addition, the performance of an estimator of this new parameter is illustrated in a practical example.
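As an illustration of the classical measures named in this abstract, the following minimal sketch computes Rothman's synergy index (S) and the attributable proportion due to interaction (AP) from relative risks, using their standard epidemiological definitions. The function name and the example relative risks are hypothetical and do not come from the cited paper. The conditional synergy index itself is not reproduced here, because the abstract does not state its formula.

```python
def interaction_measures(rr10, rr01, rr11):
    """Additive-scale interaction measures for two dichotomized exposures.

    rr10: relative risk when only the first locus carries a risk genotype
    rr01: relative risk when only the second locus carries a risk genotype
    rr11: relative risk when both loci carry risk genotypes
    All relative risks use the doubly unexposed group as reference.
    (Argument names are illustrative assumptions.)
    """
    # Relative excess risk due to interaction
    reri = rr11 - rr10 - rr01 + 1.0
    # Attributable proportion due to interaction: share of the joint risk
    # that is attributed to the interaction
    ap = reri / rr11
    # Rothman's synergy index: joint excess risk over the sum of the
    # separate excess risks; undefined when that sum is zero
    denom = (rr10 - 1.0) + (rr01 - 1.0)
    s = (rr11 - 1.0) / denom if denom != 0 else float("nan")
    return {"RERI": reri, "AP": ap, "S": s}


# Hypothetical relative risks, chosen only to exercise the function
result = interaction_measures(rr10=2.0, rr01=3.0, rr11=7.0)
print(result)
```

Under these definitions, S greater than 1 (equivalently AP greater than 0) suggests synergy on the additive scale, and S less than 1 suggests antagonism.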
10
Branicki W, Brudnik U, Wojas-Pelc A. Interactions between HERC2, OCA2 and MC1R may influence human pigmentation phenotype. Ann Hum Genet 2009; 73:160-70. [PMID: 19208107 DOI: 10.1111/j.1469-1809.2009.00504.x] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.7] [Indexed: 11/30/2022]
Abstract
Human pigmentation is a polygenic trait which may be shaped by different kinds of gene-gene interactions. Recent studies have revealed that interactive effects between HERC2 and OCA2 may be responsible for blue eye colour determination in humans. Here we performed a population association study, examining important polymorphisms within the HERC2 and OCA2 genes. Furthermore, pooling these results with genotyping data for MC1R, ASIP and SLC45A2 obtained for the same population sample we also analysed potential genetic interactions affecting variation in eye, hair and skin colour. Our results confirmed the association of HERC2 rs12913832 with eye colour and showed that this SNP is also significantly associated with skin and hair colouration. It is also concluded that OCA2 rs1800407 is independently associated with eye colour. Finally, using various approaches we were able to show that there is an interaction between MC1R and HERC2 in determination of skin and hair colour in the studied population sample.
Affiliation(s)
- Wojciech Branicki
- Institute of Forensic Research, Section of Forensic Genetics, Westerplatte 9, Krakow, Poland.
11
Wang K. Genetic association tests in the presence of epistasis or gene-environment interaction. Genet Epidemiol 2009; 32:606-14. [PMID: 18435472 DOI: 10.1002/gepi.20336] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
Abstract
A genetic variant is very likely to manifest its effect on disease through its main effect as well as through its interaction with other genetic variants or environmental factors. Power to detect genetic variants can be greatly improved by modeling their main effects and their interaction effects through a common set of parameters or "generalized association parameters" (Chatterjee et al. [2006] Am. J. Hum. Genet. 79:1002-1016) because of the reduced number of degrees of freedom. Following this idea, I propose two models that extend the work by Chatterjee and colleagues. Particularly, I consider not only the case of relatively weak interaction effect compared to the main effect but also the case of relatively weak main effect. This latter case is perhaps more relevant to genetic association studies. The proposed methods are invariant to the choice of the allele for scoring genotypes or the choice of the reference genotype score. For each model, the asymptotic distribution of the likelihood ratio statistic is derived. Simulation studies suggest that the proposed methods are more powerful than existing ones under certain circumstances.
Affiliation(s)
- Kai Wang
- Department of Biostatistics, College of Public Health, The University of Iowa, Iowa City, Iowa 52242, USA.
12
Interactions among genetic variants from contractile pathway of vascular smooth muscle cell in essential hypertension susceptibility of Chinese Han population. Pharmacogenet Genomics 2008; 18:459-66. [DOI: 10.1097/fpc.0b013e3282f97fb2] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 01/16/2023]