Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Chanda P, Sucheston L, Zhang A, Ramanathan M. The interaction index, a novel information-theoretic metric for prioritizing interacting genetic variations and environmental factors. Eur J Hum Genet 2009;17:1274-86. [PMID: 19293841 DOI: 10.1038/ejhg.2009.38] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022] Open

Cited by Other Article(s)

Kunert-Graf JM, Sakhanenko NA, Galas DJ. Optimized permutation testing for information theoretic measures of multi-gene interactions. BMC Bioinformatics 2021;22:180. [PMID: 33827420 PMCID: PMC8028212 DOI: 10.1186/s12859-021-04107-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/01/2020] [Accepted: 03/29/2021] [Indexed: 11/17/2022] Open

Abstract

Background

Permutation testing is often considered the “gold standard” for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large.

Results

In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP–SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based on information theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10³ for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples, making it suitable for datasets with a large number of samples.

Conclusions

The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts.
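
As a rough illustration of the count-table idea described in this abstract, the sketch below runs a permutation test for the mutual information between a genotype-combination table and a binary phenotype without touching sample-level data. It is not the authors' implementation (their code is at the URL above). It assumes that a permuted table can be drawn directly with a multivariate hypergeometric sample, which has the same distribution as shuffling phenotype labels and recounting. The function names and the two-column (controls, cases) table layout are hypothetical.

```python
import numpy as np


def mutual_information(table):
    """Mutual information in bits between the rows and columns of a count table."""
    table = np.asarray(table, dtype=float)
    p = table / table.sum()                    # joint probabilities
    px = p.sum(axis=1, keepdims=True)          # row (genotype cell) marginals
    py = p.sum(axis=0, keepdims=True)          # column (phenotype) marginals
    nz = p > 0                                 # skip empty cells (0 * log 0 = 0)
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())


def permutation_p_value(table, n_perm=1000, seed=0):
    """Permutation p-value computed on count tables only.

    table has shape (n_genotype_cells, 2); columns are (controls, cases).
    Assumption: shuffling phenotype labels keeps both margins fixed, so the
    permuted case counts per cell follow a multivariate hypergeometric law.
    """
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=np.int64)
    cell_totals = table.sum(axis=1)
    n_cases = int(table[:, 1].sum())
    observed = mutual_information(table)

    # One row per permutation: number of cases falling in each genotype cell.
    cases = rng.multivariate_hypergeometric(cell_totals, n_cases, size=n_perm)

    exceed = 0
    for c in cases:
        permuted = np.column_stack([cell_totals - c, c])
        if mutual_information(permuted) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (1 + exceed) / (1 + n_perm)
```

In this sketch the cost of each permutation depends on the number of table cells rather than on the number of samples, which is the property the abstract highlights.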

Magaña J, Contreras MG, Keys KL, Risse-Adams O, Goddard PC, Zeiger AM, Mak ACY, Elhawary JR, Samedy-Bates LA, Lee E, Thakur N, Hu D, Eng C, Salazar S, Huntsman S, Hu T, Burchard EG, White MJ. An epistatic interaction between pre-natal smoke exposure and socioeconomic status has a significant impact on bronchodilator drug response in African American youth with asthma. BioData Min 2020;13:7. [PMID: 32636926 PMCID: PMC7333373 DOI: 10.1186/s13040-020-00218-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/05/2020] [Accepted: 06/23/2020] [Indexed: 11/30/2022] Open

Abstract

BACKGROUND

Asthma is one of the leading chronic illnesses among children in the United States. Asthma prevalence is higher among African Americans (11.2%) than among European Americans (7.7%). Bronchodilator medications are part of the first-line therapy, and the rescue medication, for acute asthma symptoms. Bronchodilator drug response (BDR) varies substantially among different racial/ethnic groups. Although asthma prevalence in African Americans is only 3.5 percentage points higher than in European Americans, asthma mortality among African Americans is four times that of European Americans; variation in BDR may play an important role in explaining this health disparity. To improve our understanding of disparate health outcomes in complex phenotypes such as BDR, it is important to consider interactions between environmental and biological variables.

RESULTS

We evaluated the impact of pairwise and three-variable interactions between environmental, social, and biological variables on BDR in 233 African American youth with asthma using Visualization of Statistical Epistasis Networks (ViSEN). ViSEN is a non-parametric entropy-based approach able to quantify interaction effects using an information-theory metric known as Information Gain (IG). We performed analyses in the full dataset and in sex-stratified subsets. Our analyses identified several interaction models significantly, and suggestively, associated with BDR. The strongest interaction significantly associated with BDR was a pairwise interaction between pre-natal smoke exposure and socioeconomic status (full dataset IG: 2.78%, p = 0.001; female IG: 7.27%, p = 0.004). Sex-stratified analyses yielded divergent results for females and males, indicating the presence of sex-specific effects.

CONCLUSIONS

Our study identified novel interaction effects significantly, and suggestively, associated with BDR in African American children with asthma. Notably, we found that all of the interactions identified by ViSEN were "pure" interaction effects, in that they were not the result of strong main effects on BDR, highlighting the complexity of the network of biological and environmental factors impacting this phenotype. Several associations uncovered by ViSEN would not have been detected using regression-based methods, thus emphasizing the importance of employing statistical methods optimized to detect both additive and non-additive interaction effects when studying complex phenotypes such as BDR. The information gained in this study increases our understanding and appreciation of the complex nature of the interactions between environmental and health-related factors that influence BDR and will be invaluable to biomedical researchers designing future studies.
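
To make the notion of a "pure" interaction effect in this abstract concrete, the following minimal sketch computes a pairwise information gain as IG(A;B;C) = I(A,B;C) - I(A;C) - I(B;C), a common entropy-based definition. It is an assumption that this matches the exact quantity and normalization ViSEN reports (the abstract gives IG as a percentage without stating the normalization). All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a sequence of hashable observations."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired observations."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def information_gain(a, b, c):
    """Information about c carried jointly by (a, b) beyond their main effects."""
    joint = list(zip(a, b))
    return (mutual_information(joint, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


# XOR-style toy data: neither factor alone predicts the outcome,
# but the pair determines it completely.
exposure = [0, 0, 1, 1]
status = [0, 1, 0, 1]
outcome = [0, 1, 1, 0]
ig = information_gain(exposure, status, outcome)
```

In the toy data both main effects are zero and the information gain equals the full one bit of outcome entropy, which is the kind of effect a main-effects-only regression would not pick up.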

Affiliation(s)

J. Magaña Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
M. G. Contreras Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA Department of Biology, San Francisco State University, San Francisco, CA USA
K. L. Keys Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA Berkeley Institute for Data Science, University of California, Berkeley, CA USA
O. Risse-Adams Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA Lowell Science Research Program, Lowell High School, San Francisco, CA USA Department of Biology, University of California, Santa Cruz, CA USA
P. C. Goddard Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA Department of Genetics, Stanford University, Stanford, CA USA
A. M. Zeiger Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA USA
A. C. Y. Mak Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
J. R. Elhawary Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
L. A. Samedy-Bates Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA USA
E. Lee Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
N. Thakur Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
D. Hu Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
C. Eng Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
S. Salazar Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
S. Huntsman Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA
T. Hu School of Computing, Queen’s University, Kingston, ON Canada
E. G. Burchard Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA USA
M. J. White Department of Medicine, University of California, 1550 4th Street, UCSF Rock Hall, Box 2911, San Francisco, CA 94158 USA

Chanda P, Costa E, Hu J, Sukumar S, Van Hemert J, Walia R. Information Theory in Computational Biology: Where We Stand Today. Entropy (Basel) 2020;22:E627. [PMID: 33286399 PMCID: PMC7517167 DOI: 10.3390/e22060627] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/30/2020] [Revised: 05/31/2020] [Accepted: 06/03/2020] [Indexed: 12/30/2022]

Mi K, Jiang Y, Chen J, Lv D, Qian Z, Sun H, Shang D. Construction and Analysis of Human Diseases and Metabolites Network. Front Bioeng Biotechnol 2020;8:398. [PMID: 32426349 PMCID: PMC7203444 DOI: 10.3389/fbioe.2020.00398] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/22/2019] [Accepted: 04/08/2020] [Indexed: 11/13/2022] Open

Wang S, Jeong HH, Sohn KA. ClearF: a supervised feature scoring method to find biomarkers using class-wise embedding and reconstruction. BMC Med Genomics 2019;12:95. [PMID: 31296201 PMCID: PMC6624178 DOI: 10.1186/s12920-019-0512-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Feature selection or scoring methods for the detection of biomarkers are essential in bioinformatics. Various feature selection methods have been developed for the detection of biomarkers, and several studies have employed information-theoretic approaches. However, most of these methods generally require a long processing time. In addition, information-theoretic methods discretize continuous features, which is a drawback that can lead to the loss of information.

RESULTS

In this paper, a novel supervised feature scoring method named ClearF is proposed. The proposed method is suitable for continuous-valued data, which is similar to the principle of feature selection using mutual information, with the added advantage of a reduced computation time. The proposed score calculation is motivated by the association between the reconstruction error and the information-theoretic measurement. Our method is based on class-wise low-dimensional embedding and the resulting reconstruction error. Given multi-class datasets such as a case-control study dataset, low-dimensional embedding is first applied to each class to obtain a compressed representation of the class, and also for the entire dataset. Reconstruction is then performed to calculate the error of each feature and the final score for each feature is defined in terms of the reconstruction errors. The correlation between the information theoretic measurement and the proposed method is demonstrated using a simulation. For performance validation, we compared the classification performance of the proposed method with those of various algorithms on benchmark datasets.

CONCLUSIONS

The proposed method showed higher accuracy and lower execution time than the other established methods. Moreover, an experiment was conducted on the TCGA breast cancer dataset, and it was confirmed that the genes with the highest scores were highly associated with subtypes of breast cancer.

Ferrario PG, König IR. Transferring entropy to the realm of GxG interactions. Brief Bioinform 2018;19:136-147. [PMID: 27769993 PMCID: PMC5862307 DOI: 10.1093/bib/bbw086] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/28/2016] [Indexed: 01/18/2023] Open

Yang CH, Weng ZJ, Chuang LY, Yang CS. Identification of SNP-SNP interaction for chronic dialysis patients. Comput Biol Med 2017;83:94-101. [DOI: 10.1016/j.compbiomed.2017.02.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/10/2016] [Revised: 02/14/2017] [Accepted: 02/15/2017] [Indexed: 01/10/2023]

Lee W, Sjölander A, Pawitan Y. A Critical Look at Entropy-Based Gene-Gene Interaction Measures. Genet Epidemiol 2016;40:416-24. [PMID: 27229752 DOI: 10.1002/gepi.21974] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/03/2015] [Revised: 02/28/2015] [Accepted: 03/17/2016] [Indexed: 11/12/2022]

CINOEDV: a co-information based method for detecting and visualizing n-order epistatic interactions. BMC Bioinformatics 2016;17:214. [PMID: 27184783 PMCID: PMC4869388 DOI: 10.1186/s12859-016-1076-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 11/26/2015] [Accepted: 05/07/2016] [Indexed: 12/13/2022] Open

Abstract

BACKGROUND

Detecting and visualizing nonlinear interaction effects of single nucleotide polymorphisms (SNPs), or epistatic interactions, are important topics in bioinformatics since they play an important role in unraveling the mystery of "missing heritability". However, related studies are almost entirely limited to pairwise epistatic interactions because of methodological and computational challenges.

RESULTS

We develop CINOEDV (Co-Information based N-Order Epistasis Detector and Visualizer) for the detection and visualization of epistatic interactions of orders from 1 to n (n ≥ 2). CINOEDV is composed of two stages, namely, a detecting stage and a visualizing stage. In the detecting stage, co-information based measures are employed to quantify association effects of n-order SNP combinations to the phenotype, and two types of search strategies are introduced to identify n-order epistatic interactions: an exhaustive search and a particle swarm optimization based search. In the visualizing stage, all detected n-order epistatic interactions are used to construct a hypergraph, where a real vertex represents the main effect of a SNP and a virtual vertex denotes the interaction effect of an n-order epistatic interaction. By deeply analyzing the constructed hypergraph, some hidden clues for better understanding the underlying genetic architecture of complex diseases could be revealed.

CONCLUSIONS

Experiments with CINOEDV and comparisons with existing state-of-the-art methods are performed on both simulated data sets and a real data set of age-related macular degeneration. Results demonstrate that CINOEDV is promising in detecting and visualizing n-order epistatic interactions. CINOEDV is implemented in R and is freely available from R CRAN: http://cran.r-project.org and https://sourceforge.net/projects/cinoedv/files/ .

An Improved Opposition-Based Learning Particle Swarm Optimization for the Detection of SNP-SNP Interactions. Biomed Res Int 2015;2015:524821. [PMID: 26236727 PMCID: PMC4509494 DOI: 10.1155/2015/524821] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 09/25/2014] [Revised: 12/30/2014] [Accepted: 01/02/2015] [Indexed: 12/22/2022]

Gusareva ES, Van Steen K. Practical aspects of genome-wide association interaction analysis. Hum Genet 2014;133:1343-58. [DOI: 10.1007/s00439-014-1480-y] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 05/21/2014] [Accepted: 08/18/2014] [Indexed: 12/31/2022]

Zuo X, Rao S, Fan A, Lin M, Li H, Zhao X, Qin J. To control false positives in gene-gene interaction analysis: two novel conditional entropy-based approaches. PLoS One 2013;8:e81984. [PMID: 24339984 PMCID: PMC3858311 DOI: 10.1371/journal.pone.0081984] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/08/2013] [Accepted: 10/19/2013] [Indexed: 11/24/2022] Open

Hu T, Chen Y, Kiralis JW, Moore JH. ViSEN: methodology and software for visualization of statistical epistasis networks. Genet Epidemiol 2013;37:283-5. [PMID: 23468157 DOI: 10.1002/gepi.21718] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 09/13/2012] [Revised: 12/20/2012] [Accepted: 02/05/2013] [Indexed: 11/06/2022]

Hu T, Chen Y, Kiralis JW, Collins RL, Wejse C, Sirugo G, Williams SM, Moore JH. An information-gain approach to detecting three-way epistatic interactions in genetic association studies. J Am Med Inform Assoc 2013;20:630-6. [PMID: 23396514 PMCID: PMC3721169 DOI: 10.1136/amiajnl-2012-001525] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Indexed: 01/01/2023] Open

Aschard H, Lutz S, Maus B, Duell EJ, Fingerlin TE, Chatterjee N, Kraft P, Van Steen K. Challenges and opportunities in genome-wide environmental interaction (GWEI) studies. Hum Genet 2012;131:1591-613. [PMID: 22760307 DOI: 10.1007/s00439-012-1192-0] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 03/01/2012] [Accepted: 06/11/2012] [Indexed: 02/03/2023]

Van Steen K. Perspectives on genome-wide multi-stage family-based association studies. Stat Med 2011;30:2201-21. [DOI: 10.1002/sim.4259] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/16/2010] [Accepted: 03/07/2011] [Indexed: 01/03/2023]

Van Steen K. Travelling the world of gene-gene interactions. Brief Bioinform 2011;13:1-19. [PMID: 21441561 DOI: 10.1093/bib/bbr012] [Citation(s) in RCA: 122] [Impact Index Per Article: 9.4] [Indexed: 12/20/2022] Open

Cattaert T, Urrea V, Naj AC, De Lobel L, De Wit V, Fu M, Mahachie John JM, Shen H, Calle ML, Ritchie MD, Edwards TL, Van Steen K. FAM-MDR: a flexible family-based multifactor dimensionality reduction technique to detect epistasis using related individuals. PLoS One 2010;5:e10304. [PMID: 20421984 PMCID: PMC2858665 DOI: 10.1371/journal.pone.0010304] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 01/12/2010] [Accepted: 03/01/2010] [Indexed: 12/05/2022] Open

Abstract

We propose a novel multifactor dimensionality reduction method for epistasis detection in small or extended pedigrees, FAM-MDR. It combines features of the Genome-wide Rapid Association using Mixed Model And Regression approach (GRAMMAR) with Model-Based MDR (MB-MDR). We focus on continuous traits, although the method is general and can be used for outcomes of any type, including binary and censored traits. When comparing FAM-MDR with Pedigree-based Generalized MDR (PGMDR), which is a generalization of Multifactor Dimensionality Reduction (MDR) to continuous traits and related individuals, FAM-MDR was found to outperform PGMDR in terms of power, in most of the considered simulated scenarios. Additional simulations revealed that PGMDR does not appropriately deal with multiple testing and consequently gives rise to overly optimistic results. FAM-MDR adequately deals with multiple testing in epistasis screens and is in contrast rather conservative, by construction. Furthermore, simulations show that correcting for lower order (main) effects is of utmost importance when claiming epistasis. As Type 2 Diabetes Mellitus (T2DM) is a complex phenotype likely influenced by gene-gene interactions, we applied FAM-MDR to examine data on glucose area-under-the-curve (GAUC), an endophenotype of T2DM for which multiple independent genetic associations have been observed, in the Amish Family Diabetes Study (AFDS). This application reveals that FAM-MDR makes more efficient use of the available data than PGMDR and can deal with multi-generational pedigrees more easily. In conclusion, we have validated FAM-MDR and compared it to PGMDR, the current state-of-the-art MDR method for family data, using both simulations and a practical dataset. FAM-MDR is found to outperform PGMDR in that it handles the multiple testing issue more correctly, has increased power, and efficiently uses all available information.
