1. Singhal P, Verma SS, Ritchie MD. Gene Interactions in Human Disease Studies-Evidence Is Mounting. Annu Rev Biomed Data Sci 2023; 6:377-395. [PMID: 37196359 DOI: 10.1146/annurev-biodatasci-102022-120818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2023]
Abstract
Despite monumental advances in molecular technology to generate genome sequence data at scale, there is still a considerable proportion of heritability in most complex diseases that remains unexplained. Because many of the discoveries have been single-nucleotide variants with small to moderate effects on disease, the functional implication of many of the variants is still unknown and, thus, we have limited new drug targets and therapeutics. We, and many others, posit that one primary factor that has limited our ability to identify novel drug targets from genome-wide association studies may be due to gene interactions (epistasis), gene-environment interactions, network/pathway effects, or multiomic relationships. We propose that many of these complex models explain much of the underlying genetic architecture of complex disease. In this review, we discuss the evidence from multiple research avenues, ranging from pairs of alleles to multiomic integration studies and pharmacogenomics, that supports the need for further investigation of gene interactions (or epistasis) in genetic and genomic studies of human disease. Our goal is to catalog the mounting evidence for epistasis in genetic studies and the connections between genetic interactions and human health and disease that could enable precision medicine of the future.
Affiliation(s)
- Pankhuri Singhal: Genetics and Epigenetics Graduate Group, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Shefali Setia Verma: Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Marylyn D Ritchie: Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA; Penn Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
2. Woodward AA, Urbanowicz RJ, Naj AC, Moore JH. Genetic heterogeneity: Challenges, impacts, and methods through an associative lens. Genet Epidemiol 2022; 46:555-571. [PMID: 35924480 PMCID: PMC9669229 DOI: 10.1002/gepi.22497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 07/06/2022] [Accepted: 07/19/2022] [Indexed: 01/07/2023]
Abstract
Genetic heterogeneity describes the occurrence of the same or similar phenotypes through different genetic mechanisms in different individuals. Robustly characterizing and accounting for genetic heterogeneity is crucial to pursuing the goals of precision medicine, for discovering novel disease biomarkers, and for identifying targets for treatments. Failure to account for genetic heterogeneity may lead to missed associations and incorrect inferences. Thus, it is critical to review the impact of genetic heterogeneity on the design and analysis of population level genetic studies, aspects that are often overlooked in the literature. In this review, we first contextualize our approach to genetic heterogeneity by proposing a high-level categorization of heterogeneity into "feature," "outcome," and "associative" heterogeneity, drawing on perspectives from epidemiology and machine learning to illustrate distinctions between them. We highlight the unique nature of genetic heterogeneity as a heterogeneous pattern of association that warrants specific methodological considerations. We then focus on the challenges that preclude effective detection and characterization of genetic heterogeneity across a variety of epidemiological contexts. Finally, we discuss systems heterogeneity as an integrated approach to using genetic and other high-dimensional multi-omic data in complex disease research.
Affiliation(s)
- Alexa A. Woodward: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Ryan J. Urbanowicz: Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
- Adam C. Naj: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jason H. Moore: Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
3.
Abstract
BACKGROUND: Autoimmune hepatitis has an unknown cause and genetic associations that are not disease-specific or always present. Clarification of its missing causality and heritability could improve prevention and management strategies.
AIMS: Describe the key epigenetic and genetic mechanisms that could account for missing causality and heritability in autoimmune hepatitis; indicate the prospects of these mechanisms as pivotal factors; and encourage investigations of their pathogenic role and therapeutic potential.
METHODS: English abstracts were identified in PubMed using multiple key search phrases. Several hundred abstracts and 210 full-length articles were reviewed.
RESULTS: Environmental induction of epigenetic changes is the prime candidate for explaining the missing causality of autoimmune hepatitis. Environmental factors (diet, toxic exposures) can alter chromatin structure and the production of micro-ribonucleic acids that affect gene expression. Epistatic interaction between unsuspected genes is the prime candidate for explaining the missing heritability. The non-additive, interactive effects of multiple genes could enhance their impact on the propensity and phenotype of autoimmune hepatitis. Transgenerational inheritance of acquired epigenetic marks constitutes another mechanism of transmitting parental adaptations that could affect susceptibility. Management strategies could range from lifestyle adjustments and nutritional supplements to precision editing of the epigenetic landscape.
CONCLUSIONS: Autoimmune hepatitis has a missing causality that might be explained by epigenetic changes induced by environmental factors and a missing heritability that might reflect epistatic gene interactions or transgenerational transmission of acquired epigenetic marks. These unassessed or under-evaluated areas warrant investigation.
4. Wang X, Cao X, Feng Y, Guo M, Yu G, Wang J. ELSSI: parallel SNP-SNP interactions detection by ensemble multi-type detectors. Brief Bioinform 2022; 23:6607749. [PMID: 35696639 DOI: 10.1093/bib/bbac213] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/25/2022] [Revised: 04/18/2022] [Accepted: 05/07/2022] [Indexed: 12/11/2022]
Abstract
With the development of high-throughput genotyping technology, single nucleotide polymorphism (SNP)-SNP interactions (SSIs) detection has become an essential way for understanding disease susceptibility. Various methods have been proposed to detect SSIs. However, given the disease complexity and bias of individual SSI detectors, these single-detector-based methods are generally unscalable for real genome-wide data and with unfavorable results. We propose a novel ensemble learning-based approach (ELSSI) that can significantly reduce the bias of individual detectors and their computational load. ELSSI randomly divides SNPs into different subsets and evaluates them by multi-type detectors in parallel. Particularly, ELSSI introduces a four-stage pipeline (generate, score, switch and filter) to iteratively generate new SNP combination subsets from SNP subsets, score the combination subset by individual detectors, switch high-score combinations to other detectors for re-scoring, then filter out combinations with low scores. This pipeline makes ELSSI able to detect high-order SSIs from large genome-wide datasets. Experimental results on various simulated and real genome-wide datasets show the superior efficacy of ELSSI to state-of-the-art methods in detecting SSIs, especially for high-order ones. ELSSI is applicable with moderate PCs on the Internet and flexible to assemble new detectors. The code of ELSSI is available at https://www.sdu-idea.cn/codes.php?name=ELSSI.
Affiliation(s)
- Xin Wang: School of Software, Shandong University, Jinan 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
- Xia Cao: College of Computer and Information Sciences, Southwest University, Chongqing 400715, China
- Yuantao Feng: College of Computer and Information Sciences, Southwest University, Chongqing 400715, China
- Maozu Guo: School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
- Guoxian Yu: School of Software, Shandong University, Jinan 250101, China
- Jun Wang: Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250101, China
5. Sakai T, Abe A, Shimizu M, Terauchi R. RIL-StEp: epistasis analysis of rice recombinant inbred lines reveals candidate interacting genes that control seed hull color and leaf chlorophyll content. G3 (Bethesda) 2021; 11:jkab130. [PMID: 33871605 PMCID: PMC8496299 DOI: 10.1093/g3journal/jkab130] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2021] [Accepted: 04/10/2021] [Indexed: 11/19/2022]
Abstract
Characterizing epistatic gene interactions is fundamental for understanding the genetic architecture of complex traits. However, due to the large number of potential gene combinations, detecting epistatic gene interactions is computationally demanding. A simple, easy-to-perform method for sensitive detection of epistasis is required. Due to their homozygous nature, use of recombinant inbred lines excludes the dominance effect of alleles and interactions involving heterozygous genotypes, thereby allowing detection of epistasis in a simple and interpretable model. Here, we present an approach called RIL-StEp (recombinant inbred lines stepwise epistasis detection) to detect epistasis using single-nucleotide polymorphisms in the genome. We applied the method to reveal epistasis affecting rice (Oryza sativa) seed hull color and leaf chlorophyll content and successfully identified pairs of genomic regions that presumably control these phenotypes. This method has the potential to improve our understanding of the genetic architecture of various traits of crops and other organisms.
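To illustrate why homozygous lines give a simple and interpretable epistasis model, the sketch below computes the additive-by-additive interaction contrast for one pair of loci from the four homozygous genotype classes. It is not the RIL-StEp procedure itself; the function name `additive_by_additive_contrast`, the 0/1 allele coding, and the toy phenotype values are assumptions made for illustration only.

```python
from statistics import mean


def additive_by_additive_contrast(records):
    """Interaction contrast for two loci in fully homozygous lines.

    records: iterable of (g1, g2, phenotype), where g1 and g2 are 0 or 1
    (hypothetical coding for the two homozygous genotypes at each locus).
    Returns m11 - m10 - m01 + m00, which is zero under a purely additive model.
    """
    groups = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
    for g1, g2, phenotype in records:
        groups[(g1, g2)].append(phenotype)
    # statistics.mean raises StatisticsError if a genotype class is empty.
    m = {key: mean(values) for key, values in groups.items()}
    return m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]


# Toy data (invented): the double-1 class is far above the additive expectation.
toy_lines = [
    (0, 0, 1.0), (0, 0, 1.2),
    (0, 1, 2.0), (0, 1, 2.2),
    (1, 0, 3.0), (1, 0, 3.2),
    (1, 1, 7.0), (1, 1, 7.2),
]
print(additive_by_additive_contrast(toy_lines))
```

With no heterozygotes there are only four genotype classes per locus pair, so a single contrast captures the interaction; a real analysis would also need a significance test and a correction for the number of pairs examined.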
Affiliation(s)
- Toshiyuki Sakai: Laboratory of Crop Evolution, Graduate School of Agriculture, Kyoto University, Mozume, Muko, Kyoto 617-0001, Japan; The Sainsbury Laboratory, University of East Anglia, Norwich Research Park, Norwich NR4 7UH, UK
- Akira Abe: Iwate Biotechnology Research Center, Kitakami, Iwate 024-0003, Japan
- Motoki Shimizu: Iwate Biotechnology Research Center, Kitakami, Iwate 024-0003, Japan
- Ryohei Terauchi: Laboratory of Crop Evolution, Graduate School of Agriculture, Kyoto University, Mozume, Muko, Kyoto 617-0001, Japan; Iwate Biotechnology Research Center, Kitakami, Iwate 024-0003, Japan
6. Vasilopoulou C, Morris AP, Giannakopoulos G, Duguez S, Duddy W. What Can Machine Learning Approaches in Genomics Tell Us about the Molecular Basis of Amyotrophic Lateral Sclerosis? J Pers Med 2020; 10:E247. [PMID: 33256133 PMCID: PMC7712791 DOI: 10.3390/jpm10040247] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/30/2020] [Revised: 11/21/2020] [Accepted: 11/23/2020] [Indexed: 02/07/2023]
Abstract
Amyotrophic Lateral Sclerosis (ALS) is the most common late-onset motor neuron disorder, but the molecular mechanisms and pathways underlying this disease remain elusive. This review (1) systematically identifies machine learning studies aimed at the understanding of the genetic architecture of ALS, (2) outlines the main challenges faced and compares the different approaches that have been used to confront them, and (3) compares the experimental designs and results produced by those approaches and describes their reproducibility in terms of biological results and the performances of the machine learning models. The majority of the collected studies incorporated prior knowledge of ALS into their feature selection approaches, and trained their machine learning models using genomic data combined with other types of mined knowledge including functional associations, protein-protein interactions, disease/tissue-specific information, epigenetic data, and known ALS phenotype-genotype associations. The importance of incorporating gene-gene interactions and cis-regulatory elements into the experimental design of future ALS machine learning studies is highlighted. Lastly, it is suggested that future advances in the genomic and machine learning fields will bring about a better understanding of ALS genetic architecture, and enable improved personalized approaches to this and other devastating and complex diseases.
Affiliation(s)
- Christina Vasilopoulou: Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry BT47 6SB, UK
- Andrew P. Morris: Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, University of Manchester, Manchester M13 9PT, UK
- George Giannakopoulos: Institute of Informatics and Telecommunications, NCSR Demokritos, 153 10 Aghia Paraskevi, Greece; Science For You (SciFY) PNPC, TEPA Lefkippos-NCSR Demokritos, 27, Neapoleos, 153 41 Ag. Paraskevi, Greece
- Stephanie Duguez: Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry BT47 6SB, UK
- William Duddy: Northern Ireland Centre for Stratified Medicine, Altnagelvin Hospital Campus, Ulster University, Londonderry BT47 6SB, UK
7. Cao X, Yu G, Ren W, Guo M, Wang J. DualWMDR: Detecting epistatic interaction with dual screening and multifactor dimensionality reduction. Hum Mutat 2019; 41:719-734. [DOI: 10.1002/humu.23951] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/19/2019] [Revised: 09/10/2019] [Accepted: 11/07/2019] [Indexed: 12/14/2022]
Affiliation(s)
- Xia Cao: College of Computer and Information Science, Southwest University, Chongqing, China
- Guoxian Yu: College of Computer and Information Science, Southwest University, Chongqing, China
- Wei Ren: College of Computer and Information Science, Southwest University, Chongqing, China
- Maozu Guo: School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China; Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing, China
- Jun Wang: College of Computer and Information Science, Southwest University, Chongqing, China
8. Epistasis detectably alters correlations between genomic sites in a narrow parameter window. PLoS One 2019; 14:e0214036. [PMID: 31150393 PMCID: PMC6544209 DOI: 10.1371/journal.pone.0214036] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/05/2019] [Accepted: 05/18/2019] [Indexed: 01/12/2023]
Abstract
Different genomic sites evolve inter-dependently due to the combined action of epistasis, defined as a non-multiplicative contribution of alleles at different loci to genome fitness, and the physical linkage of different loci in the genome. Both epistasis and linkage, partially compensated by recombination, cause correlations between allele frequencies at the loci (linkage disequilibrium, LD). The interaction and competition between epistasis and linkage are not fully understood, nor is their relative sensitivity to recombination. Modeling an adapting population in the presence of random mutation, natural selection, pairwise epistasis, and random genetic drift, we compare the contributions of epistasis and linkage. To this end, we use a panel of haplotype-based measures of LD and their various combinations calculated for epistatic and non-epistatic pairs separately. We compute the optimal percentages of detected and false positive pairs in a one-time sample of a population of moderate size. We demonstrate that true interacting pairs can be told apart in a sufficiently short genome within a narrow window of time and parameters. Outside of this parameter region, unless the population is extremely large, shared ancestry of individual sequences generates pervasive stochastic LD for non-interacting pairs masking true epistatic associations. In the presence of sufficiently strong recombination, linkage effects decrease faster than those of epistasis, and the detection of epistasis improves. We demonstrate that the epistasis component of locus association can be isolated, at a single time point, by averaging haplotype frequencies over multiple independent populations. These results demonstrate the existence of fundamental restrictions on the protocols for detecting true interactions in DNA sequence sets.
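The abstract refers to haplotype-based measures of LD without listing them. As a rough illustration, the sketch below computes two textbook measures, the coefficient D and the squared correlation r², from the four haplotype frequencies of a bi-allelic site pair. Choosing these two measures is an assumption on our part, and the function name and argument order are hypothetical; the paper's actual panel of measures may differ.

```python
def linkage_disequilibrium(f00, f01, f10, f11):
    """Return (D, r2) for two bi-allelic loci.

    fXY is the frequency of the haplotype carrying allele X at locus 1
    and allele Y at locus 2 (hypothetical 0/1 allele labels).
    """
    total = f00 + f01 + f10 + f11
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p1 = f10 + f11  # frequency of allele 1 at locus 1
    p2 = f01 + f11  # frequency of allele 1 at locus 2
    d = f11 - p1 * p2  # deviation from independent association
    denom = p1 * (1.0 - p1) * p2 * (1.0 - p2)
    # r2 is undefined when either locus is monomorphic.
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Toy frequencies (invented): the two matching haplotypes are over-represented.
d, r2 = linkage_disequilibrium(0.4, 0.1, 0.1, 0.4)
print(d, r2)
```

Both epistasis and shared ancestry can push D away from zero, which is why a single pair's LD value cannot by itself separate the two causes.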
9. Chen AH, Ge W, Metcalf W, Jakobsson E, Mainzer LS, Lipka AE. An assessment of true and false positive detection rates of stepwise epistatic model selection as a function of sample size and number of markers. Heredity (Edinb) 2019; 122:660-671. [PMID: 30443009 PMCID: PMC6462028 DOI: 10.1038/s41437-018-0162-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/02/2018] [Revised: 10/19/2018] [Accepted: 10/28/2018] [Indexed: 12/21/2022]
Abstract
Association studies have been successful at identifying genomic regions associated with important traits, but routinely employ models that only consider the additive contribution of an individual marker. Because quantitative trait variability typically arises from multiple additive and non-additive sources, utilization of statistical approaches that include main and two-way interaction marker effects of several loci in one model could lead to unprecedented characterization of these sources. Here we examine the ability of one such approach, called the Stepwise Procedure for constructing an Additive and Epistatic Multi-Locus model (SPAEML), to detect additive and epistatic signals simulated using maize and human marker data. Our results revealed that SPAEML was capable of detecting quantitative trait nucleotides (QTNs) at sample sizes as low as n = 300 and consistently specifying signals as additive and epistatic for larger sizes. Sample size and minor allele frequency had a major influence on SPAEML's ability to distinguish between additive and epistatic signals, while the number of markers tested did not. We conclude that SPAEML is a useful approach for providing further elucidation of the additive and epistatic sources contributing to trait variability when applied to a small subset of genome-wide markers located within specific genomic regions identified using a priori analyses.
Affiliation(s)
- Angela H Chen: Department of Statistics, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Weihao Ge: Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- William Metcalf: Department of Computer Sciences, Rose-Hulman Institute of Technology, Terre Haute, IN, 47803, USA
- Eric Jakobsson: Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Liudmila Sergeevna Mainzer: Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA; Department of Molecular and Integrative Physiology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Alexander E Lipka: Department of Crop Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
10. Van Steen K, Moore JH. How to increase our belief in discovered statistical interactions via large-scale association studies? Hum Genet 2019; 138:293-305. [PMID: 30840129 PMCID: PMC6483943 DOI: 10.1007/s00439-019-01987-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/26/2018] [Accepted: 02/20/2019] [Indexed: 12/31/2022]
Abstract
The understanding that differences in biological epistasis may impact disease risk, diagnosis, or disease management stands in wide contrast to the unavailability of widely accepted large-scale epistasis analysis protocols. Several choices in the analysis workflow will impact false-positive and false-negative rates. One of these choices relates to the exploitation of particular modelling or testing strategies. The strengths and limitations of these need to be well understood, as well as the contexts in which these hold. This will contribute to determining the potentially complementary value of epistasis detection workflows and is expected to increase replication success with biological relevance. In this contribution, we take a recently introduced regression-based epistasis detection tool as a leading example to review the key elements that need to be considered to fully appreciate the value of analytical epistasis detection performance assessments. We point out unresolved hurdles and give our perspectives towards overcoming these.
Affiliation(s)
- K Van Steen: WELBIO, GIGA-R Medical Genomics-BIO3, University of Liège, Liège, Belgium; Department of Human Genetics, University of Leuven, Leuven, Belgium
- J H Moore: Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, USA
11. A Low Resolution Epistasis Mapping Approach To Identify Chromosome Arm Interactions in Allohexaploid Wheat. G3: Genes, Genomes, Genetics 2019; 9:675-684. [PMID: 30455184 PMCID: PMC6404624 DOI: 10.1534/g3.118.200646] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/20/2022]
Abstract
Epistasis is an important contributor to genetic variance. In inbred populations, pairwise epistasis is present as additive by additive interactions. Testing for epistasis presents a multiple testing problem as the pairwise search space for modest numbers of markers is large. Single markers do not necessarily track functional units of interacting chromatin as well as haplotype based methods do. To harness the power of multiple markers while minimizing the number of tests conducted, we present a low resolution test for epistatic interactions across whole chromosome arms. Epistasis covariance matrices were constructed from the additive covariances of individual chromosome arms. These covariances were subsequently used to estimate an epistatic variance parameter while correcting for background additive and epistatic effects. We find significant epistasis for 2% of the interactions tested for four agronomic traits in a winter wheat breeding population. Interactions across homeologous chromosome arms were identified, but were less abundant than other chromosome arm pair interactions. The homeologous chromosome arm pair 4BL/4DL showed a strong negative relationship between additive and interaction effects that may be indicative of functional redundancy. Several chromosome arms appeared to act as hubs in an interaction network, suggesting that they may contain important regulatory factors. The differential patterns of epistasis across different traits demonstrate that detection of epistatic interactions is robust when correcting for background additive and epistatic effects in the population. The low resolution epistasis mapping method presented here identifies important epistatic interactions with a limited number of statistical tests at the cost of low precision.
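The abstract does not say exactly how the epistasis covariance matrices were built from the additive covariances. A common convention in quantitative genetics is to take the element-wise (Hadamard) product of two additive relationship matrices, and the sketch below assumes that convention for one pair of chromosome arms. The function name, the matrix names, and the toy values are all hypothetical.

```python
def hadamard(k_a, k_b):
    """Element-wise product of two equally sized square matrices (lists of lists)."""
    if len(k_a) != len(k_b) or any(len(ra) != len(rb) for ra, rb in zip(k_a, k_b)):
        raise ValueError("matrices must have the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(k_a, k_b)]


# Toy additive covariance matrices for three lines (invented values),
# one computed from markers on each of two chromosome arms.
k_arm_1 = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.2],
    [0.0, 0.2, 1.0],
]
k_arm_2 = [
    [1.0, 0.4, 0.1],
    [0.4, 1.0, 0.0],
    [0.1, 0.0, 1.0],
]

# Assumed arm-by-arm additive-by-additive covariance between lines.
k_epi = hadamard(k_arm_1, k_arm_2)
for row in k_epi:
    print(row)
```

Under this assumption, two lines share epistatic covariance for an arm pair only when they are related on both arms, which is what allows one variance parameter per arm pair instead of one test per marker pair.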
12. Pedruzzi G, Barlukova A, Rouzine IM. Evolutionary footprint of epistasis. PLoS Comput Biol 2018; 14:e1006426. [PMID: 30222748 PMCID: PMC6177197 DOI: 10.1371/journal.pcbi.1006426] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/17/2018] [Revised: 10/09/2018] [Accepted: 08/09/2018] [Indexed: 11/18/2022]
Abstract
Variation of an inherited trait across a population cannot be explained by additive contributions of relevant genes, due to epigenetic effects and biochemical interactions (epistasis). Detecting epistasis in genomic data still represents a significant challenge that requires a better understanding of epistasis from the mechanistic point of view. Using a standard Wright-Fisher model of bi-allelic asexual population, we study how compensatory epistasis affects the process of adaptation. The main result is a universal relationship between four haplotype frequencies of a single site pair in a genome, which depends only on the epistasis strength of the pair defined regarding Darwinian fitness. We demonstrate the existence, at any time point, of a quasi-equilibrium between epistasis and disorder (entropy) caused by random genetic drift and mutation. We verify the accuracy of these analytic results by Monte-Carlo simulation over a broad range of parameters, including the topology of the interacting network. Thus, epistasis assists the evolutionary transit through evolutionary hurdles leaving marks at the level of haplotype disequilibrium. The method allows determining selection coefficient for each site and the epistasis strength of each pair from a sequence set. The resulting ability to detect clusters of deleterious mutation close to full compensation is essential for biomedical applications. These findings help to understand the role of epistasis in multiple compensatory mutations in viral resistance to antivirals and immune response.
Affiliation(s)
- Gabriele Pedruzzi: Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, Paris, France
- Ayuna Barlukova: Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, Paris, France
- Igor M. Rouzine: Sorbonne Université, Institute de Biologie Paris-Seine, Laboratoire de Biologie Computationelle et Quantitative, Paris, France
13. Kukla-Bartoszek M, Pośpiech E, Spólnicka M, Karłowska-Pik J, Strapagiel D, Żądzińska E, Rosset I, Sobalska-Kwapis M, Słomka M, Walsh S, Kayser M, Sitek A, Branicki W. Investigating the impact of age-depended hair colour darkening during childhood on DNA-based hair colour prediction with the HIrisPlex system. Forensic Sci Int Genet 2018; 36:26-33. [DOI: 10.1016/j.fsigen.2018.06.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 02/14/2018] [Revised: 05/12/2018] [Accepted: 06/06/2018] [Indexed: 12/14/2022]
14. Ritchie MD, Van Steen K. The search for gene-gene interactions in genome-wide association studies: challenges in abundance of methods, practical considerations, and biological interpretation. Annals of Translational Medicine 2018; 6:157. [PMID: 29862246 DOI: 10.21037/atm.2018.04.05] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 12/18/2022]
Abstract
One of the primary goals in this era of precision medicine is to understand the biology of human diseases and their treatment, such that each individual patient receives the best possible treatment for their disease based on their genetic and environmental exposures. One way to work towards achieving this goal is to identify the environmental exposures and genetic variants that are relevant to each disease in question, as well as the complex interplay between genes and environment. Genome-wide association studies (GWAS) have allowed for a greater understanding of the genetic component of many complex traits. However, these genetic effects are largely small and thus, our ability to use these GWAS findings for precision medicine is limited. As more and more GWAS have been performed, rather than focusing only on common single nucleotide polymorphisms (SNPs) and additive genetic models, many researchers have begun to explore alternative heritable components of complex traits including rare variants, structural variants, epigenetics, and genetic interactions. While genetic interactions are a plausible reality that could explain some of the heritability that has not yet been identified, especially when one considers the identification of genetic interactions in model organisms as well as our understanding of biological complexity, still there are significant challenges and considerations in identifying these genetic interactions. Broadly, these can be summarized in three categories: abundance of methods, practical considerations, and biological interpretation. In this review, we will discuss these important elements in the search for genetic interactions along with some potential solutions. While genetic interactions are theoretically understood to be important for complex human disease, the body of evidence is still building to support this component of the underlying genetic architecture of complex human traits. Our hope is that more sophisticated modeling approaches and more robust computational techniques will enable the community to identify these important genetic interactions and improve our ability to implement precision medicine in the future.
Affiliation(s)
- Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Kristel Van Steen
- WELBIO, GIGA-R Medical Genomics Unit - BIO3, University of Liège, Liège, Belgium; Department of Human Genetics, University of Leuven, Leuven, Belgium
|
15
|
Uppu S, Krishna A, Gopalan RP. A Review on Methods for Detecting SNP Interactions in High-Dimensional Genomic Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:599-612. [PMID: 28060710 DOI: 10.1109/tcbb.2016.2635125] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 06/06/2023]
Abstract
In this era of genome-wide association studies (GWAS), the quest for understanding the genetic architecture of complex diseases is rapidly increasing more than ever before. The development of high throughput genotyping and next generation sequencing technologies enables genetic epidemiological analysis of large scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. This paper reviews the current methods and the related software packages to detect the SNP interactions that contribute to diseases. The issues that need to be considered when developing these models are addressed in this review. The paper also reviews the achievements in data simulation to evaluate the performance of these models. Further, it discusses the future of SNP interaction analysis.
|
16
|
Hall MA, Moore JH, Ritchie MD. Embracing Complex Associations in Common Traits: Critical Considerations for Precision Medicine. Trends Genet 2017; 32:470-484. [PMID: 27392675 DOI: 10.1016/j.tig.2016.06.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 04/04/2016] [Revised: 06/01/2016] [Accepted: 06/02/2016] [Indexed: 10/21/2022]
Abstract
Genome-wide association studies (GWAS) have identified numerous loci associated with human phenotypes. This approach, however, does not consider the richly diverse and complex environment with which humans interact throughout the life course, nor does it allow for interrelationships between genetic loci and across traits. As we move toward making precision medicine a reality, whereby we make predictions about disease risk based on genomic profiles, we need to identify improved predictive models of the relationship between genome and phenome. Methods that embrace pleiotropy (the effect of one locus on more than one trait), and gene-environment (G×E) and gene-gene (G×G) interactions, will further unveil the impact of alterations in biological pathways and identify genes that are only involved with disease in the context of the environment. This valuable information can be used to assess personal risk and choose the most appropriate medical interventions based on the genotype and environment of an individual, the whole premise of precision medicine.
Affiliation(s)
- Molly A Hall
- Institute for Biomedical Informatics, Departments of Genetics and Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, 3535 Market Street, Philadelphia, PA 19104, USA
- Jason H Moore
- Institute for Biomedical Informatics, Departments of Genetics and Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, 3535 Market Street, Philadelphia, PA 19104, USA
- Marylyn D Ritchie
- Biomedical and Translational Informatics, Geisinger Health System, Danville, PA, USA; Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Eberly College of Science, The Pennsylvania State University, University Park, PA, USA
|
17
|
Li R, Kim D, Ritchie MD. Methods to analyze big data in pharmacogenomics research. Pharmacogenomics 2017; 18:807-820. [PMID: 28612644 DOI: 10.2217/pgs-2016-0152] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/28/2022]
Abstract
The scale and scope of pharmacogenomics research continues to expand as the cost and efficiency of molecular data generation techniques advance. These new technologies give rise to enormous opportunity for the identification of important genetic and genomic factors important for drug treatment response. With this opportunity come significant challenges. Most of these can be categorized as 'big data' issues, facing not only pharmacogenomics, but other fields in the life sciences as well. In this review, we describe some of the analysis techniques and tools being implemented for genetic/genomic discovery in pharmacogenomics.
Affiliation(s)
- Ruowang Li
- Bioinformatics & Genomics Graduate Program, The Pennsylvania State University, University Park, PA 16802, USA
- Dokyoon Kim
- Biomedical & Translational Informatics Institute, Geisinger Health System, Danville, PA 17821, USA
- Marylyn D Ritchie
- Bioinformatics & Genomics Graduate Program, The Pennsylvania State University, University Park, PA 16802, USA; Biomedical & Translational Informatics Institute, Geisinger Health System, Danville, PA 17821, USA
|
18
|
Moore JH, Andrews PC, Olson RS, Carlson SE, Larock CR, Bulhoes MJ, O'Connor JP, Greytak EM, Armentrout SL. Grid-based stochastic search for hierarchical gene-gene interactions in population-based genetic studies of common human diseases. BioData Min 2017; 10:19. [PMID: 28572842 PMCID: PMC5450417 DOI: 10.1186/s13040-017-0139-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 12/15/2016] [Accepted: 05/18/2017] [Indexed: 11/18/2022]
Abstract
Background: Large-scale genetic studies of common human diseases have focused almost exclusively on the independent main effects of single-nucleotide polymorphisms (SNPs) on disease susceptibility. These studies have had some success, but much of the genetic architecture of common disease remains unexplained. Attention is now turning to detecting SNPs that impact disease susceptibility in the context of other genetic factors and environmental exposures. These context-dependent genetic effects can manifest themselves as non-additive interactions, which are more challenging to model using parametric statistical approaches. The dimensionality that results from a multitude of genotype combinations, which results from considering many SNPs simultaneously, renders these approaches underpowered. We previously developed the multifactor dimensionality reduction (MDR) approach as a nonparametric and genetic model-free machine learning alternative. Approaches such as MDR can improve the power to detect gene-gene interactions but are limited in their ability to exhaustively consider SNP combinations in genome-wide association studies (GWAS), due to the combinatorial explosion of the search space. We introduce here a stochastic search algorithm called Crush for the application of MDR to modeling high-order gene-gene interactions in genome-wide data. The Crush-MDR approach uses expert knowledge to guide probabilistic searches within a framework that capitalizes on the use of biological knowledge to filter gene sets prior to analysis. Here we evaluated the ability of Crush-MDR to detect hierarchical sets of interacting SNPs using a biology-based simulation strategy that assumes non-additive interactions within genes and additivity in genetic effects between sets of genes within a biochemical pathway.
Results: We show that Crush-MDR is able to identify genetic effects at the gene or pathway level significantly better than a baseline random search with the same number of model evaluations. We then applied the same methodology to a GWAS for Alzheimer’s disease and showed base level validation that Crush-MDR was able to identify a set of interacting genes with biological ties to Alzheimer’s disease.
Conclusions: We discuss the role of stochastic search and cloud computing for detecting complex genetic effects in genome-wide data.
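The core MDR step that this abstract builds on can be illustrated with a minimal Python sketch. It assumes genotypes coded as small integers (0/1/2), binary case/control labels with at least one case and one control, and scoring by balanced accuracy. The function names (`mdr_score`, `best_pair`) and the toy data are illustrative and not taken from the paper. Cross-validation is omitted, and the exhaustive loop shown here stands in for, and is not, the stochastic, knowledge-guided search that Crush performs.

```python
from collections import defaultdict
from itertools import combinations


def mdr_score(genotypes, labels, snp_a, snp_b):
    """Balanced accuracy of the MDR high/low-risk rule for one SNP pair."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    threshold = n_cases / n_controls  # overall case:control ratio

    # Count controls and cases in every two-locus genotype cell.
    cells = defaultdict(lambda: [0, 0])  # (ga, gb) -> [controls, cases]
    for row, y in zip(genotypes, labels):
        cells[(row[snp_a], row[snp_b])][y] += 1

    # A cell is "high risk" when its case:control ratio exceeds the overall ratio.
    high_risk = {k for k, (ctrl, case) in cells.items() if case > threshold * ctrl}

    true_pos = sum(cells[k][1] for k in high_risk)
    true_neg = sum(ctrl for k, (ctrl, _) in cells.items() if k not in high_risk)
    return 0.5 * (true_pos / n_cases + true_neg / n_controls)


def best_pair(genotypes, labels):
    """Exhaustively score all SNP pairs and return the best-scoring one."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), 2),
               key=lambda pair: mdr_score(genotypes, labels, *pair))


# Toy data (hypothetical): disease status is the XOR of SNP 0 and SNP 1,
# so neither SNP has a main effect; SNP 2 is unrelated to the phenotype.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, _ in genotypes]

print(best_pair(genotypes, labels))
```

On this toy data the purely interacting pair (SNP 0, SNP 1) is recovered even though neither SNP shows a marginal effect, which is the situation the abstract describes as hard for parametric main-effect models.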
Affiliation(s)
- Jason H Moore
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Peter C Andrews
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Randal S Olson
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
|
19
|
Mitra I, Lavillaureix A, Yeh E, Traglia M, Tsang K, Bearden CE, Rauen KA, Weiss LA. Reverse Pathway Genetic Approach Identifies Epistasis in Autism Spectrum Disorders. PLoS Genet 2017; 13:e1006516. [PMID: 28076348 PMCID: PMC5226683 DOI: 10.1371/journal.pgen.1006516] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 09/01/2016] [Accepted: 12/01/2016] [Indexed: 02/08/2023]
Abstract
Although gene-gene interaction, or epistasis, plays a large role in complex traits in model organisms, genome-wide by genome-wide searches for two-way interaction have limited power in human studies. We thus used knowledge of a biological pathway in order to identify a contribution of epistasis to autism spectrum disorders (ASDs) in humans, a reverse-pathway genetic approach. Based on previous observation of increased ASD symptoms in Mendelian disorders of the Ras/MAPK pathway (RASopathies), we showed that common SNPs in RASopathy genes show enrichment for association signal in GWAS (P = 0.02). We then screened genome-wide for interactors with RASopathy gene SNPs and showed strong enrichment in ASD-affected individuals (P < 2.2 × 10⁻¹⁶), with a number of pairwise interactions meeting genome-wide criteria for significance. Finally, we utilized quantitative measures of ASD symptoms in RASopathy-affected individuals to perform modifier mapping via GWAS. One top region overlapped between these independent approaches, and we showed dysregulation of a gene in this region, GPR141, in a RASopathy neural cell line. We thus used orthogonal approaches to provide strong evidence for a contribution of epistasis to ASDs, confirm a role for the Ras/MAPK pathway in idiopathic ASDs, and to identify a convergent candidate gene that may interact with the Ras/MAPK pathway.
Affiliation(s)
- Ileena Mitra
- Department of Psychiatry, University of California San Francisco, San Francisco, California, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Alinoë Lavillaureix
- Department of Psychiatry, University of California San Francisco, San Francisco, California, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Université Paris Descartes, Sorbonne Paris Cité, Faculty of Medicine, Paris, France
- Erika Yeh
- Department of Psychiatry, University of California San Francisco, San Francisco, California, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Michela Traglia
- Department of Psychiatry, University of California San Francisco, San Francisco, California, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Kathryn Tsang
- Department of Psychiatry, University of California San Francisco, San Francisco, California, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Carrie E. Bearden
- Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, Los Angeles, California, United States of America
- Department of Psychology, University of California Los Angeles, Los Angeles, California, United States of America
- Katherine A. Rauen
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Department of Pediatrics, School of Medicine, University of California San Francisco, San Francisco, California, United States of America
- Lauren A. Weiss
- Department of Psychiatry, University of California San Francisco, San Francisco, California, United States of America
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
|
20
|
Identifying gene-gene interactions that are highly associated with four quantitative lipid traits across multiple cohorts. Hum Genet 2016; 136:165-178. [PMID: 27848076 DOI: 10.1007/s00439-016-1738-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/25/2016] [Accepted: 10/07/2016] [Indexed: 10/20/2022]
Abstract
Genetic loci explain only 25-30% of the heritability observed in plasma lipid traits. Epistasis, or gene-gene interactions, may contribute to a portion of this missing heritability. Using the genetic data from five NHLBI cohorts of 24,837 individuals, we combined the use of the quantitative multifactor dimensionality reduction (QMDR) algorithm with two SNP-filtering methods to exhaustively search for SNP-SNP interactions that are associated with HDL cholesterol (HDL-C), LDL cholesterol (LDL-C), total cholesterol (TC) and triglycerides (TG). SNPs were filtered either on the strength of their independent effects (main effect filter) or the prior knowledge supporting a given interaction (Biofilter). After the main effect filter, QMDR identified 20 SNP-SNP models associated with HDL-C, 6 associated with LDL-C, 3 associated with TC, and 10 associated with TG (permutation P value <0.05). With the use of Biofilter, we identified 2 SNP-SNP models associated with HDL-C, 3 associated with LDL-C, 1 associated with TC and 8 associated with TG (permutation P value <0.05). In an independent dataset of 7502 individuals from the eMERGE network, we replicated 14 of the interactions identified after main effect filtering: 11 for HDL-C, 1 for LDL-C and 2 for TG. We also replicated 23 of the interactions found to be associated with TG after applying Biofilter. Prior knowledge supports the possible role of these interactions in the genetic etiology of lipid traits. This study also presents a computationally efficient pipeline for analyzing data from large genotyping arrays and detecting SNP-SNP interactions that are not primarily driven by strong main effects.
|
21
|
Gompert Z, Egan SP, Barrett RDH, Feder JL, Nosil P. Multilocus approaches for the measurement of selection on correlated genetic loci. Mol Ecol 2016; 26:365-382. [DOI: 10.1111/mec.13867] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 02/08/2016] [Revised: 09/20/2016] [Accepted: 09/26/2016] [Indexed: 02/02/2023]
Affiliation(s)
- Scott P. Egan
- Department of BioSciences, Rice University, Houston, TX 77005, USA
- Jeffrey L. Feder
- Department of Biological Science, University of Notre Dame, South Bend, IN 46556, USA
- Patrik Nosil
- Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
|
22
|
Verma SS, Cooke Bailey JN, Lucas A, Bradford Y, Linneman JG, Hauser MA, Pasquale LR, Peissig PL, Brilliant MH, McCarty CA, Haines JL, Wiggs JL, Vrabec TR, Tromp G, Ritchie MD. Epistatic Gene-Based Interaction Analyses for Glaucoma in eMERGE and NEIGHBOR Consortium. PLoS Genet 2016; 12:e1006186. [PMID: 27623284 PMCID: PMC5021356 DOI: 10.1371/journal.pgen.1006186] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 01/26/2016] [Accepted: 06/22/2016] [Indexed: 12/22/2022]
Abstract
Primary open angle glaucoma (POAG) is a complex disease and is one of the major leading causes of blindness worldwide. Genome-wide association studies have successfully identified several common variants associated with glaucoma; however, most of these variants only explain a small proportion of the genetic risk. Apart from the standard approach to identify main effects of variants across the genome, it is believed that gene-gene interactions can help elucidate part of the missing heritability by allowing for the test of interactions between genetic variants to mimic the complex nature of biology. To explain the etiology of glaucoma, we first performed a genome-wide association study (GWAS) on glaucoma case-control samples obtained from electronic medical records (EMR) to establish the utility of EMR data in detecting non-spurious and relevant associations; this analysis was aimed at confirming already known associations with glaucoma and validating the EMR derived glaucoma phenotype. Our findings from GWAS suggest consistent evidence of several known associations in POAG. We then performed an interaction analysis for variants found to be marginally associated with glaucoma (SNPs with main effect p-value <0.01) and observed interesting findings in the electronic MEdical Records and GEnomics Network (eMERGE) network dataset. Genes from the top epistatic interactions from eMERGE data (Likelihood Ratio Test i.e. LRT p-value <1e-05) were then tested for replication in the NEIGHBOR consortium dataset. To replicate our findings, we performed a gene-based SNP-SNP interaction analysis in NEIGHBOR and observed significant gene-gene interactions (p-value <0.001) among the top 17 gene-gene models identified in the discovery phase. 
Variants from gene-gene interaction analysis that we found to be associated with POAG explain 3.5% of additional genetic variance in eMERGE dataset above what is explained by the SNPs in genes that are replicated from previous GWAS studies (which was only 2.1% variance explained in eMERGE dataset); in the NEIGHBOR dataset, adding replicated SNPs from gene-gene interaction analysis explain 3.4% of total variance whereas GWAS SNPs alone explain only 2.8% of variance. Exploring gene-gene interactions may provide additional insights into many complex traits when explored in properly designed and powered association studies. The complex nature of primary-open angle glaucoma (POAG) has left researchers exploring the genetic architecture and searching for the missing heritability using a number of different study designs. Over the past decade, many studies have been conducted to explain the etiology of POAG; however, a high proportion of estimated heritability still remains unexplained. GWA studies for POAG have identified significant associations but these associations have only explained a small proportion of the genetic risk (odds ratios range between 1–3). In this paper, we sought to confirm the primary genome-wide significant associations that have been discovered so far for glaucoma in phenotypes developed from EMR data in an effort to show that EMR data can be a powerful resource for finding genetic variants influencing POAG susceptibility. Next, we tested for statistical interactions, which can be presented as an important tool in an attempt to explain POAG heritability. We used a reduced list of variants filtered by marginal main effect analysis to look for epistatic interactions. We present our results from replication of gene-based interaction analyses performed in eMERGE and the NEIGHBOR consortium data. 
Using expression data and annotations from various publicly available databases, the most significant genes that replicated in our analyses show expression in the eye and trabecular meshwork. Analysis for estimation of genetic variance explained by significant associations from previous GWAS and replicated variants from gene-based interactions suggest that these explain 5.6% of variance in eMERGE dataset and also explain 3.4% variance in NEIGHBOR dataset.
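The variance figures quoted in this abstract can be reconciled with simple arithmetic, shown here as a short worked calculation. It assumes the reported percentages are directly additive, which is how the abstract presents them; the 0.6% difference for NEIGHBOR is derived here and is not stated in the abstract itself.

```latex
% eMERGE: previously replicated GWAS SNPs plus replicated interaction variants
\[
\underbrace{2.1\%}_{\text{GWAS SNPs}} + \underbrace{3.5\%}_{\text{interaction variants}} = 5.6\%
\]
% NEIGHBOR: total with interaction SNPs minus GWAS SNPs alone
\[
\underbrace{3.4\%}_{\text{total}} - \underbrace{2.8\%}_{\text{GWAS SNPs alone}} = 0.6\%
\]
```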
Affiliation(s)
- Shefali Setia Verma
- Department of Biomedical and Translational Informatics, Geisinger Health System, Danville, Pennsylvania, United States of America
- The Huck Institute of Life Sciences, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Jessica N. Cooke Bailey
- Department of Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Anastasia Lucas
- Department of Biomedical and Translational Informatics, Geisinger Health System, Danville, Pennsylvania, United States of America
- Yuki Bradford
- Department of Biomedical and Translational Informatics, Geisinger Health System, Danville, Pennsylvania, United States of America
- James G. Linneman
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- Michael A. Hauser
- Department of Ophthalmology, Duke University Medical Center, Durham, North Carolina, United States of America
- Louis R. Pasquale
- Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, Massachusetts, United States of America
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Peggy L. Peissig
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- Murray H. Brilliant
- Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, United States of America
- Jonathan L. Haines
- Department of Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Janey L. Wiggs
- Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, Massachusetts, United States of America
- Tamara R. Vrabec
- Department of Ophthalmology, Geisinger Health System, Danville, Pennsylvania, United States of America
- Gerard Tromp
- Division of Molecular Biology and Human Genetics, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, South Africa
- Marylyn D. Ritchie
- Department of Biomedical and Translational Informatics, Geisinger Health System, Danville, Pennsylvania, United States of America
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
|
23
|
Verma SS, Frase AT, Verma A, Pendergrass SA, Mahony S, Haas DW, Ritchie MD. Phenome-Wide Interaction Study (PheWIS) in AIDS Clinical Trials Group Data (ACTG). Pacific Symposium on Biocomputing 2016; 21:57-68. [PMID: 26776173 PMCID: PMC4722952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Association studies have shown and continue to show a substantial amount of success in identifying links between multiple single nucleotide polymorphisms (SNPs) and phenotypes. These studies are also believed to provide insights toward identification of new drug targets and therapies. Despite this success, challenges still remain for applying and prioritizing these associations based on available biological knowledge. Along with single variant association analysis, genetic interactions also play an important role in uncovering the etiology and progression of complex traits. For gene-gene interaction analysis, selection of the variants to test for associations still poses a challenge in identifying epistatic interactions among the large list of variants available in high-throughput, genome-wide datasets. Therefore in this study, we propose a pipeline to identify interactions among genetic variants that are associated with multiple phenotypes by prioritizing previously published results from main effect association analysis (genome-wide and phenome-wide association analysis) based on a-priori biological knowledge in AIDS Clinical Trials Group (ACTG) data. We approached the prioritization and filtration of variants by using the results of a previously published single variant PheWAS and then utilizing biological information from the Roadmap Epigenome project. We removed variants in low functional activity regions based on chromatin states annotation and then conducted an exhaustive pairwise interaction search using linear regression analysis. We performed this analysis in two independent pre-treatment clinical trial datasets from ACTG to allow for both discovery and replication.
Using a regression framework, we observed 50,798 associations that replicate at p-value 0.01 for 26 phenotypes, among which 2,176 associations for 212 unique SNPs for fasting blood glucose phenotype reach Bonferroni significance and an additional 9,970 interactions for high-density lipoprotein (HDL) phenotype and fasting blood glucose (total of 12,146 associations) reach FDR significance. We conclude that this method of prioritizing variants to look for epistatic interactions can be used extensively for generating hypotheses for genomewide and phenome-wide interaction analyses. This original Phenome-wide Interaction study (PheWIS) can be applied further to patients enrolled in randomized clinical trials to establish the relationship between patient's response to a particular drug therapy and non-linear combination of variants that might be affecting the outcome.
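An exhaustive pairwise interaction search with linear regression, as described in this abstract, can be sketched in pure Python. The abstract does not specify the test statistic, so this sketch assumes a likelihood ratio test between nested Gaussian linear models (with and without the SNP×SNP product term) and an asymptotic 1-df chi-square p-value, which is only a rough approximation at small sample sizes. It also assumes every SNP is polymorphic, so the normal equations are non-singular. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    p = len(X[0])
    # Augmented system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def interaction_pvalue(g1, g2, y):
    """LRT p-value for adding the g1*g2 term to y ~ 1 + g1 + g2."""
    n = len(y)
    reduced = [[1.0, a, b] for a, b in zip(g1, g2)]
    full = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]
    stat = n * math.log(ols_rss(reduced, y) / ols_rss(full, y))
    # Survival function of a chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


# Toy data (hypothetical): all 27 genotype combinations of three SNPs; the
# phenotype depends on the product of SNP 0 and SNP 1 plus a small
# deterministic perturbation, and not at all on SNP 2.
individuals = [(a, b, c) for a in range(3) for b in range(3) for c in range(3)]
snps = [list(col) for col in zip(*individuals)]
phenotype = [a * b + 0.1 * (b - 1) ** 2 for a, b, _ in individuals]

pvalues = {(i, j): interaction_pvalue(snps[i], snps[j], phenotype)
           for i, j in combinations(range(len(snps)), 2)}
for pair, p in sorted(pvalues.items(), key=lambda kv: kv[1]):
    print(pair, p)
```

In a real analysis the resulting p-values would then be corrected for the number of pairs tested (for example with Bonferroni or FDR, both of which the abstract mentions) and carried into an independent replication dataset.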
Affiliation(s)
- Shefali S Verma
- Center for System Genomics, The Pennsylvania State University, University Park, PA 16802, USA
|
24
|
Stephan J, Stegle O, Beyer A. A random forest approach to capture genetic effects in the presence of population structure. Nat Commun 2015; 6:7432. [DOI: 10.1038/ncomms8432] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.7] [Received: 07/27/2014] [Accepted: 05/08/2015] [Indexed: 01/07/2023]
|
25
|
Hall MA, Verma SS, Wallace J, Lucas A, Berg RL, Connolly J, Crawford DC, Crosslin DR, de Andrade M, Doheny KF, Haines JL, Harley JB, Jarvik GP, Kitchner T, Kuivaniemi H, Larson EB, Carrell DS, Tromp G, Vrabec TR, Pendergrass SA, McCarty CA, Ritchie MD. Biology-Driven Gene-Gene Interaction Analysis of Age-Related Cataract in the eMERGE Network. Genet Epidemiol 2015; 39:376-84. [PMID: 25982363 PMCID: PMC4550090 DOI: 10.1002/gepi.21902] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 12/05/2014] [Revised: 02/27/2015] [Accepted: 03/13/2015] [Indexed: 01/19/2023]
Abstract
Bioinformatics approaches to examine gene-gene models provide a means to discover interactions between multiple genes that underlie complex disease. Extensive computational demands and adjusting for multiple testing make uncovering genetic interactions a challenge. Here, we address these issues using our knowledge-driven filtering method, Biofilter, to identify putative single nucleotide polymorphism (SNP) interaction models for cataract susceptibility, thereby reducing the number of models for analysis. Models were evaluated in 3,377 European Americans (1,185 controls, 2,192 cases) from the Marshfield Clinic, a study site of the Electronic Medical Records and Genomics (eMERGE) Network, using logistic regression. All statistically significant models from the Marshfield Clinic were then evaluated in an independent dataset of 4,311 individuals (742 controls, 3,569 cases), using independent samples from additional study sites in the eMERGE Network: Mayo Clinic, Group Health/University of Washington, Vanderbilt University Medical Center, and Geisinger Health System. Eighty-three SNP-SNP models replicated in the independent dataset at likelihood ratio test P < 0.05. Among the most significant replicating models was rs12597188 (intron of CDH1)-rs11564445 (intron of CTNNB1). These genes are known to be involved in processes that include: cell-to-cell adhesion signaling, cell-cell junction organization, and cell-cell communication. Further Biofilter analysis of all replicating models revealed a number of common functions among the genes harboring the 83 replicating SNP-SNP models, which included signal transduction and PI3K-Akt signaling pathway. These findings demonstrate the utility of Biofilter as a biology-driven method, applicable for any genome-wide association study dataset.
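The computational and multiple-testing burden that knowledge-driven filtering is meant to reduce can be made concrete with a short Python sketch. The array size and the number of filtered models below are hypothetical round numbers chosen for illustration; they are not taken from the study, and Bonferroni is used only as a simple stand-in for whatever correction an analysis would apply.

```python
import math


def n_pairwise_models(n_snps):
    """Number of distinct SNP-SNP pairs among n_snps variants."""
    return math.comb(n_snps, 2)


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


N_SNPS = 500_000            # hypothetical genotyping-array size
N_FILTERED_MODELS = 20_000  # hypothetical number of knowledge-supported pairs

exhaustive = n_pairwise_models(N_SNPS)
print(exhaustive)  # 124999750000 pairwise models
print(bonferroni_threshold(exhaustive))
print(bonferroni_threshold(N_FILTERED_MODELS))
```

Under these assumed numbers, restricting the search to models supported by prior biological knowledge shrinks the test count by several orders of magnitude and relaxes the per-test threshold accordingly, at the cost of missing interactions that current knowledge does not predict.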
Collapse
Affiliation(s)
- Molly A Hall
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Eberly College of Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Shefali S Verma
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Eberly College of Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - John Wallace
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Eberly College of Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Anastasia Lucas
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Eberly College of Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Richard L Berg
- Marshfield Clinic, Marshfield, Wisconsin, United States of America
| | - John Connolly
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America
| | - Dana C Crawford
- Department of Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - David R Crosslin
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | | | - Kimberly F Doheny
- Center for Inherited Disease Research, IGM, Johns Hopkins University SOM, Baltimore, Maryland, United States of America
| | - Jonathan L Haines
- Department of Epidemiology and Biostatistics, Institute for Computational Biology, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - John B Harley
- Department of Pediatrics, Cincinnati Children's Hospital, University of Cincinnati, Cincinnati, Ohio, United States of America
| | - Gail P Jarvik
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America.,Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, Washington, United States of America
| | - Terrie Kitchner
- Marshfield Clinic, Marshfield, Wisconsin, United States of America
| | - Helena Kuivaniemi
- Geisinger Health System, Danville, Pennsylvania, United States of America
| | - Eric B Larson
- Group Health Research Institute, Seattle, Washington, United States of America
| | - David S Carrell
- Group Health Research Institute, Seattle, Washington, United States of America
| | - Gerard Tromp
- Geisinger Health System, Danville, Pennsylvania, United States of America
| | - Tamara R Vrabec
- Geisinger Health System, Danville, Pennsylvania, United States of America
| | | | | | - Marylyn D Ritchie
- Department of Biochemistry and Molecular Biology, Center for Systems Genomics, Eberly College of Science, The Pennsylvania State University, University Park, Pennsylvania, United States of America.,Geisinger Health System, Danville, Pennsylvania, United States of America
|
26
|
Ellis JA, Scurrah KJ, Li YR, Ponsonby AL, Chavez RA, Pezic A, Dwyer T, Akikusa JD, Allen RC, Becker ML, Thompson SD, Lie BA, Flatø B, Førre O, Punaro M, Wise C, Finkel TH, Hakonarson H, Munro JE. Epistasis amongst PTPN2 and genes of the vitamin D pathway contributes to risk of juvenile idiopathic arthritis. J Steroid Biochem Mol Biol 2015; 145:113-20. [PMID: 25460303 DOI: 10.1016/j.jsbmb.2014.10.012] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 08/15/2014] [Revised: 09/22/2014] [Accepted: 10/12/2014] [Indexed: 11/22/2022]
Abstract
Juvenile idiopathic arthritis (JIA) is a leading cause of childhood-onset disability. Although epistasis (gene-gene interaction) is frequently cited as an important component of heritability in complex diseases such as JIA, there is little compelling evidence that demonstrates such interaction. PTPN2, a vitamin D responsive gene, is a confirmed susceptibility gene in JIA, and PTPN2 has been suggested to interact with vitamin D pathway genes in type 1 diabetes. We therefore tested for evidence of epistasis amongst PTPN2 and the vitamin D pathway genes GC, VDR, CYP24A1, CYP2R1, and DHCR7 in two independent JIA case-control samples (discovery and replication). In the discovery sample (318 cases, 556 controls), we identified evidence in support of epistasis across six gene-gene combinations (e.g., GC rs1155563 and PTPN2 rs2542151, ORint=0.45, p=0.00085). Replication was obtained for three of these combinations. That is, for GC and PTPN2, CYP2R1 and VDR, and VDR and PTPN2, similar epistasis was observed using the same SNPs or correlated proxies in an independent JIA case-control sample (1008 cases, 9287 controls). Using SNP data imputed across a 4 Mb region spanning each gene, we obtained highly significant evidence for epistasis amongst all 6 gene-gene combinations identified in the discovery sample (p-values ranging from 5.6×10⁻⁹ to 7.5×10⁻⁷). This is the first report of epistasis in JIA risk. Epistasis amongst PTPN2 and vitamin D pathway genes was both demonstrated and replicated.
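To make the reported interaction odds ratio (ORint) concrete, the sketch below computes it from stratified case-control counts, assuming binary carrier coding for both SNPs. Under that assumption the interaction odds ratio equals the odds ratio for SNP A among carriers of SNP B divided by the odds ratio for SNP A among non-carriers. All counts are invented for illustration and are not data from the study; the function names are likewise hypothetical.

```python
def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Odds ratio of a 2x2 case-control table."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


def interaction_odds_ratio(table_b_absent, table_b_present):
    """Ratio of the SNP-A odds ratios in carriers vs non-carriers of SNP B.

    Each table is (case_exposed, case_unexposed, control_exposed,
    control_unexposed), where "exposed" means carrying the SNP-A allele.
    """
    return odds_ratio(*table_b_present) / odds_ratio(*table_b_absent)


# Invented counts (NOT study data), picked so that the arithmetic is easy
# to follow: OR = 2.0 without SNP B, OR = 0.9 with SNP B.
non_carriers_of_b = (60, 100, 90, 300)
carriers_of_b = (30, 50, 80, 120)

or_int = interaction_odds_ratio(non_carriers_of_b, carriers_of_b)
print(f"Interaction odds ratio = {or_int:.2f}")
```

A value below 1 means the effect of one variant is weaker (or reversed) in carriers of the other; a logistic regression with a product term estimates the same quantity while allowing covariate adjustment.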
Affiliation(s)
- Justine A Ellis
- Genes, Environment and Complex Disease, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia; Department of Paediatrics, University of Melbourne, Parkville, Victoria 3052, Australia.
| | - Katrina J Scurrah
- Department of Physiology, University of Melbourne, Parkville, Victoria 3052, Australia
| | - Yun R Li
- Medical Scientist Training Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA; Center for Applied Genomics and Department of Pediatrics, Abramson Research Center, The Children's Hospital of Philadelphia, Philadelphia,PA 19104, USA
| | - Anne-Louise Ponsonby
- Department of Paediatrics, University of Melbourne, Parkville, Victoria 3052, Australia; Environmental and Genetic Epidemiology Research, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia
| | - Raul A Chavez
- Genes, Environment and Complex Disease, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia; Department of Paediatrics, University of Melbourne, Parkville, Victoria 3052, Australia
| | - Angela Pezic
- Environmental and Genetic Epidemiology Research, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia
| | - Terence Dwyer
- Environmental and Genetic Epidemiology Research, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia
| | - Jonathan D Akikusa
- Paediatric Rheumatology Unit, Royal Children's Hospital, Parkville, Victoria 3052, Australia; Arthritis and Rheumatology Research, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia
| | - Roger C Allen
- Paediatric Rheumatology Unit, Royal Children's Hospital, Parkville, Victoria 3052, Australia; Arthritis and Rheumatology Research, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia
| | - Mara L Becker
- Divisions of Rheumatology and Clinical Pharmacology and Therapeutic Innovation, Children's Mercy Hospitals and Clinics, Kansas City, MO 64108, USA
| | - Susan D Thompson
- Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
| | - Benedicte A Lie
- Department of Immunology, Oslo University Hospital and University of Oslo, Rikshospitalet, 0027 Oslo, Norway
| | - Berit Flatø
- Department of Rheumatology, Oslo University Hospital, Rikshospitalet, 0027 Oslo, Norway
| | - Oystein Førre
- Department of Rheumatology, Oslo University Hospital, Rikshospitalet, 0027 Oslo, Norway
| | - Marilynn Punaro
- Pediatric Rheumatology, Texas Scottish Rite Hospital for Children, Dallas, TX 75219, USA
| | - Carol Wise
- Sarah M. and Charles E. Seay Center for Musculoskeletal Research, Texas Scottish Rite Hospital for Children, Dallas, TX 75219, USA
| | - Terri H Finkel
- Department of Pediatrics, Nemours Children's Hospital, Orlando, FL 32827, USA; Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Hakon Hakonarson
- Center for Applied Genomics and Department of Pediatrics, Abramson Research Center, The Children's Hospital of Philadelphia, Philadelphia,PA 19104, USA; Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Jane E Munro
- Paediatric Rheumatology Unit, Royal Children's Hospital, Parkville, Victoria 3052, Australia; Arthritis and Rheumatology Research, Murdoch Childrens Research Institute, Parkville, Victoria 3052, Australia
|
27
|
Levin L, Mishmar D. A Genetic View of the Mitochondrial Role in Ageing: Killing Us Softly. Adv Exp Med Biol 2015; 847:89-106. [DOI: 10.1007/978-1-4939-2404-2_4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 01/01/2023]
|
28
|
Floudas CS, Um N, Kamboh MI, Barmada MM, Visweswaran S. Identifying genetic interactions associated with late-onset Alzheimer's disease. BioData Min 2014; 7:35. [PMID: 25649863 PMCID: PMC4300162 DOI: 10.1186/s13040-014-0035-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 05/30/2014] [Accepted: 12/06/2014] [Indexed: 01/23/2023] Open
Abstract
Background: Identifying genetic interactions in data obtained from genome-wide association studies (GWASs) can help in understanding the genetic basis of complex diseases. The large number of single nucleotide polymorphisms (SNPs) in GWASs however makes the identification of genetic interactions computationally challenging. We developed the Bayesian Combinatorial Method (BCM) that can identify pairs of SNPs that in combination have high statistical association with disease. Results: We applied BCM to two late-onset Alzheimer’s disease (LOAD) GWAS datasets to identify SNPs that interact with known Alzheimer associated SNPs. We also compared BCM with logistic regression that is implemented in PLINK. Gene Ontology analysis of genes from the top 200 dataset SNPs for both GWAS datasets showed overrepresentation of LOAD-related terms. Four genes were common to both datasets: APOE and APOC1, which have well established associations with LOAD, and CAMK1D and FBXL13, not previously linked to LOAD but having evidence of involvement in LOAD. Supporting evidence was also found for additional genes from the top 30 dataset SNPs. Conclusion: BCM performed well in identifying several SNPs having evidence of involvement in the pathogenesis of LOAD that would not have been identified by univariate analysis due to small main effect. These results provide support for applying BCM to identify potential genetic variants such as SNPs from high dimensional GWAS datasets.
Affiliation(s)
- Charalampos S Floudas
- Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Boulevard, Pittsburgh, PA 15206 USA
| | - Nara Um
- Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Boulevard, Pittsburgh, PA 15206 USA
| | - M Ilyas Kamboh
- Department of Human Genetics, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261 USA
| | - Michael M Barmada
- Department of Human Genetics, University of Pittsburgh, 130 De Soto Street, Pittsburgh, PA 15261 USA
| | - Shyam Visweswaran
- Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Boulevard, Pittsburgh, PA 15206 USA ; The Intelligent Systems Program, University of Pittsburgh, 5113 Sennott Square 210 South Bouquet Street, Pittsburgh, PA 15260 USA
|
29
|
Wang X, Zhang D, Tzeng JY. Pathway-guided identification of gene-gene interactions. Ann Hum Genet 2014; 78:478-91. [PMID: 25227508 PMCID: PMC4363308 DOI: 10.1111/ahg.12080] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/18/2014] [Accepted: 07/03/2014] [Indexed: 12/26/2022]
Abstract
Assessing gene-gene interactions (GxG) at the gene level can permit examination of epistasis at biologically functional units with amplified interaction signals from marker-marker pairs. While current gene-based GxG methods tend to be designed for two or a few genes, for complex traits, it is often common to have a list of many candidate genes to explore GxG. We propose a regression model with pathway-guided regularization for detecting interactions among genes. Specifically, we use the principal components to summarize the SNP-SNP interactions between a gene pair, and use an L1 penalty that incorporates adaptive weights based on biological guidance and trait supervision to identify important main and interaction effects. Our approach aims to combine biological guidance and data adaptiveness, and yields credible findings that may be likely to shed insights in order to formulate biological hypotheses for further molecular studies. The proposed approach can be used to explore the GxG with a list of many candidate genes and is applicable even when sample size is smaller than the number of predictors studied. We evaluate the utility of the proposed method using simulation and real data analysis. The results suggest improved performance over methods not utilizing pathway and trait guidance.
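The abstract above mentions an L1 penalty with adaptive weights. A minimal sketch of how such weights change shrinkage is given below, using the generic weighted soft-thresholding step that underlies many lasso solvers (exact only for an orthonormal design). This is not the authors' algorithm; the coefficients, weights, and function names are hypothetical, with a small weight standing in for a term favored by pathway knowledge.

```python
def soft_threshold(beta: float, threshold: float) -> float:
    """Shrink beta toward zero by threshold; set it to zero if it is smaller."""
    if beta > threshold:
        return beta - threshold
    if beta < -threshold:
        return beta + threshold
    return 0.0


def adaptive_l1_shrink(betas, weights, lam):
    """Apply a per-coefficient threshold lam * weight (weighted L1 penalty)."""
    return [soft_threshold(b, lam * w) for b, w in zip(betas, weights)]


# Hypothetical unpenalized estimates for three interaction terms.
estimates = [0.8, 0.8, -0.3]
# Hypothetical adaptive weights: 0.5 = supported by prior knowledge,
# 2.0 = unsupported, 1.0 = neutral.
weights = [0.5, 2.0, 1.0]

shrunk = adaptive_l1_shrink(estimates, weights, lam=0.5)
print(shrunk)
```

The first two terms start from the same estimate, yet only the one with the smaller weight survives, which is the sense in which external guidance steers variable selection.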
Affiliation(s)
- Xin Wang
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Department of Statistics, North Carolina State University, Raleigh, NC, USA
| | - Daowen Zhang
- Department of Statistics, North Carolina State University, Raleigh, NC, USA
| | - Jung-Ying Tzeng
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Department of Statistics, North Carolina State University, Raleigh, NC, USA
- Department of Statistics, National Cheng-Kung University, Tainan, Taiwan
|
30
|
Abstract
Genome-wide association studies (GWASs) have become the focus of the statistical analysis of complex traits in humans, successfully shedding light on several aspects of genetic architecture and biological aetiology. Single-nucleotide polymorphisms (SNPs) are usually modelled as having additive, cumulative and independent effects on the phenotype. Although evidently a useful approach, it is often argued that this is not a realistic biological model and that epistasis (that is, the statistical interaction between SNPs) should be included. The purpose of this Review is to summarize recent directions in methodology for detecting epistasis and to discuss evidence of the role of epistasis in human complex trait variation. We also discuss the relevance of epistasis in the context of GWASs and potential hazards in the interpretation of statistical interaction terms.
|
31
|
Tada H, Won HH, Melander O, Yang J, Peloso GM, Kathiresan S. Multiple associated variants increase the heritability explained for plasma lipids and coronary artery disease. Circ Cardiovasc Genet 2014; 7:583-7. [PMID: 25170055 DOI: 10.1161/circgenetics.113.000420] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/16/2022]
Abstract
BACKGROUND Plasma lipid levels as well as coronary artery disease (CAD) have been shown to be highly heritable with estimates ranging from 40% to 60%. However, top variants detected by large-scale genome-wide association studies explain only a fraction of the total variance in plasma lipid phenotypes and CAD. METHODS AND RESULTS We performed a conditional and joint association analysis using summary-level statistics from 2 large genome-wide association meta-analyses: the Global Lipids Genetics Consortium (GLGC) study, and the Coronary Artery Disease Genome-Wide Replication and Meta-Analysis (CARDIoGRAM) study. There were 100 184 individuals from 46 GLGC studies for plasma lipids, and 22 233 cases and 64 762 controls from 14 studies for CAD. We detected several loci where multiple independent single-nucleotide polymorphisms were associated with lipid traits within a locus (12 out of 33 loci for high-density lipoprotein cholesterol, 10 of 35 loci for low-density lipoprotein cholesterol, 13 of 44 loci for total cholesterol, and 8 of 28 loci for triglycerides), reaching genome-wide significance (P<5×10⁻⁸), nearly doubling the heritability explained by genome-wide association studies (from 3.6 to 7.6% for high-density lipoprotein cholesterol, from 5.0 to 8.8% for low-density lipoprotein cholesterol, from 5.5 to 8.8% for total cholesterol, and from 5.7 to 8.5% for triglycerides). Multiple single-nucleotide polymorphisms were also associated with CAD (3 of 15 loci; an increase from 9.6% to 11.4% of heritability explained). CONCLUSIONS These results demonstrate that a portion of the missing heritability for lipid traits and CAD can be explained by multiple variants at each locus.
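The phrase "nearly doubling" can be checked directly from the percentages quoted in the abstract. The short sketch below computes the fold change and absolute gain per trait; the figures are copied from the abstract, while the variable and function names are only illustrative.

```python
# Heritability explained (%), before and after adding multiple independent
# variants per locus, as quoted in the abstract above.
explained = {
    "HDL cholesterol": (3.6, 7.6),
    "LDL cholesterol": (5.0, 8.8),
    "Total cholesterol": (5.5, 8.8),
    "Triglycerides": (5.7, 8.5),
    "CAD": (9.6, 11.4),
}


def fold_change(before: float, after: float) -> float:
    """Ratio of the heritability explained after vs before."""
    return after / before


for trait, (before, after) in explained.items():
    gain = after - before
    print(f"{trait}: x{fold_change(before, after):.2f} (+{gain:.1f} percentage points)")
```

Only HDL cholesterol actually exceeds a two-fold increase; the other traits gain between roughly one fifth and three quarters of their starting value.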
Affiliation(s)
- Hayato Tada
- From the Center for Human Genetic Research, Massachusetts General Hospital, Boston (H.T., H.-H.W., G.M.P., S.K.); Broad Institute, Program in Medical and Population Genetics, Cambridge, MA (H.T., H.-H.W., G.M.P., S.K.); Department of Clinical Sciences, Lund University, Lund, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.); and Queensland Institute of Medical Research, Brisbane, Queensland, Australia (J.Y.)
| | - Hong-Hee Won
- From the Center for Human Genetic Research, Massachusetts General Hospital, Boston (H.T., H.-H.W., G.M.P., S.K.); Broad Institute, Program in Medical and Population Genetics, Cambridge, MA (H.T., H.-H.W., G.M.P., S.K.); Department of Clinical Sciences, Lund University, Lund, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.); and Queensland Institute of Medical Research, Brisbane, Queensland, Australia (J.Y.)
| | - Olle Melander
- From the Center for Human Genetic Research, Massachusetts General Hospital, Boston (H.T., H.-H.W., G.M.P., S.K.); Broad Institute, Program in Medical and Population Genetics, Cambridge, MA (H.T., H.-H.W., G.M.P., S.K.); Department of Clinical Sciences, Lund University, Lund, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.); and Queensland Institute of Medical Research, Brisbane, Queensland, Australia (J.Y.)
| | - Jian Yang
- From the Center for Human Genetic Research, Massachusetts General Hospital, Boston (H.T., H.-H.W., G.M.P., S.K.); Broad Institute, Program in Medical and Population Genetics, Cambridge, MA (H.T., H.-H.W., G.M.P., S.K.); Department of Clinical Sciences, Lund University, Lund, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.); and Queensland Institute of Medical Research, Brisbane, Queensland, Australia (J.Y.)
| | - Gina M Peloso
- From the Center for Human Genetic Research, Massachusetts General Hospital, Boston (H.T., H.-H.W., G.M.P., S.K.); Broad Institute, Program in Medical and Population Genetics, Cambridge, MA (H.T., H.-H.W., G.M.P., S.K.); Department of Clinical Sciences, Lund University, Lund, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.); and Queensland Institute of Medical Research, Brisbane, Queensland, Australia (J.Y.)
| | - Sekar Kathiresan
- From the Center for Human Genetic Research, Massachusetts General Hospital, Boston (H.T., H.-H.W., G.M.P., S.K.); Broad Institute, Program in Medical and Population Genetics, Cambridge, MA (H.T., H.-H.W., G.M.P., S.K.); Department of Clinical Sciences, Lund University, Lund, Sweden (O.M.); Department of Internal Medicine, Skåne University Hospital, Malmö, Sweden (O.M.); and Queensland Institute of Medical Research, Brisbane, Queensland, Australia (J.Y.).
|
32
|
Deckers IA, van den Brandt PA, van Engeland M, van Schooten FJ, Godschalk RW, Keszei AP, Schouten LJ. Polymorphisms in genes of the renin-angiotensin-aldosterone system and renal cell cancer risk: interplay with hypertension and intakes of sodium, potassium and fluid. Int J Cancer 2014; 136:1104-16. [PMID: 24978482 DOI: 10.1002/ijc.29060] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 03/31/2014] [Accepted: 06/18/2014] [Indexed: 01/20/2023]
Abstract
Hypertension is an established risk factor for renal cell cancer (RCC). The renin-angiotensin-aldosterone system (RAAS) regulates blood pressure and is closely linked to hypertension. RAAS additionally influences homeostasis of electrolytes (e.g. sodium and potassium) and fluid. We investigated single nucleotide polymorphisms (SNPs) in RAAS and their interactions with hypertension and intakes of sodium, potassium and fluid regarding RCC risk in the Netherlands Cohort Study (NLCS), which was initiated in 1986 and included 120,852 participants aged 55 to 69 years. Diet and lifestyle were assessed by questionnaires and toenail clippings were collected. Genotyping of toenail DNA was performed using the SEQUENOM® MassARRAY® platform for a literature-based selection of 13 candidate SNPs in seven key RAAS genes. After 20.3 years of follow-up, Cox regression analyses were conducted using a case-cohort approach including 3,583 subcohort members and 503 RCC cases. Two SNPs in AGTR1 were associated with RCC risk. AGTR1_rs1492078 (AA vs. GG) decreased RCC risk [hazard ratio (HR) (95% confidence interval (CI)): 0.70(0.49-1.00)], whereas AGTR1_rs5186 (CC vs. AA) increased RCC risk [HR(95%CI): 1.49(1.08-2.05)]. Associations were stronger in participants with hypertension. The RCC risk for AGT_rs3889728 (AG + AA vs. GG) was modified by hypertension (p interaction = 0.039). SNP-diet interactions were not significant, although HRs suggested interaction between SNPs in ACE and sodium intake. SNPs in AGTR1 and AGT influenced RCC susceptibility, and their effects were modified by hypertension. Sodium intake was differentially associated with RCC risk across genotypes of several SNPs, yet some analyses had probably inadequate power to show significant interaction. Results suggest that RAAS may be a candidate pathway in RCC etiology.
Affiliation(s)
- Ivette A Deckers
- Department of Epidemiology, School for Oncology and Developmental Biology (GROW), Maastricht University Medical Centre, Maastricht, The Netherlands
|
33
|
Melchiotti R, Puan KJ, Andiappan AK, Poh TY, Starke M, Zhuang L, Petsch K, Lai TS, Chew FT, Larbi A, Wang DY, Poidinger M, Rotzschke O. Genetic analysis of an allergic rhinitis cohort reveals an intercellular epistasis between FAM134B and CD39. BMC Med Genet 2014; 15:73. [PMID: 24970562 PMCID: PMC4094447 DOI: 10.1186/1471-2350-15-73] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 03/05/2014] [Accepted: 06/23/2014] [Indexed: 02/11/2023]
Abstract
BACKGROUND Extracellular ATP is a pro-inflammatory molecule released by damaged cells. Regulatory T cells (Treg) can suppress inflammation by hydrolysing this molecule via ectonucleoside triphosphate diphosphohydrolase 1 (ENTPD1), also termed as CD39. Multiple studies have reported differences in CD39+ Treg percentages in diseases such as multiple sclerosis, Hepatitis B and HIV-1. In addition, CD39 polymorphisms have been implicated in immune-phenotypes such as susceptibility to inflammatory bowel disease and AIDS progression. However, none of the studies published so far has linked disease-associated variants with differences in CD39 Treg surface expression. This study aims at identifying variants affecting CD39 expression on Treg and at evaluating their association with allergic rhinitis, a disease characterized by a strong Treg involvement. METHODS Cohorts consisting of individuals of different ethnicities were employed to identify any association of CD39 variants to surface expression. Significant variant(s) were tested for disease association in a published GWAS cohort by one-locus and two-locus genetic analyses based on logistic models. Further functional characterization was performed using existing microarray data and quantitative RT-PCR on sorted cells. RESULTS Our study shows that rs7071836, a promoter SNP in the CD39 gene region, affects the cell surface expression on Treg cells but not on other CD39+ leukocyte subsets. Epistasis analysis revealed that, in conjunction with a SNP upstream of the FAM134B gene (rs257174), it increased the risk of allergic rhinitis (P = 1.98 × 10⁻⁶). As a promoter SNP, rs257174 controlled the expression of the gene in monocytes but, notably, not in Treg cells. Whole blood transcriptome data of three large cohorts indicated an inverse relation in the expression of the two proteins. While this observation was in line with the epistasis data, it also implied that a functional link must exist.
Exposure of monocytes to extracellular ATP resulted in an up-regulation of FAM134B gene expression, suggesting that extracellular ATP released from damaged cells represents the connection for the biological interaction of CD39 on Treg cells with FAM134B on monocytes. CONCLUSIONS The interplay between promoter SNPs of CD39 and FAM134B results in an intercellular epistasis which influences the risk of a complex inflammatory disease.
MESH Headings
- Antigens, CD/genetics
- Antigens, CD/immunology
- Apyrase/genetics
- Apyrase/immunology
- Case-Control Studies
- Epistasis, Genetic
- Genetic Variation
- Humans
- Intracellular Signaling Peptides and Proteins
- Membrane Proteins
- Monocytes/immunology
- Neoplasm Proteins/genetics
- Polymorphism, Single Nucleotide
- Promoter Regions, Genetic
- Reproducibility of Results
- Rhinitis, Allergic
- Rhinitis, Allergic, Perennial/genetics
- Rhinitis, Allergic, Perennial/immunology
- T-Lymphocytes, Regulatory/immunology
Affiliation(s)
- Rossella Melchiotti
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
- Doctoral School in Translational and Molecular Medicine (DIMET), University of Milano-Bicocca, Milan 20126, Italy
| | - Kia Joo Puan
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Anand Kumar Andiappan
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Tuang Yeow Poh
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Mireille Starke
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Li Zhuang
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Kerstin Petsch
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Tuck Siong Lai
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, Singapore 117543, Singapore
| | - Anis Larbi
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - De Yun Wang
- Department of Otolaryngology, National University of Singapore, Singapore 119228, Singapore
| | - Michael Poidinger
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
| | - Olaf Rotzschke
- SIgN (Singapore Immunology Network), A*STAR (Agency for Science, Technology and Research), Singapore 138648, Singapore
|
34
|
Ma L, Ballantyne C, Brautbar A, Keinan A. Analysis of multiple association studies provides evidence of an expression QTL hub in gene-gene interaction network affecting HDL cholesterol levels. PLoS One 2014; 9:e92469. [PMID: 24651390 PMCID: PMC3961362 DOI: 10.1371/journal.pone.0092469] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 01/02/2014] [Accepted: 02/21/2014] [Indexed: 11/18/2022] Open
Abstract
Epistasis has been suggested to underlie part of the missing heritability in genome-wide association studies. In this study, we first report an analysis of gene-gene interactions affecting HDL cholesterol (HDL-C) levels in a candidate gene study of 2,091 individuals with mixed dyslipidemia from a clinical trial. Two additional studies, the Atherosclerosis Risk in Communities study (ARIC; n = 9,713) and the Multi-Ethnic Study of Atherosclerosis (MESA; n = 2,685), were considered for replication. We identified a gene-gene interaction between rs1532085 and rs12980554 (P = 7.1×10⁻⁷) in their effect on HDL-C levels, which is significant after Bonferroni correction (Pc = 0.017) for the number of SNP pairs tested. The interaction successfully replicated in the ARIC study (P = 7.0×10⁻⁴; Pc = 0.02). Rs1532085, an expression QTL (eQTL) of LIPC, is one of the two SNPs involved in another, well-replicated gene-gene interaction underlying HDL-C levels. To further investigate the role of this eQTL SNP in gene-gene interactions affecting HDL-C, we tested in the ARIC study for interaction between this SNP and any other SNP genome-wide. We found the eQTL to be involved in a few suggestive interactions, one of which significantly replicated in MESA. Importantly, these gene-gene interactions, involving only rs1532085, explain an additional 1.4% variation of HDL-C, on top of the 0.65% explained by rs1532085 alone. LIPC plays a key role in the lipid metabolism pathway and it, and rs1532085 in particular, has been associated with HDL-C and other lipid levels. Collectively, we discovered several novel gene-gene interactions, all involving an eQTL of LIPC, thus suggesting a hub role of LIPC in the gene-gene interaction network that regulates HDL-C levels, which in turn raises the hypothesis that LIPC's contribution is largely via interactions with other lipid metabolism related genes.
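The relation between the raw and Bonferroni-corrected P values quoted above can be sketched as follows. The abstract does not state how many SNP pairs were tested, so the count printed here is only a back-calculation from rounded figures and should be read as an approximation; the function names are illustrative.

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected P value, capped at 1."""
    return min(1.0, p_value * n_tests)


def implied_number_of_tests(p_value: float, p_corrected: float) -> float:
    """Number of tests implied by a raw and a corrected P value."""
    return p_corrected / p_value


# Discovery-stage values quoted in the abstract (both are rounded there).
p_raw = 7.1e-7
p_corrected = 0.017

approx_pairs = implied_number_of_tests(p_raw, p_corrected)
print(f"Implied number of SNP pairs tested: about {approx_pairs:,.0f}")
print(f"Check: {bonferroni(p_raw, round(approx_pairs)):.3f}")
```

Because both inputs are rounded to two significant figures, the true number of tests could differ from this estimate by a few percent.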
Affiliation(s)
- Li Ma
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
| | - Christie Ballantyne
- Section of Cardiovascular Research, Department of Medicine, Baylor College of Medicine, Houston, Texas, United States of America
| | - Ariel Brautbar
- Section of Cardiovascular Research, Department of Medicine, Baylor College of Medicine, Houston, Texas, United States of America
- Department of Medical Genetics, Marshfield Clinic, Marshfield, Wisconsin, United States of America
| | - Alon Keinan
- Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America
|
35
|
de Oliveira LP, López I, dos Santos EMM, Tucci P, Marín M, Soares FA, Rossi BM, de Almeida Coudry R. Association of the p53 codon 72 polymorphism with clinicopathological characteristics of colorectal cancer through mRNA analysis. Oncol Rep 2013; 31:1396-406. [DOI: 10.3892/or.2013.2940] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/26/2013] [Accepted: 10/18/2013] [Indexed: 11/05/2022] Open
|
36
|
Affiliation(s)
- Valeriya Lyssenko
- Department of Clinical Sciences, Diabetes and Endocrinology, Lund University, Malmö, Sweden.
|
37
|
Abstract
Background It has been hypothesized that multivariate analysis and systematic detection of epistatic interactions between explanatory genotyping variables may help resolve the problem of "missing heritability" currently observed in genome-wide association studies (GWAS). However, even the simplest bivariate analysis is still held back by significant statistical and computational challenges that are often addressed by reducing the set of analysed markers. Theoretically, it has been shown that combinations of loci may exist that show weak or no effects individually, but show significant (even complete) explanatory power over phenotype when combined. Reducing the set of analysed SNPs before bivariate analysis could easily omit such critical loci. Results We have developed an exhaustive bivariate GWAS analysis methodology that yields a manageable subset of candidate marker pairs for subsequent analysis using other, often more computationally expensive techniques. Our model-free filtering approach is based on classification using ROC curve analysis, an alternative to much slower regression-based modelling techniques. Exhaustive analysis of studies containing approximately 450,000 SNPs and 5,000 samples requires only 2 hours using a desktop CPU or 13 minutes using a GPU (Graphics Processing Unit). We validate our methodology with analysis of simulated datasets as well as the seven Wellcome Trust Case-Control Consortium datasets that represent a wide range of real life GWAS challenges. We have identified SNP pairs that have considerably stronger association with disease than their individual component SNPs that often show negligible effect univariately. When compared against previously reported results in the literature, our methods re-detect most significant SNP-pairs and additionally detect many pairs absent from the literature that show strong association with disease. The high overlap suggests that our fast analysis could substitute for some slower alternatives. 
Conclusions We demonstrate that the proposed methodology is robust, fast and capable of exhaustive search for epistatic interactions using a standard desktop computer. First, our implementation is significantly faster than timings for comparable algorithms reported in the literature, especially as our method allows simultaneous use of multiple statistical filters with low computing time overhead. Second, for some diseases, we have identified hundreds of SNP pairs that pass formal multiple test (Bonferroni) correction and could form a rich source of hypotheses for follow-up analysis. Availability A web-based version of the software used for this analysis is available at http://bioinformatics.research.nicta.com.au/gwis.
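The model-free filtering idea in this abstract can be illustrated with a small sketch. It scores a SNP pair by ranking the joint genotype cells by their case fraction and taking the area under the resulting ROC curve, then scans all pairs exhaustively. The names `pair_auc` and `scan_pairs`, the 0/1/2 genotype coding, and the `min_auc` cutoff are assumptions made for illustration; this is not the authors' implementation, whose details are not given here.

```python
from collections import Counter
from itertools import combinations


def pair_auc(snp_a, snp_b, is_case):
    """AUC of a classifier that ranks joint genotype cells by case fraction.

    snp_a, snp_b: genotype codes (assumed 0, 1, 2) per sample.
    is_case: one boolean per sample (True = case, False = control).
    """
    cases, controls = Counter(), Counter()
    for a, b, y in zip(snp_a, snp_b, is_case):
        (cases if y else controls)[(a, b)] += 1
    n_case, n_ctrl = sum(cases.values()), sum(controls.values())
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")

    # Order the observed cells from most to least case-enriched.
    cells = set(cases) | set(controls)
    ordered = sorted(
        cells,
        key=lambda c: cases[c] / (cases[c] + controls[c]),
        reverse=True,
    )

    # Walk the ROC curve cell by cell and add trapezoid areas.
    auc = tpr = fpr = 0.0
    for c in ordered:
        new_tpr = tpr + cases[c] / n_case
        new_fpr = fpr + controls[c] / n_ctrl
        auc += (new_fpr - fpr) * (tpr + new_tpr) / 2
        tpr, fpr = new_tpr, new_fpr
    return auc


def scan_pairs(genotypes, is_case, min_auc=0.6):
    """Exhaustively score every SNP pair; keep those at or above min_auc.

    genotypes: dict mapping a (hypothetical) SNP name to its genotype list.
    """
    hits = []
    for name_a, name_b in combinations(genotypes, 2):
        auc = pair_auc(genotypes[name_a], genotypes[name_b], is_case)
        if auc >= min_auc:
            hits.append((auc, name_a, name_b))
    return sorted(hits, reverse=True)
```

Because the cells are ranked on the same data that is scored, this AUC is never below 0.5 and is optimistic; a filter of this kind only nominates candidate pairs for follow-up testing. On an XOR-like toy dataset, where neither SNP is informative alone, the pair still reaches a high AUC, which is the situation the abstract warns would be missed by pre-filtering on single-SNP effects.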
|
38
|
Inferring gene function and network organization in Drosophila signaling by combined analysis of pleiotropy and epistasis. G3-GENES GENOMES GENETICS 2013; 3:807-14. [PMID: 23550134 PMCID: PMC3656728 DOI: 10.1534/g3.113.005710] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023]
Abstract
High-throughput genetic interaction screens have enabled functional genomics on a network scale. Groups of cofunctional genes commonly exhibit similar interaction patterns across a large network, leading to novel functional inferences for a minority of previously uncharacterized genes within a group. However, such analyses are often unsuited to cases with a few relevant gene variants or sparse annotation. Here we describe an alternative analysis of cell growth signaling using a computational strategy that integrates patterns of pleiotropy and epistasis to infer how gene knockdowns enhance or suppress the effects of other knockdowns. We analyzed the interaction network for RNAi knockdowns of a set of 93 incompletely annotated genes in a Drosophila melanogaster model of cellular signaling. We inferred novel functional relationships between genes by modeling genetic interactions in terms of knockdown-to-knockdown influences. The method simultaneously analyzes the effects of partially pleiotropic genes on multiple quantitative phenotypes to infer a consistent model of each genetic interaction. From these models we proposed novel candidate Ras inhibitors and their Ras signaling interaction partners, and each of these hypotheses can be inferred independent of network-wide patterns. At the same time, the network-scale interaction patterns consistently mapped pathway organization. The analysis therefore assigns functional relevance to individual genetic interactions while also revealing global genetic architecture.
|
39
|
Setsirichok D, Tienboon P, Jaroonruang N, Kittichaijaroen S, Wongseree W, Piroonratana T, Usavanarong T, Limwongse C, Aporntewan C, Phadoongsidhi M, Chaiyaratana N. An omnibus permutation test on ensembles of two-locus analyses can detect pure epistasis and genetic heterogeneity in genome-wide association studies. SPRINGERPLUS 2013; 2:230. [PMID: 24804170 PMCID: PMC4006521 DOI: 10.1186/2193-1801-2-230] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/20/2012] [Accepted: 04/24/2013] [Indexed: 01/20/2023]
Abstract
This article presents the ability of an omnibus permutation test on ensembles of two-locus analyses (2LOmb) to detect pure epistasis in the presence of genetic heterogeneity. The performance of 2LOmb is evaluated in various simulation scenarios covering two independent causes of complex disease where each cause is governed by a purely epistatic interaction. Different scenarios are set up by varying the number of available single nucleotide polymorphisms (SNPs) in data, number of causative SNPs and ratio of case samples from two affected groups. The simulation results indicate that 2LOmb outperforms multifactor dimensionality reduction (MDR) and random forest (RF) techniques in terms of a low number of output SNPs and a high number of correctly-identified causative SNPs. Moreover, 2LOmb is capable of identifying the number of independent interactions in tractable computational time and can be used in genome-wide association studies. 2LOmb is subsequently applied to a type 1 diabetes mellitus (T1D) data set, which is collected from a UK population by the Wellcome Trust Case Control Consortium (WTCCC). After screening for SNPs that locate within or near genes and exhibit no marginal single-locus effects, the T1D data set is reduced to 95,991 SNPs from 12,146 genes. The 2LOmb search in the reduced T1D data set reveals that 12 SNPs, which can be divided into two independent sets, are associated with the disease. The first SNP set consists of three SNPs from MUC21 (mucin 21, cell surface associated), three SNPs from MUC22 (mucin 22), two SNPs from PSORS1C1 (psoriasis susceptibility 1 candidate 1) and one SNP from TCF19 (transcription factor 19). A four-locus interaction between these four genes is also detected. The second SNP set consists of three SNPs from ATAD1 (ATPase family, AAA domain containing 1). 
Overall, the findings indicate the detection of pure epistasis in the presence of genetic heterogeneity and provide an alternative explanation for the aetiology of T1D in the UK population.
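To make the idea of a permutation test over an ensemble of two-locus analyses more concrete, the following is a simplified sketch. It computes a Pearson chi-square statistic for every SNP pair, takes the largest statistic as the omnibus score, and estimates a global p-value by shuffling case/control labels. This is a generic max-statistic permutation test written for illustration, not the 2LOmb algorithm itself; the function names, the default of 200 permutations, and the fixed seed are all assumptions.

```python
import random
from collections import Counter
from itertools import combinations


def two_locus_chi2(snp_a, snp_b, labels):
    """Pearson chi-square for joint genotype cell versus case/control label."""
    n = len(labels)
    cell_counts = Counter(zip(snp_a, snp_b))
    label_counts = Counter(labels)
    observed = Counter(zip(snp_a, snp_b, labels))
    stat = 0.0
    for cell, n_cell in cell_counts.items():
        for lab, n_lab in label_counts.items():
            expected = n_cell * n_lab / n
            stat += (observed[cell + (lab,)] - expected) ** 2 / expected
    return stat


def omnibus_permutation_p(genotypes, labels, n_perm=200, seed=0):
    """Global p-value for the best-scoring pair among all two-locus tests.

    genotypes: list of per-SNP genotype lists (same sample order as labels).
    """
    pairs = list(combinations(range(len(genotypes)), 2))

    def best_statistic(labs):
        return max(
            two_locus_chi2(genotypes[i], genotypes[j], labs) for i, j in pairs
        )

    observed = best_statistic(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        if best_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Taking the maximum over all pairs inside each permutation accounts for the number of two-locus tests performed, which is why a single global p-value can be reported. The cost grows with the number of pairs times the number of permutations, so a real genome-wide application would need the kind of SNP screening the abstract describes.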
Affiliation(s)
- Damrongrit Setsirichok
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand
- Phuwadej Tienboon
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand
- Nattapong Jaroonruang
- Department of Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology Thonburi, 126 Pracha-utid Road, Bangmod, Toongkru, Bangkok 10140, Thailand
- Somkit Kittichaijaroen
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand
- Waranyu Wongseree
- Division of Technology of Information System Management, Faculty of Engineering, Mahidol University, 25/25 Phuttamonthon 4 Road, Nakhon Pathom 73170, Salaya, Thailand
- Theera Piroonratana
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand
- Touchpong Usavanarong
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand
- Chanin Limwongse
- Division of Molecular Genetics, Department of Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, 2 Prannok Road, Bangkok 10700, Bangkoknoi, Thailand
- Chatchawit Aporntewan
- Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, 254 Phayathai Road, Pathumwan, Bangkok 10330, Thailand
- Marong Phadoongsidhi
- Department of Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology Thonburi, 126 Pracha-utid Road, Bangmod, Toongkru, Bangkok 10140, Thailand
- Nachol Chaiyaratana
- Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, 1518 Pracharat Sai 1 Road, Bangsue, Bangkok 10800, Thailand; Division of Molecular Genetics, Department of Research and Development, Faculty of Medicine Siriraj Hospital, Mahidol University, 2 Prannok Road, Bangkok 10700, Bangkoknoi, Thailand
|
40
|
Gundert-Remy U, Dimovski A, Gajović S. Personalized medicine - where do we stand? Pouring some water into wine: a realistic perspective. Croat Med J 2012; 53:314-20. [PMID: 22911523 PMCID: PMC3428819 DOI: 10.3325/cmj.2012.53.314] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 01/25/2023]
Abstract
Reviewing the past and present status of personalized medicine, we critically compare the hope and promise of several years ago with what has actually been achieved in tailoring drug treatment to the individual patient. The basis for consideration is what we know about the variant of the disease the patient is suffering from, and about the mechanisms influencing the plasma concentration-time profile, such as the activity of metabolizing enzymes and transporters. In cancer treatment, drugs are currently selected according to molecular properties of the cancer tissue, e.g., expression of receptors such as the HER2 receptor. Diagnostic tests are currently available that detect somatic cell mutations, which can be used to guide drug selection. Unfortunately, tumor heterogeneity and resistance developing through further mutations may limit the success of therapy determined by molecular diagnostics. The present status can be described as follows: in drug kinetics, we know the influencing factors and understand the mechanisms. However, only in a few cases is the genetic background the main determinant of kinetic variability; environmental and other factors play an additional important role. Therefore, much more has to be done before we can translate the accumulating knowledge into a benefit for the patient. Only then can we speak of personalized medicine.
Affiliation(s)
- Ursula Gundert-Remy
- Charite Universitätsmedizin Berlin, Institute for Clinical Pharmacology and Toxicology, Berlin, Germany.
|
41
|
Louie RJ, Guo J, Rodgers JW, White R, Shah N, Pagant S, Kim P, Livstone M, Dolinski K, McKinney BA, Hong J, Sorscher EJ, Bryan J, Miller EA, Hartman JL. A yeast phenomic model for the gene interaction network modulating CFTR-ΔF508 protein biogenesis. Genome Med 2012; 4:103. [PMID: 23270647 PMCID: PMC3906889 DOI: 10.1186/gm404] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.2] [Received: 08/08/2012] [Accepted: 12/27/2012] [Indexed: 01/20/2023]
Abstract
Background The overall influence of gene interaction in human disease is unknown. In cystic fibrosis (CF) a single allele of the cystic fibrosis transmembrane conductance regulator (CFTR-ΔF508) accounts for most of the disease. In cell models, CFTR-ΔF508 exhibits defective protein biogenesis and degradation rather than proper trafficking to the plasma membrane where CFTR normally functions. Numerous genes function in the biogenesis of CFTR and influence the fate of CFTR-ΔF508. However it is not known whether genetic variation in such genes contributes to disease severity in patients. Nor is there an easy way to study how numerous gene interactions involving CFTR-ΔF would manifest phenotypically. Methods To gain insight into the function and evolutionary conservation of a gene interaction network that regulates biogenesis of a misfolded ABC transporter, we employed yeast genetics to develop a 'phenomic' model, in which the CFTR-ΔF508-equivalent residue of a yeast homolog is mutated (Yor1-ΔF670), and where the genome is scanned quantitatively for interaction. We first confirmed that Yor1-ΔF undergoes protein misfolding and has reduced half-life, analogous to CFTR-ΔF. Gene interaction was then assessed quantitatively by growth curves for approximately 5,000 double mutants, based on alteration in the dose response to growth inhibition by oligomycin, a toxin extruded from the cell at the plasma membrane by Yor1. Results From a comparative genomic perspective, yeast gene interactions influencing Yor1-ΔF biogenesis were representative of human homologs previously found to modulate processing of CFTR-ΔF in mammalian cells. Additional evolutionarily conserved pathways were implicated by the study, and a ΔF-specific pro-biogenesis function of the recently discovered ER membrane complex (EMC) was evident from the yeast screen. This novel function was validated biochemically by siRNA of an EMC ortholog in a human cell line expressing CFTR-ΔF508. 
The precision and accuracy of quantitative high throughput cell array phenotyping (Q-HTCP), which captures tens of thousands of growth curves simultaneously, provided powerful resolution to measure gene interaction on a phenomic scale, based on discrete cell proliferation parameters. Conclusion We propose phenomic analysis of Yor1-ΔF as a model for investigating gene interaction networks that can modulate cystic fibrosis disease severity. Although the clinical relevance of the Yor1-ΔF gene interaction network for cystic fibrosis remains to be defined, the model appears to be informative with respect to human cell models of CFTR-ΔF. Moreover, the general strategy of yeast phenomics can be employed in a systematic manner to model gene interaction for other diseases relating to pathologies that result from protein misfolding or potentially any disease involving evolutionarily conserved genetic pathways.
|
42
|
Liu Y, Maxwell S, Feng T, Zhu X, Elston RC, Koyutürk M, Chance MR. Gene, pathway and network frameworks to identify epistatic interactions of single nucleotide polymorphisms derived from GWAS data. BMC SYSTEMS BIOLOGY 2012; 6 Suppl 3:S15. [PMID: 23281810 PMCID: PMC3524014 DOI: 10.1186/1752-0509-6-s3-s15] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/26/2022]
Abstract
Background Interactions among genomic loci (also known as epistasis) have been suggested as one of the potential sources of missing heritability in single locus analysis of genome-wide association studies (GWAS). The computational burden of searching for interactions is compounded by the extremely low threshold for identifying significant p-values due to multiple hypothesis testing corrections. Utilizing prior biological knowledge to restrict the set of candidate SNP pairs to be tested can alleviate this problem, but systematic studies that investigate the relative merits of integrating different biological frameworks and GWAS data have not been conducted. Results We developed four biologically based frameworks to identify pairwise interactions among candidate SNP pairs as follows: (1) for each human protein-coding gene, a set of SNPs associated with that gene was constructed providing a gene-based interaction model, (2) for each known biological pathway, a set of SNPs associated with the genes in the pathway was constructed providing a pathway-based interaction model, (3) a set of SNPs associated with genes in a disease-related subnetwork provides a network-based interaction model, and (4) a framework is based on the function of SNPs. The last approach uses expression SNPs (eSNPs or eQTLs), which are SNPs or loci that have defined effects on the abundance of transcripts of other genes. We constructed pairs of eSNPs and SNPs located in the target genes whose expression is regulated by eSNPs. For all four frameworks the SNP sets were exhaustively tested for pairwise interactions within the sets using a traditional logistic regression model after excluding genes that were previously identified to associate with the trait. Using previously published GWAS data for type 2 diabetes (T2D) and the biologically based pair-wise interaction modeling, we identify twelve genes not seen in the previous single locus analysis. 
Conclusion We present four approaches to detect interactions associated with complex diseases. The results show our approaches outperform the traditional single locus approaches in detecting genes that previously did not reach significance; the results also provide novel drug targets and biomarkers relevant to the underlying mechanisms of disease.
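The trade-off described in this abstract, between an exhaustive pairwise search and a search restricted by prior biological knowledge, comes down to how many tests are run and how strict the corrected threshold becomes. The sketch below counts pairs under both strategies and computes a Bonferroni threshold for each. The gene names, SNP identifiers, the figure of 500,000 SNPs, and the default alpha of 0.05 are hypothetical values chosen for illustration and are not taken from the study.

```python
from math import comb


def n_pairs(n_snps):
    """Number of unordered SNP pairs in an exhaustive search."""
    return comb(n_snps, 2)


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def within_set_pairs(snp_sets):
    """Unique unordered SNP pairs whose members share at least one set.

    snp_sets: dict mapping a set name (gene, pathway, subnetwork) to SNP IDs.
    """
    pairs = set()
    for snps in snp_sets.values():
        ordered = sorted(set(snps))
        for i, first in enumerate(ordered):
            for second in ordered[i + 1:]:
                pairs.add((first, second))
    return pairs


if __name__ == "__main__":
    # Hypothetical gene-based sets; rs3 is annotated to both genes.
    toy_sets = {"GENE1": ["rs1", "rs2", "rs3"], "GENE2": ["rs3", "rs4"]}
    restricted = within_set_pairs(toy_sets)
    print(len(restricted), bonferroni_threshold(len(restricted)))

    # Exhaustive search over a hypothetical genome-wide panel.
    exhaustive = n_pairs(500_000)
    print(exhaustive, bonferroni_threshold(exhaustive))
```

Restricting the search relaxes the threshold by many orders of magnitude, but it can only find interactions between SNPs that the chosen framework already groups together.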
Affiliation(s)
- Yu Liu
- Center for Proteomics and Bioinformatics, Case Western Reserve University, Cleveland, OH, USA
|
43
|
Systems genetics in "-omics" era: current and future development. Theory Biosci 2012; 132:1-16. [PMID: 23138757 DOI: 10.1007/s12064-012-0168-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 03/26/2012] [Accepted: 10/25/2012] [Indexed: 02/06/2023]
Abstract
Systems genetics is an emerging discipline that integrates high-throughput expression profiling technology and systems biology approaches to reveal the molecular mechanisms of complex traits, and it will improve our understanding of gene functions in biochemical pathways and of genetic interactions between biological molecules. With the rapid advances in microarray analysis technologies, bioinformatics is extensively used in studies of gene functions, SNP-SNP genetic interactions, LD block-block interactions, miRNA-mRNA interactions, DNA-protein interactions, protein-protein interactions, and functional mapping for LD blocks. Built on a bioinformatics panel that can integrate "-omics" datasets to extract systems knowledge and useful information for explaining the molecular mechanisms of complex traits, systems genetics aims to enhance our understanding of biological processes. Systems biology has provided systems-level recognition of various biological phenomena and laid the scientific groundwork for the development of systems genetics. In addition, next-generation sequencing technology and post-genome-wide association studies empower the discovery of new genes and rare variants. The integration of different strategies will help to propose novel hypotheses and refine the theoretical framework of systems genetics, which will contribute to its future development and open up a whole new area of genetics.
|
44
|
Rose AM, Bell LCK. Epistasis and immunity: the role of genetic interactions in autoimmune diseases. Immunology 2012; 137:131-8. [PMID: 22804709 DOI: 10.1111/j.1365-2567.2012.03623.x] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
Autoimmune disorders are a complex and varied group of diseases that are caused by breakdown of self-tolerance. The aetiology of autoimmunity is multi-factorial, with both environmental triggers and genetically determined risk factors. In recent years, it has been increasingly recognized that genetic risk factors do not act in isolation, but rather the combination of individual additive effects, gene-gene interactions and gene-environment interactions determine overall risk of autoimmunity. The importance of gene-gene interactions, or epistasis, has been recently brought into focus, with research demonstrating that many autoimmune diseases, including rheumatic arthritis, autoimmune glomerulonephritis, systemic lupus erythematosus and multiple sclerosis, are influenced by epistatic interactions. This review sets out to examine the basic mechanisms of epistasis, how epistasis influences the immune system and the role of epistasis in two major autoimmune conditions, systemic lupus erythematosus and multiple sclerosis.
Affiliation(s)
- Anna M Rose
- Department of Genetics, UCL Institute of Ophthalmology, London, UK.
|
45
|
Carter GW, Hays M, Sherman A, Galitski T. Use of pleiotropy to model genetic interactions in a population. PLoS Genet 2012; 8:e1003010. [PMID: 23071457 PMCID: PMC3469415 DOI: 10.1371/journal.pgen.1003010] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 02/15/2012] [Accepted: 08/19/2012] [Indexed: 12/01/2022]
Abstract
Systems-level genetic studies in humans and model systems increasingly involve both high-resolution genotyping and multi-dimensional quantitative phenotyping. We present a novel method to infer and interpret genetic interactions that exploits the complementary information in multiple phenotypes. We applied this approach to a population of yeast strains with randomly assorted perturbations of five genes involved in mating. We quantified pheromone response at the molecular level and overall mating efficiency. These phenotypes were jointly analyzed to derive a network of genetic interactions that mapped mating-pathway relationships. To determine the distinct biological processes driving the phenotypic complementarity, we analyzed patterns of gene expression to find that the pheromone response phenotype is specific to cellular fusion, whereas mating efficiency was a combined measure of cellular fusion, cell cycle arrest, and modifications in cellular metabolism. We applied our novel method to global gene expression patterns to derive an expression-specific interaction network and demonstrate applicability to global transcript data. Our approach provides a basis for interpretation of genetic interactions and the generation of specific hypotheses from populations assayed for multiple phenotypes. Parallel advances in genotype and phenotype measurement technologies are yielding large-scale, multidimensional datasets that can potentially decipher the genetic etiology of complex traits. Understanding these data will require methods that combine the experimental power of molecular biology and the quantitative power of statistical genetics. In this work, we describe a novel approach that uses the complementary information encoded by multiple phenotypes in conjunction with genetic data to map genetic interaction networks in terms of quantitative variant-to-variant and variant-to-phenotype influences. 
We tested this method using a population of yeast strains with random combinations of five genetic mutations and derived an interaction network using molecular and colony-level assays of mating phenotypes. Distinct biological processes that underlie the two phenotypes were identified with gene expression analysis, validating the method's ability to exploit complementary biological information in multiple phenotypes. Our method generates data-driven models and testable hypotheses of how the genetic variation in a population combines to affect complex traits. It is designed to be flexible and scalable for application to populations with extensive genetic diversity.
|
46
|
Holzinger ER, Ritchie MD. Integrating heterogeneous high-throughput data for meta-dimensional pharmacogenomics and disease-related studies. Pharmacogenomics 2012; 13:213-22. [PMID: 22256870 DOI: 10.2217/pgs.11.145] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 01/21/2023]
Abstract
The current paradigm of human genetics research is to analyze variation of a single data type (i.e., DNA sequence or RNA levels) to detect genes and pathways that underlie complex traits such as disease state or drug response. While these studies have detected thousands of variations that associate with hundreds of complex phenotypes, much of the estimated heritability, or trait variability due to genetic factors, remains unexplained. We may be able to account for a portion of the missing heritability if we incorporate a systems biology approach into these analyses. Rapid technological advances will make it possible for scientists to explore this hypothesis via the generation of high-throughput omics data: transcriptomic, proteomic and methylomic, to name a few. Analyzing these 'meta-dimensional' data will require clever statistical techniques that allow for the integration of qualitative and quantitative predictor variables. For this article, we examine two major categories of approaches for integrated data analysis, give examples of their use in experimental and in silico datasets, and assess the limitations of each method.
Affiliation(s)
- Emily R Holzinger
- Center for Human Genetics Research, Vanderbilt University, Department of Molecular Physiology & Biophysics, Nashville, TN, USA
|
47
|
Bebek G, Koyutürk M, Price ND, Chance MR. Network biology methods integrating biological data for translational science. Brief Bioinform 2012; 13:446-59. [PMID: 22390873 PMCID: PMC3404396 DOI: 10.1093/bib/bbr075] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 09/10/2011] [Revised: 11/29/2011] [Indexed: 12/29/2022]
Abstract
The explosion of biomedical data, genomic and proteomic as well as clinical, will require complex integration and analysis to provide new molecular variables to better understand the molecular basis of phenotype. Currently, much data exists in silos and is not analyzed in frameworks where all data are brought to bear in the development of biomarkers and novel functional targets. This is beginning to change. Network biology approaches, which emphasize the interactions between genes, proteins and metabolites, provide a framework for data integration such that genome, proteome, metabolome and other -omics data can be jointly analyzed to understand and predict disease phenotypes. In this review, recent advances in network biology approaches and results are identified. A common theme is the potential for network analysis to provide multiplexed and functionally connected biomarkers for analyzing the molecular basis of disease, thus changing our approaches to analyzing and modeling genome- and proteome-wide data.
|
48
|
Abstract
Evolution of RNA viruses occurs through disequilibria of collections of closely related mutant spectra or mutant clouds termed viral quasispecies. Here we review the origin of the quasispecies concept and some biological implications of quasispecies dynamics. Two main aspects are addressed: (i) mutant clouds as reservoirs of phenotypic variants for virus adaptability and (ii) the internal interactions that are established within mutant spectra that render a virus ensemble the unit of selection. The understanding of viruses as quasispecies has led to new antiviral designs, such as lethal mutagenesis, whose aim is to drive viruses toward low fitness values with limited chances of fitness recovery. The impact of quasispecies for three salient human pathogens, human immunodeficiency virus and the hepatitis B and C viruses, is reviewed, with emphasis on antiviral treatment strategies. Finally, extensions of quasispecies to nonviral systems are briefly mentioned to emphasize the broad applicability of quasispecies theory.
Affiliation(s)
- Esteban Domingo
- Centro de Biología Molecular Severo Ochoa (CSIC-UAM), C/ Nicolás Cabrera, Universidad Autónoma de Madrid, Cantoblanco, Madrid, Spain.
|
49
|
Reder NP, Tayo BO, Salako B, Ogunniyi A, Adeyemo A, Rotimi C, Cooper RS. Adrenergic alpha-1 pathway is associated with hypertension among Nigerians in a pathway-focused analysis. PLoS One 2012; 7:e37145. [PMID: 22615923 PMCID: PMC3353888 DOI: 10.1371/journal.pone.0037145] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 02/10/2012] [Accepted: 04/16/2012] [Indexed: 12/24/2022]
Abstract
Background The pathway-focused association approach offers a hypothesis-driven alternative to the agnostic genome-wide association study. Here we apply the pathway-focused approach to an association study of hypertension, systolic blood pressure (SBP), and diastolic blood pressure (DBP) in 1614 Nigerians with genome-wide data. Methods and Results Testing of 28 pathways with biological relevance to hypertension, selected a priori, containing a total of 101 unique genes and 4,349 unique single-nucleotide polymorphisms (SNPs) showed an association for the adrenergic alpha 1 (ADRA1) receptor pathway with hypertension (p<0.0009) and diastolic blood pressure (p<0.0007). Within the ADRA1 pathway, the genes PNMT (hypertension P_gene<0.004, DBP P_gene<0.004, and SBP P_gene<0.009) and ADRA1B (hypertension P_gene<0.005, DBP P_gene<0.02, and SBP P_gene<0.02) displayed the strongest associations. Neither ADRA1B nor PNMT could be the sole mediator of the observed pathway association, as the ADRA1 pathway remained significant after removing ADRA1B, and other pathways involving PNMT did not reach pathway significance. Conclusions We conclude that multiple variants in several genes in the ADRA1 pathway led to associations with hypertension and DBP. SNPs in ADRA1B and PNMT have not previously been linked to hypertension in a genome-wide association study, but both genes have shown associations with hypertension through linkage or model organism studies. The identification of moderately significant (10^-2 > p > 10^-5) SNPs offers a novel method for detecting the "missing heritability" of hypertension. These findings warrant further studies in similar and other populations to assess the generalizability of our results, and illustrate the potential of the pathway-focused approach to investigate genetic variation in hypertension.
Affiliation(s)
- Nicholas P Reder
- Department of Preventive Medicine and Epidemiology, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois, United States of America.
|
50
|
Uzun A, Laliberte A, Parker J, Andrew C, Winterrowd E, Sharma S, Istrail S, Padbury JF. dbPTB: a database for preterm birth. Database (Oxford) 2012; 2012:bar069. [PMID: 22323062 PMCID: PMC3275764 DOI: 10.1093/database/bar069] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 10/03/2011] [Revised: 12/28/2011] [Accepted: 12/29/2011] [Indexed: 02/07/2023]
Abstract
Genome-wide association studies (GWAS) query the entire genome in a hypothesis-free, unbiased manner. Since they have the potential for identifying novel genetic variants, they have become a very popular approach to the investigation of complex diseases. Nonetheless, since the success of the GWAS approach varies widely, the identification of genetic variants for complex diseases remains a difficult problem. We developed a novel bioinformatics approach to identify the nominal genetic variants associated with complex diseases. To test the feasibility of our approach, we developed a web-based aggregation tool to organize the genes, genetic variations and pathways involved in preterm birth. We used semantic data mining to extract all published articles related to preterm birth. All articles were reviewed by a team of curators. Genes identified from public databases and archives of expression arrays were aggregated with genes curated from the literature. Pathway analysis was used to impute genes from pathways identified in the curations. The curated articles and collected genetic information form a unique resource for investigators interested in preterm birth. The Database for Preterm Birth exemplifies an approach that is generalizable to other disorders for which there is evidence of significant genetic contributions.
Affiliation(s)
- Alper Uzun
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Alyse Laliberte
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Jeremy Parker
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Caroline Andrew
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Emily Winterrowd
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Surendra Sharma
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- Sorin Istrail
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
- James F. Padbury
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI 02905, USA, Center for Computational Molecular Biology, Brown University, Providence, RI 02912, USA and Brown Alpert Medical School, Providence, RI 02912, USA
|