1
Lesmana MHS, Le NQK, Chiu WC, Chung KH, Wang CY, Irham LM, Chung MH. Genomic-Analysis-Oriented Drug Repurposing in the Search for Novel Antidepressants. Biomedicines 2022; 10:1947. [PMID: 36009493 PMCID: PMC9405592 DOI: 10.3390/biomedicines10081947] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/20/2022] [Revised: 08/07/2022] [Accepted: 08/08/2022] [Indexed: 12/02/2022] Open
Abstract
Because earlier antidepressants targeting monoamine neurotransmitter systems proved inadequate, alternative drugs for depression have been sought, for instance drugs that target the interleukin 6 receptor (IL6R) in the inflammatory system. Genomic-analysis-based drug repurposing using single nucleotide polymorphisms (SNPs) has emerged as a promising method for several diseases, but it has not yet been applied to depression. Thus, we aimed to identify drug repurposing candidates for depression treatment by adopting a genomic-analysis-based approach. The 5885 SNPs obtained from the machine learning approach were annotated using HaploReg v4.1. Five sets of functional annotations were applied to determine the depression risk genes. The STRING database was used to expand the target genes and identify drug candidates from the DrugBank database. We validated the findings using the ClinicalTrials.gov and PubMed databases. Seven genes were observed to be strongly associated with depression (functional annotation score = 4). Notably, IL6R emerged as a promising target gene according to the validation outcome. We identified 20 drugs that were undergoing preclinical studies or clinical trials for depression. In addition, we identified sarilumab and satralizumab as drugs that exhibit strong potential for use in the treatment of depression. Our findings indicate that a genomic-analysis-based approach can facilitate the discovery of drugs that can be repurposed for treating depression.
Affiliation(s)
| | - Nguyen Quoc Khanh Le
- Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei 11031, Taiwan
- Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei 11031, Taiwan
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei 11031, Taiwan
| | - Wei-Che Chiu
- Department of Psychiatry, Cathay General Hospital, Taipei 10630, Taiwan
- School of Medicine, Fu Jen Catholic University, New Taipei City 242062, Taiwan
| | - Kuo-Hsuan Chung
- Department of Psychiatry, School of Medicine, College of Medicine, Taipei Medical University, Taipei 11031, Taiwan
- Department of Psychiatry and Psychiatric Research Center, Taipei Medical University Hospital, Taipei Medical University, Taipei 11031, Taiwan
| | - Chih-Yang Wang
- Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei 11031, Taiwan
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei 11031, Taiwan
| | - Lalu Muhammad Irham
- Faculty of Pharmacy, University of Ahmad Dahlan, Yogyakarta 55164, Indonesia
- Correspondence: (L.M.I.); (M.-H.C.); Tel.: +62-851-322-55-414 (L.M.I.); +886-02-2736-1661 (M.-H.C.)
| | - Min-Huey Chung
- School of Nursing, College of Nursing, Taipei Medical University, Taipei 11031, Taiwan
- Department of Nursing, Shuang Ho Hospital, Taipei Medical University, New Taipei City 23561, Taiwan
- Correspondence: (L.M.I.); (M.-H.C.); Tel.: +62-851-322-55-414 (L.M.I.); +886-02-2736-1661 (M.-H.C.)
2
Jeng XJ, Rhyne J, Zhang T, Tzeng JY. Effective SNP ranking improves the performance of eQTL mapping. Genet Epidemiol 2020; 44:611-619. [PMID: 32216117 DOI: 10.1002/gepi.22293] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/28/2019] [Revised: 02/21/2020] [Accepted: 03/11/2020] [Indexed: 11/06/2022]
Abstract
Genome-wide expression quantitative trait loci (eQTLs) mapping explores the relationship between gene expression and DNA variants, such as single-nucleotide polymorphisms (SNPs), to understand the genetic basis of human diseases. Due to the large number of genes and SNPs that need to be assessed, current methods for eQTL mapping often suffer from low detection power, especially for identifying trans-eQTLs. In this paper, we propose the idea of performing SNP ranking based on the higher criticism statistic, a summary statistic developed in large-scale signal detection. We illustrate how the HC-based SNP ranking can effectively prioritize eQTL signals over noise, greatly reduce the burden of joint modeling, and improve the power for eQTL mapping. Numerical results in simulation studies demonstrate the superior performance of our method compared to existing methods. The proposed method is also evaluated in HapMap eQTL data analysis and the results are compared to a database of known eQTLs.
Affiliation(s)
- X Jessie Jeng
- Department of Statistics, North Carolina State University, Raleigh, North Carolina
| | - Jacob Rhyne
- Department of Statistics, North Carolina State University, Raleigh, North Carolina
| | - Teng Zhang
- Department of Statistics, North Carolina State University, Raleigh, North Carolina
| | - Jung-Ying Tzeng
- Department of Statistics, North Carolina State University, Raleigh, North Carolina.,Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina.,Department of Statistics, National Cheng-Kung University, Tainan, Taiwan.,Division of Biostatistics, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
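The higher criticism (HC) ranking mentioned in the abstract above can be illustrated with a minimal sketch. It uses the standard Donoho-Jin form of the HC statistic; the assumption that each SNP is scored from its p-values across genes, the function names, and the choice to search only the smaller half of the p-values are illustrative and are not taken from the paper.

```python
import math


def higher_criticism(p_values):
    """Donoho-Jin higher criticism statistic for a list of p-values."""
    n = len(p_values)
    ordered = sorted(p_values)
    best = float("-inf")
    # Search the smaller half of the sorted p-values (a common convention,
    # assumed here rather than taken from the paper).
    for i, p in enumerate(ordered[: max(1, n // 2)], start=1):
        if p <= 0.0 or p >= 1.0:
            continue  # skip values that would make the denominator zero
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


def rank_snps(pvals_by_snp):
    """Order SNP identifiers by descending HC score (hypothetical helper)."""
    scores = {snp: higher_criticism(p) for snp, p in pvals_by_snp.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A SNP with a few very small p-values among many unremarkable ones scores higher than a SNP whose p-values look uniform, which is the sense in which such a ranking puts sparse signals ahead of noise.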
3
Abstract
Expression quantitative trait locus (eQTL) analysis has proven to be a powerful method to describe how variation in phenotypes may be attributed to a given genotype. While the field of bioinformatics and genomics has experienced exponential growth with modern technological advances, an unintended consequence arises as a lack of a gold standard for many applications and methods, which may be compounded with ever-improving computational capabilities. Researchers working on eQTL analysis have at their disposal a multitude of bioinformatics software, each with different assumptions and algorithms, which may produce confusion as to their respective applicability. In this chapter, we will introduce eQTLs, survey commonly used software to conduct a mapping study, as well as provide data correction methods to avoid the pitfalls of such analyses.
Affiliation(s)
- Conor Nodzak
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA.
4
Zeng Y. Cloning and Analysis of the Multiple Transcriptomes of Serine Protease Homologs in Crayfish (Procambarus clarkii). Immunol Invest 2019; 48:682-690. [DOI: 10.1080/08820139.2018.1509870] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
Affiliation(s)
- Yong Zeng
- College of Life Sciences, Yantai University, Yantai, Shandong, PR China
5
Ferguson LB, Zhang L, Kircher D, Wang S, Mayfield RD, Crabbe JC, Morrisett RA, Harris RA, Ponomarev I. Dissecting Brain Networks Underlying Alcohol Binge Drinking Using a Systems Genomics Approach. Mol Neurobiol 2018; 56:2791-2810. [PMID: 30062672 PMCID: PMC6459809 DOI: 10.1007/s12035-018-1252-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 04/20/2018] [Accepted: 07/17/2018] [Indexed: 12/22/2022]
Abstract
Alcohol use disorder (AUD) is a complex psychiatric disorder with strong genetic and environmental risk factors. We studied the molecular perturbations underlying risky drinking behavior by measuring transcriptome changes across the neurocircuitry of addiction in a genetic mouse model of binge drinking. Sixteen generations of selective breeding for high blood alcohol levels after a binge drinking session produced global changes in brain gene expression in alcohol-naïve High Drinking in the Dark (HDID-1) mice. Using gene expression profiles to generate circuit-level hypotheses, we developed a systems approach that integrated regulation of gene coexpression networks across multiple brain regions, neuron-specific transcriptional signatures, and knowledgebase analytics. Whole-cell, voltage-clamp recordings from nucleus accumbens shell neurons projecting to the ventral tegmental area showed differential ethanol-induced plasticity in HDID-1 and control mice and provided support for one of the hypotheses. There were similarities in gene networks between HDID-1 mouse brains and postmortem brains of human alcoholics, suggesting that some gene expression patterns associated with high alcohol consumption are conserved across species. This study demonstrated the value of gene networks for data integration across biological modalities and species to study mechanisms of disease.
Affiliation(s)
- Laura B Ferguson
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA.,The Institute for Neuroscience, The University of Texas at Austin, Austin, TX, USA
| | - Lingling Zhang
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA
| | - Daniel Kircher
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA
| | - Shi Wang
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA
| | - R Dayne Mayfield
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA
| | - John C Crabbe
- Portland Alcohol Research Center, VA Portland Health Care System, Portland, OR, USA.,Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR, USA
| | - Richard A Morrisett
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA
| | - R Adron Harris
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA
| | - Igor Ponomarev
- The Waggoner Center for Alcohol and Addiction Research, The University of Texas at Austin, Austin, TX, USA.
6
Serin EAR, Snoek LB, Nijveen H, Willems LAJ, Jiménez-Gómez JM, Hilhorst HWM, Ligterink W. Construction of a High-Density Genetic Map from RNA-Seq Data for an Arabidopsis Bay-0 × Shahdara RIL Population. Front Genet 2017; 8:201. [PMID: 29259624 PMCID: PMC5723289 DOI: 10.3389/fgene.2017.00201] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/02/2017] [Accepted: 11/21/2017] [Indexed: 12/17/2022] Open
Abstract
High-density genetic maps are essential for high resolution mapping of quantitative traits. Here, we present a new genetic map for an Arabidopsis Bayreuth × Shahdara recombinant inbred line (RIL) population, built on RNA-seq data. RNA-seq analysis on 160 RILs of this population identified 30,049 single-nucleotide polymorphisms (SNPs) covering the whole genome. Based on a 100-kbp window SNP binning method, 1059 bin-markers were identified, physically anchored on the genome. The total length of the RNA-seq genetic map spans 471.70 centimorgans (cM) with an average marker distance of 0.45 cM and a maximum marker distance of 4.81 cM. This high resolution genotyping revealed new recombination breakpoints in the population. To highlight the advantages of such a high-density map, we compared it to two publicly available genetic maps for the same population, comprising 69 PCR-based markers and 497 gene expression markers derived from microarray data, respectively. In this study, we show that SNP markers can effectively be derived from RNA-seq data. The new RNA-seq map closes many existing gaps in marker coverage, saturating the previously available genetic maps. Quantitative trait locus (QTL) analysis for published phenotypes using the available genetic maps showed increased QTL mapping resolution and reduced QTL confidence interval using the RNA-seq map. The new high-density map is a valuable resource that facilitates the identification of candidate genes and map-based cloning approaches.
Affiliation(s)
- Elise A R Serin
- Wageningen Seed Lab, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands
| | - L B Snoek
- Laboratory of Nematology, Wageningen University, Wageningen, Netherlands.,Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, Netherlands
| | - Harm Nijveen
- Wageningen Seed Lab, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands.,Laboratory of Bioinformatics, Wageningen University, Wageningen, Netherlands
| | - Leo A J Willems
- Wageningen Seed Lab, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands
| | - Jose M Jiménez-Gómez
- Department of Plant Breeding and Genetics, Max Planck Institute for Plant Breeding Research, Cologne, Germany.,Institut Jean-Pierre Bourgin, Institut National de la Recherche Agronomique, AgroParisTech, Centre National de la Recherche Scientifique, Université Paris-Saclay, Versailles Cedex, France
| | - Henk W M Hilhorst
- Wageningen Seed Lab, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands
| | - Wilco Ligterink
- Wageningen Seed Lab, Laboratory of Plant Physiology, Wageningen University, Wageningen, Netherlands
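The 100-kbp window binning described in the abstract above can be pictured with a short sketch that groups SNP positions into fixed-width physical windows. The abstract does not say how genotypes within a window were combined into a bin-marker, so that step is left out; the function name and the example coordinates are hypothetical. The last line checks that the reported map length divided by the reported marker count is close to the reported average marker distance.

```python
from collections import defaultdict

WINDOW_BP = 100_000  # 100-kbp window, as stated in the abstract


def bin_snps(snps, window_bp=WINDOW_BP):
    """Group (chromosome, position) SNPs into fixed-width physical windows."""
    bins = defaultdict(list)
    for chrom, pos in snps:
        # Integer division assigns each position to its window index.
        bins[(chrom, pos // window_bp)].append(pos)
    return dict(bins)


# Hypothetical SNP coordinates, for illustration only.
snps = [("chr1", 12_500), ("chr1", 99_999), ("chr1", 100_000), ("chr2", 250_001)]
bins = bin_snps(snps)

# Map length over marker count, using the figures quoted in the abstract.
avg_spacing_cm = 471.70 / 1059
```

With these inputs the first two SNPs share a window, while the third falls into the next window on the same chromosome.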
7
Xu CJ, Bonder MJ, Söderhäll C, Bustamante M, Baïz N, Gehring U, Jankipersadsing SA, van der Vlies P, van Diemen CC, van Rijkom B, Just J, Kull I, Kere J, Antó JM, Bousquet J, Zhernakova A, Wijmenga C, Annesi-Maesano I, Sunyer J, Melén E, Li Y, Postma DS, Koppelman GH. The emerging landscape of dynamic DNA methylation in early childhood. BMC Genomics 2017; 18:25. [PMID: 28056824 PMCID: PMC5217260 DOI: 10.1186/s12864-016-3452-1] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 07/20/2016] [Accepted: 12/21/2016] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND DNA methylation has been found to associate with disease, aging and environmental exposure, but it is unknown how genome, environment and disease influence DNA methylation dynamics in childhood. RESULTS By analysing 538 paired DNA blood samples from children at birth and at 4-5 years old and 726 paired samples from children at 4 and 8 years old from four European birth cohorts using the Illumina Infinium Human Methylation 450 k chip, we have identified 14,150 consistent age-differential methylation sites (a-DMSs) at epigenome-wide significance of p < 1.14 × 10⁻⁷. Genes with an increase in age-differential methylation were enriched in pathways related to 'development', and were more often located in bivalent transcription start site (TSS) regions, which can silence or activate expression of developmental genes. Genes with a decrease in age-differential methylation were involved in cell signalling, and enriched on H3K27ac, which can predict developmental state. Maternal smoking tended to decrease methylation levels at the identified da-DMSs. We also found 101 a-DMSs (0.71%) that were regulated by genetic variants using cis-differential Methylation Quantitative Trait Locus (cis-dMeQTL) mapping. Moreover, a-DMS-associated genes during early development were significantly more likely to be linked with disease. CONCLUSION Our study provides new insights into the dynamic epigenetic landscape of the first 8 years of life.
Affiliation(s)
- Cheng-Jian Xu
- Department of Pulmonology, GRIAC Research Institute, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands. .,Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.
| | - Marc Jan Bonder
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Cilla Söderhäll
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden.,Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
| | - Mariona Bustamante
- ISGlobal, Centre for Research in Environmental Epidemiology, Barcelona, Spain.,Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain.,Universitat Pompeu Fabra (UPF), Barcelona, Spain.,CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
| | - Nour Baïz
- Epidemiology of Allergic and Respiratory Diseases Department (EPAR), Sorbonne Université, UPMC Univ Paris 06, INSERM, Pierre Louis Institute of Epidemiology and Public Health, Saint-Antoine Medical School, Paris, France
| | - Ulrike Gehring
- Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands
| | - Soesma A Jankipersadsing
- Department of Pulmonology, GRIAC Research Institute, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.,Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Pieter van der Vlies
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Cleo C van Diemen
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Bianca van Rijkom
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Jocelyne Just
- Epidemiology of Allergic and Respiratory Diseases Department (EPAR), Sorbonne Université, UPMC Univ Paris 06, INSERM, Pierre Louis Institute of Epidemiology and Public Health, Saint-Antoine Medical School, Paris, France.,Department of Allergology-Centre de l'Asthme et des Allergies, Hôpital d'Enfants Armand Trousseau, Assistance Publique-Hôpitaux de Paris, Paris, France
| | - Inger Kull
- Department of Clinical Science and Education, Stockholm South General Hospital, Karolinska Institutet, and Sachs' Children's Hospital, SE-118 83, Stockholm, Sweden
| | - Juha Kere
- Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden.,Folkhälsan Institute of Genetics and Research Programs Unit, University of Helsinki, Helsinki, Finland
| | - Josep Maria Antó
- ISGlobal, Centre for Research in Environmental Epidemiology, Barcelona, Spain.,Universitat Pompeu Fabra (UPF), Barcelona, Spain.,CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain.,IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
| | - Jean Bousquet
- University Hospital, Montpellier, France.,MACVIA-France, Contre les Maladies Chroniques pour un VIeillissement Actif en France, European Innovation Partnership on Active and Healthy Ageing Reference Site, Paris, France.,INSERM, VIMA: Ageing and chronic diseases. Epidemiological and public health approaches, U1168, Paris, France.,UVSQ, UMR-S 1168, Université Versailles St-Quentin-en-Yvelines, Versailles, France
| | - Alexandra Zhernakova
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Cisca Wijmenga
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Isabella Annesi-Maesano
- Epidemiology of Allergic and Respiratory Diseases Department (EPAR), Sorbonne Université, UPMC Univ Paris 06, INSERM, Pierre Louis Institute of Epidemiology and Public Health, Saint-Antoine Medical School, Paris, France
| | - Jordi Sunyer
- ISGlobal, Centre for Research in Environmental Epidemiology, Barcelona, Spain.,Universitat Pompeu Fabra (UPF), Barcelona, Spain.,CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain.,IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
| | - Erik Melén
- Department of Paediatric Pulmonology and Paediatric Allergy, Beatrix Children's Hospital, GRIAC Research Institute, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Yang Li
- Department of Genetics, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands.
| | - Dirkje S Postma
- Department of Pulmonology, GRIAC Research Institute, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
| | - Gerard H Koppelman
- Department of Paediatric Pulmonology and Paediatric Allergy, University of Groningen, University Medical Center Groningen, Beatrix Children's Hospital, GRIAC Research Institute, Groningen, The Netherlands
8
Cogni R, Kuczynski K, Lavington E, Koury S, Behrman EL, O'Brien KR, Schmidt PS, Eanes WF. Variation in Drosophila melanogaster central metabolic genes appears driven by natural selection both within and between populations. Proc Biol Sci 2016; 282:20142688. [PMID: 25520361 DOI: 10.1098/rspb.2014.2688] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 01/01/2023] Open
Abstract
In this report, we examine the hypothesis that the drivers of latitudinal selection observed in the eastern US Drosophila melanogaster populations are reiterated within seasons in a temperate orchard population in Pennsylvania, USA. Specifically, we ask whether alleles that are apparently favoured in northern populations are also favoured early in the spring, and decrease in frequency from the spring to autumn with the population expansion. We use SNP data collected for 46 metabolic genes and 128 SNPs representing the central metabolic pathway and examine for the aggregate SNP allele frequencies whether the association of allele change with latitude and that with increasing days of spring-autumn season are reversed. Testing by random permutation, we observe a highly significant negative correlation between these associations that is consistent with this expectation. This correlation is stronger when we confine our analysis to only those alleles that show significant latitudinal changes. This pattern is not caused by association with chromosomal inversions. When data are resampled using only SNPs for amino acid change, the relationship is not significant, but it is supported when only SNPs associated with cis-expression are considered. Our results suggest that climate factors driving latitudinal molecular variation in a metabolic pathway are related to those operating on a seasonal level within populations.
Affiliation(s)
- Rodrigo Cogni
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794, USA
| | - Kate Kuczynski
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794, USA
| | - Erik Lavington
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794, USA
| | - Spencer Koury
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794, USA
| | - Emily L Behrman
- Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
| | - Paul S Schmidt
- Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
| | - Walter F Eanes
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794, USA
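The random-permutation test of a negative correlation mentioned in the abstract above can be sketched generically as follows. This is an illustration of the general technique only: the exact permutation scheme, the one-sided alternative, the number of permutations, and the function names are assumptions, not details taken from the paper.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """One-sided p-value for a negative correlation, obtained by shuffling y."""
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count shuffles at least as negative as the observed correlation.
        if pearson(x, shuffled) <= observed:
            hits += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Here `x` would hold one per-SNP association (for example with latitude) and `y` the other (for example with day of season); shuffling `y` breaks any pairing between the two while preserving each marginal distribution.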
9
PExFInS: An Integrative Post-GWAS Explorer for Functional Indels and SNPs. Sci Rep 2015; 5:17302. [PMID: 26612672 PMCID: PMC4661514 DOI: 10.1038/srep17302] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/16/2015] [Accepted: 10/28/2015] [Indexed: 12/22/2022] Open
Abstract
Expression quantitative trait loci (eQTLs) mapping and linkage disequilibrium (LD) analysis have been widely employed to interpret findings of genome-wide association studies (GWAS). With the availability of deep sequencing data of 423 lymphoblastoid cell lines (LCLs) from six global populations and the microarray expression data, we performed eQTL analysis, identified more than 228 K SNP cis-eQTLs and 21 K indel cis-eQTLs and generated an LCL cis-eQTL database. We demonstrate that the percentages of population-shared and population-specific cis-eQTLs are comparable, while indel cis-eQTLs in the population-specific subsection contribute more to gene expression variations than those in the population-shared subsection. We found that cis-eQTLs, especially the population-shared cis-eQTLs, are significantly enriched toward the transcription start site. Moreover, the National Human Genome Research Institute cataloged GWAS SNPs are enriched for LCL cis-eQTLs. Specifically, 32.8% of GWAS SNPs are LCL cis-eQTLs, among which 12.5% can be tagged by indel cis-eQTLs, suggesting the fundamental contribution of indel cis-eQTLs to GWAS association signals. To search for functional indels and SNPs tagging GWAS SNPs, a pipeline, Post-GWAS Explorer for Functional Indels and SNPs (PExFInS), has been developed, integrating LD analysis, functional annotation from public databases, cis-eQTL mapping with our LCL cis-eQTL database and other published cis-eQTL datasets.
10
King EG, Sanderson BJ, McNeil CL, Long AD, Macdonald SJ. Genetic dissection of the Drosophila melanogaster female head transcriptome reveals widespread allelic heterogeneity. PLoS Genet 2014; 10:e1004322. [PMID: 24810915 PMCID: PMC4014434 DOI: 10.1371/journal.pgen.1004322] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 10/25/2013] [Accepted: 03/10/2014] [Indexed: 12/01/2022] Open
Abstract
Modern genetic mapping is plagued by the “missing heritability” problem, which refers to the discordance between the estimated heritabilities of quantitative traits and the variance accounted for by mapped causative variants. One major potential explanation for the missing heritability is allelic heterogeneity, in which there are multiple causative variants at each causative gene with only a fraction having been identified. The majority of genome-wide association studies (GWAS) implicitly assume that a single SNP can explain all the variance for a causative locus. However, if allelic heterogeneity is prevalent, a substantial amount of genetic variance will remain unexplained. In this paper, we take a haplotype-based mapping approach and quantify the number of alleles segregating at each locus using a large set of 7922 eQTL contributing to regulatory variation in the Drosophila melanogaster female head. Not only does this study provide a comprehensive eQTL map for a major community genetic resource, the Drosophila Synthetic Population Resource, but it also provides a direct test of the allelic heterogeneity hypothesis. We find that 95% of cis-eQTLs and 78% of trans-eQTLs are due to multiple alleles, demonstrating that allelic heterogeneity is widespread in Drosophila eQTL. Allelic heterogeneity likely contributes significantly to the missing heritability problem common in GWAS studies.
For traits with complex genetic inheritance it has generally proven very difficult to identify the majority of the specific causative variants involved. A range of hypotheses have been put forward to explain this so-called “missing heritability”. One idea—allelic heterogeneity, where genes each harbor multiple different causative variants—has received little attention, because it is difficult to detect with most genetic mapping designs. Here we make use of a panel of Drosophila melanogaster lines derived from multiple founders, allowing us to directly test for the presence of multiple alleles at a large set of genetic loci influencing gene expression. We find that the vast majority of loci harbor more than two functional alleles, demonstrating extensive allelic heterogeneity at the level of gene expression and suggesting that such heterogeneity is an important factor determining the genetic basis of complex trait variation in general.
Affiliation(s)
- Elizabeth G. King
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
- * E-mail:
| | - Brian J. Sanderson
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
| | - Casey L. McNeil
- Department of Biology, Newman University, Wichita, Kansas, United States of America
| | - Anthony D. Long
- Department of Ecology and Evolutionary Biology, University of California Irvine, Irvine, California, United States of America
| | - Stuart J. Macdonald
- Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, United States of America
11
Yang HC, Lin CW, Chen CW, Chen JJ. Applying genome-wide gene-based expression quantitative trait locus mapping to study population ancestry and pharmacogenetics. BMC Genomics 2014; 15:319. [PMID: 24779372 PMCID: PMC4236814 DOI: 10.1186/1471-2164-15-319] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/19/2013] [Accepted: 04/15/2014] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Gene-based analysis has become popular in genomic research because of its appealing biological and statistical properties compared with those of a single-locus analysis. However, few, if any, studies have discussed the mapping of expression quantitative trait loci (eQTL) in a gene-based framework. No study has discussed ancestry-informative eQTL or investigated their roles in pharmacogenetics by integrating single nucleotide polymorphism (SNP)-based eQTL (s-eQTL) and gene-based eQTL (g-eQTL). RESULTS In this g-eQTL mapping study, the transcript expression levels of genes (transcript-level genes; T-genes) were correlated with the SNPs of genes (sequence-level genes; S-genes) by using a method of gene-based partial least squares (PLS). Ancestry-informative transcripts were identified using a rank-score-based multivariate association test, and ancestry-informative eQTL were identified using Fisher's exact test. Furthermore, key ancestry-predictive eQTL were selected in a flexible discriminant analysis. We analyzed SNPs and gene expression of 210 independent people of African-, Asian- and European-descent. We identified numerous cis- and trans-acting g-eQTL and s-eQTL for each population by using PLS. We observed ancestry information enriched in eQTL. Furthermore, we identified 2 ancestry-informative eQTL associated with adverse drug reactions and/or drug response. Rs1045642, located on MDR1, is an ancestry-informative eQTL (P = 2.13E-13, using Fisher's exact test) associated with adverse drug reactions to amitriptyline and nortriptyline and drug responses to morphine. Rs20455, located in KIF6, is an ancestry-informative eQTL (P = 2.76E-23, using Fisher's exact test) associated with the response to statin drugs (e.g., pravastatin and atorvastatin). The ancestry-informative eQTL of drug biotransformation genes were also observed; cross-population cis-acting expression regulators included SPG7, TAP2, SLC7A7, and CYP4F2. 
Finally, we also identified key ancestry-predictive eQTL and established classification models with promising training and testing accuracies in separating samples from close populations. CONCLUSIONS In summary, we developed a gene-based PLS procedure and a SAS macro for identifying g-eQTL and s-eQTL. We established data archives of eQTL for global populations. The program and data archives are accessible at http://www.stat.sinica.edu.tw/hsinchou/genetics/eQTL/HapMapII.htm. Finally, the results from our investigations regarding the interrelationship between eQTL, ancestry information, and pharmacodynamics provide rich resources for future eQTL studies and practical applications in population genetics and medical genetics.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, No 128, Academia Road, Section 2, Nankang, Taipei, Taiwan.
12
Lavington E, Cogni R, Kuczynski C, Koury S, Behrman EL, O'Brien KR, Schmidt PS, Eanes WF. A small system--high-resolution study of metabolic adaptation in the central metabolic pathway to temperate climates in Drosophila melanogaster. Mol Biol Evol 2014; 31:2032-41. [PMID: 24770333 DOI: 10.1093/molbev/msu146] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 01/15/2023]
Abstract
In this article, we couple the geographic variation in 127 single-nucleotide polymorphism (SNP) frequencies in genes of 46 enzymes of central metabolism with their associated cis-expression variation to predict latitudinal or climatic-driven gene expression changes in the metabolic architecture of Drosophila melanogaster. Forty-two percent of the SNPs in 65% of the genes show statistically significant clines in frequency with latitude across the 20 local population samples collected from southern Florida to Ontario. A number of SNPs in the screened genes are also associated with significant expression variation within the Raleigh population from North Carolina. A principal component analysis of the full variance-covariance matrix of latitudinal changes in SNP-associated standardized gene expression allows us to identify those major genes in the pathway and its associated branches that are likely targets of natural selection. When embedded in a central metabolic context, we show that these apparent targets are concentrated in the genes of the upper glycolytic pathway and pentose shunt, those controlling glycerol shuttle activity, and finally those enzymes associated with the utilization of glutamate and pyruvate. These metabolites possess high connectivity and thus may be the points where flux balance can be best shifted. We also propose that these points are conserved points associated with coupling energy homeostasis and energy sensing in mammals. We speculate that the modulation of gene expression at specific points in central metabolism that are associated with shifting flux balance or possibly energy-state sensing plays a role in adaptation to climatic variation.
Affiliation(s)
- Erik Lavington
- Department of Ecology and Evolution, Stony Brook University
- Rodrigo Cogni
- Department of Ecology and Evolution, Stony Brook University
- Spencer Koury
- Department of Ecology and Evolution, Stony Brook University
- Walter F Eanes
- Department of Ecology and Evolution, Stony Brook University
13
Cogni R, Kuczynski C, Koury S, Lavington E, Behrman EL, O'Brien KR, Schmidt PS, Eanes WF. The intensity of selection acting on the couch potato gene--spatial-temporal variation in a diapause cline. Evolution 2013; 68:538-48. [DOI: 10.1111/evo.12291] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Received: 06/28/2013] [Accepted: 09/26/2013] [Indexed: 11/27/2022]
Affiliation(s)
- Rodrigo Cogni
- Department of Ecology and Evolution; Stony Brook University; Stony Brook, New York
- Caitlin Kuczynski
- Department of Ecology and Evolution; Stony Brook University; Stony Brook, New York
- Spencer Koury
- Department of Ecology and Evolution; Stony Brook University; Stony Brook, New York
- Erik Lavington
- Department of Ecology and Evolution; Stony Brook University; Stony Brook, New York
- Emily L. Behrman
- Department of Biology; University of Pennsylvania; Philadelphia, Pennsylvania
- Paul S. Schmidt
- Department of Biology; University of Pennsylvania; Philadelphia, Pennsylvania
- Walter F. Eanes
- Department of Ecology and Evolution; Stony Brook University; Stony Brook, New York
14
Abo R, Jenkins GD, Wang L, Fridley BL. Identifying the genetic variation of gene expression using gene sets: application of novel gene Set eQTL approach to PharmGKB and KEGG. PLoS One 2012; 7:e43301. [PMID: 22905253 PMCID: PMC3419168 DOI: 10.1371/journal.pone.0043301] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/05/2012] [Accepted: 07/19/2012] [Indexed: 11/18/2022]
Abstract
Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available, performing all pairwise comparisons is computationally burdensome and may not provide optimal power to detect associations. Consideration of different approaches to account for the high dimensionality and multiple testing issues may provide increased efficiency and statistical power. Here we present a novel approach to model and test the association between genetic variation and mRNA gene expression levels in the context of gene sets (GSs) and pathways, referred to as gene set expression quantitative trait loci analysis (GS-eQTL). The method uses GSs to initially group SNPs and mRNA expression, followed by the application of principal components analysis (PCA) to collapse the variation and reduce the dimensionality within the GSs. We applied GS-eQTL to assess the association between SNP and mRNA expression level data collected from a cell-based model system using PharmGKB- and KEGG-defined GSs. We observed a large number of significant GS-eQTL associations, in which the most significant associations arose between genetic variation and mRNA expression from the same GS. However, a number of associations involving genetic variation and mRNA expression from different GSs were also identified. Our proposed GS-eQTL method effectively addresses the multiple testing limitations in eQTL studies and provides biological context for SNP-expression associations.
Affiliation(s)
- Ryan Abo
- Division of Clinical Pharmacology, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, Minnesota, United States of America
- Gregory D. Jenkins
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
- Liewei Wang
- Division of Clinical Pharmacology, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, Minnesota, United States of America
- Brooke L. Fridley
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
15
Dannemann M, Lachmann M, Lorenc A. 'maskBAD'--a package to detect and remove Affymetrix probes with binding affinity differences. BMC Bioinformatics 2012; 13:56. [PMID: 22507266 PMCID: PMC3439685 DOI: 10.1186/1471-2105-13-56] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/04/2011] [Accepted: 03/16/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND Hybridization differences caused by target sequence differences can be a confounding factor in analyzing gene expression on microarrays, leading to false positives and reducing the power to detect real expression differences. We prepared an R Bioconductor-compatible package to detect, characterize and remove such probes in Affymetrix 3'IVT and exon-based arrays on the basis of correlation of signal intensities from probes within probe sets. RESULTS Using completely sequenced mouse genomes, we determined false-negative and false-positive error rates with high accuracy, and we show that our method routinely outperforms previous methods. When detecting 76.2% of known SNP/indels in mouse expression data, we obtain at most 5.5% false positives. At the same level of false positives, the best previous method detected 72.6%. We also show that probes with differing binding affinity both hinder differential expression detection and introduce artifacts in cancer-healthy tissue comparison. CONCLUSIONS Detection and removal of such probes should be a routine step in Affymetrix data preprocessing. We prepared a user-friendly R package, compatible with Bioconductor, that allows the filtering and improving of data from Affymetrix microarray experiments.
Affiliation(s)
- Michael Dannemann
- Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103 Leipzig, Germany
16
Farris SP, Miles MF. Ethanol modulation of gene networks: implications for alcoholism. Neurobiol Dis 2012; 45:115-21. [PMID: 21536129 PMCID: PMC3158275 DOI: 10.1016/j.nbd.2011.04.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 02/05/2011] [Revised: 04/11/2011] [Accepted: 04/13/2011] [Indexed: 12/21/2022]
Abstract
Alcoholism is a complex disease caused by a confluence of environmental and genetic factors influencing multiple brain pathways to produce a variety of behavioral sequelae, including addiction. Genetic factors contribute to over 50% of the risk for alcoholism and recent evidence points to a large number of genes with small effect sizes as the likely molecular basis for this disease. Recent progress in genomics (microarrays or RNA-Seq) and genetics has led to the identification of a large number of potential candidate genes influencing ethanol behaviors or alcoholism itself. To organize this complex information, investigators have begun to focus on the contribution of gene networks, rather than individual genes, for various ethanol-induced behaviors in animal models or behavioral endophenotypes comprising alcoholism. This chapter reviews some of the methods used for constructing gene networks from genomic data and some of the recent progress made in applying such approaches to the study of the neurobiology of ethanol. We show that rapid technology development in gathering genomic data, together with sophisticated experimental design and a growing collection of analysis tools are producing novel insights for understanding the molecular basis of alcoholism and that such approaches promise new opportunities for therapeutic development.
Affiliation(s)
- Sean P Farris
- Department of Pharmacology and Toxicology, Virginia Commonwealth University, Richmond, VA 23298, USA
17
Farber CR, Bennett BJ, Orozco L, Zou W, Lira A, Kostem E, Kang HM, Furlotte N, Berberyan A, Ghazalpour A, Suwanwela J, Drake TA, Eskin E, Wang QT, Teitelbaum SL, Lusis AJ. Mouse genome-wide association and systems genetics identify Asxl2 as a regulator of bone mineral density and osteoclastogenesis. PLoS Genet 2011; 7:e1002038. [PMID: 21490954 PMCID: PMC3072371 DOI: 10.1371/journal.pgen.1002038] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.8] [Received: 11/03/2010] [Accepted: 02/12/2011] [Indexed: 12/31/2022]
Abstract
Significant advances have been made in the discovery of genes affecting bone mineral density (BMD); however, our understanding of its genetic basis remains incomplete. In the current study, genome-wide association (GWA) and co-expression network analysis were used in the recently described Hybrid Mouse Diversity Panel (HMDP) to identify and functionally characterize novel BMD genes. In the HMDP, a GWA of total body, spinal, and femoral BMD revealed four significant associations (-log10P>5.39) affecting at least one BMD trait on chromosomes (Chrs.) 7, 11, 12, and 17. The associations implicated a total of 163 genes with each association harboring between 14 and 112 genes. This list was reduced to 26 functional candidates by identifying those genes that were regulated by local eQTL in bone or harbored potentially functional non-synonymous (NS) SNPs. This analysis revealed that the most significant BMD SNP on Chr. 12 was a NS SNP in the additional sex combs like-2 (Asxl2) gene that was predicted to be functional. The involvement of Asxl2 in the regulation of bone mass was confirmed by the observation that Asxl2 knockout mice had reduced BMD. To begin to unravel the mechanism through which Asxl2 influenced BMD, a gene co-expression network was created using cortical bone gene expression microarray data from the HMDP strains. Asxl2 was identified as a member of a co-expression module enriched for genes involved in the differentiation of myeloid cells. In bone, osteoclasts are bone-resorbing cells of myeloid origin, suggesting that Asxl2 may play a role in osteoclast differentiation. In agreement, the knockdown of Asxl2 in bone marrow macrophages impaired their ability to form osteoclasts. This study identifies a new regulator of BMD and osteoclastogenesis and highlights the power of GWA and systems genetics in the mouse for dissecting complex genetic traits.
Affiliation(s)
- Charles R Farber
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America.
18
Pelgas B, Bousquet J, Meirmans PG, Ritland K, Isabel N. QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments. BMC Genomics 2011; 12:145. [PMID: 21392393 PMCID: PMC3068112 DOI: 10.1186/1471-2164-12-145] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Received: 10/19/2010] [Accepted: 03/10/2011] [Indexed: 12/11/2022]
Abstract
BACKGROUND The genomic architecture of bud phenology and height growth remains poorly known in most forest trees. In non-model species, QTL studies have shown limited application because most often QTL data could not be validated from one experiment to another. The aim of our study was to overcome this limitation by basing QTL detection on the construction of genetic maps highly enriched in gene markers, and by assessing QTLs across pedigrees, years, and environments. RESULTS Four saturated individual linkage maps representing two unrelated mapping populations of 260 and 500 clonally replicated progeny were assembled from 471 to 570 markers, including from 283 to 451 gene SNPs obtained using a multiplexed genotyping assay. Thence, a composite linkage map was assembled with 836 gene markers. For individual linkage maps, a total of 33 distinct quantitative trait loci (QTLs) were observed for bud flush, 52 for bud set, and 52 for height growth. For the composite map, the corresponding numbers of QTL clusters were 11, 13, and 10. About 20% of QTLs were replicated between the two mapping populations and nearly 50% revealed spatial and/or temporal stability. Three to four occurrences of overlapping QTLs between characters were noted, indicating regions with potential pleiotropic effects. Moreover, some of the genes involved in the QTLs were also highlighted by recent genome scans or expression profile studies. Overall, the proportion of phenotypic variance explained by each QTL ranged from 3.0 to 16.4% for bud flush, from 2.7 to 22.2% for bud set, and from 2.5 to 10.5% for height growth. Up to 70% of the total character variance could be accounted for by QTLs for bud flush or bud set, and up to 59% for height growth. CONCLUSIONS This study provides a basic understanding of the genomic architecture related to bud flush, bud set, and height growth in a conifer species, and a useful indicator to compare with angiosperms. It will serve as a basic reference to functional and association genetic studies of adaptation and growth in Picea taxa. The putative QTNs identified will be tested for associations in natural populations, with potential applications in molecular breeding and gene conservation programs. QTLs mapping consistently across years and environments could also be the most important targets for breeding, because they represent genomic regions that may be least affected by G × E interactions.
Affiliation(s)
- Betty Pelgas
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Québec, Québec, G1V 4C7, Canada
- Arborea and Canada Research Chair in Forest and Environmental Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, G1V 0A6, Canada
- Jean Bousquet
- Arborea and Canada Research Chair in Forest and Environmental Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, G1V 0A6, Canada
- Patrick G Meirmans
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Québec, Québec, G1V 4C7, Canada
- Current address: Institute of Biodiversity and Ecosystem Dynamics, Universiteit van Amsterdam, PO Box 94248, 1090GE Amsterdam, The Netherlands
- Kermit Ritland
- Department of Forest Science, Faculty of Forestry, The University of British Columbia, 2424 Main Mall, Vancouver, BC, V6T 1Z4, Canada
- Nathalie Isabel
- Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Québec, Québec, G1V 4C7, Canada
- Arborea and Canada Research Chair in Forest and Environmental Genomics, Forest Research Centre and Institute for Systems and Integrative Biology, Université Laval, Québec, Québec, G1V 0A6, Canada
19
Hsiao CL, Lian IB, Hsieh AR, Fann CS. Modeling expression quantitative trait loci in data combining ethnic populations. BMC Bioinformatics 2010; 11:111. [PMID: 20187971 PMCID: PMC2844390 DOI: 10.1186/1471-2105-11-111] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/03/2009] [Accepted: 02/27/2010] [Indexed: 12/18/2022]
Abstract
BACKGROUND Combining data from different ethnic populations in a study can increase the efficacy of methods designed to identify expression quantitative trait loci (eQTL) compared to analyzing each population independently. In such studies, however, the genetic diversity of minor allele frequencies among populations has rarely been taken into account. Because allele frequency diversity and population-level expression differences are present in populations, no consensus has been reached on the optimal statistical approach for analysis of eQTL in data combining different populations. RESULTS In this report, we explored the applicability of a constrained two-way model to identify eQTL for combined ethnic data that might contain genetic diversity among ethnic populations. In addition, gene expression differences resulting from ethnic allele frequency diversity between populations were directly estimated and analyzed by the constrained two-way model. Through simulation, we investigated effects of genetic diversity on eQTL identification by examining gene expression data pooled from normal quantile transformation of each population. Using the constrained two-way model to reanalyze data from Caucasian and Asian individuals available from HapMap, a large number of eQTL were identified with similar genetic effects on the gene expression levels in these two populations. Furthermore, 19 single nucleotide polymorphisms with inter-population differences with respect to both genotype frequency and gene expression levels directed by genotypes were identified and reflected a clear distinction between Caucasian and Asian individuals. CONCLUSIONS This study illustrates the influence of minor allele frequencies on common eQTL identification using either separate or combined population data. Our findings are important for future eQTL studies in which different datasets are combined to increase the power of eQTL identification.
Affiliation(s)
- Ching-Lin Hsiao
- Division of Biostatistics, Institute & Department of Public Health, National Yang-Ming University, Taipei 112, Taiwan
20
Abstract
Common sequence variants within a gene often generate important differences in expression of corresponding mRNAs. This high level of local (allelic) control, or cis modulation, rivals that produced by gene targeting, but expression is titrated finely over a range of levels. We are interested in exploiting this allelic variation to study gene function and downstream consequences of differences in expression dosage. We have used several bioinformatics and molecular approaches to estimate error rates in the discovery of cis modulation and to analyze some of the biological and technical confounds that contribute to the variation in gene expression profiling. Our analysis of SNPs and alternative transcripts, combined with eQTL maps and selective gene resequencing, revealed that between 17 and 25% of apparent cis modulation is caused by SNPs that overlap probes rather than by genuine quantitative differences in mRNA levels. This estimate climbs to 40-50% when qualitative differences between isoform variants are included. We have developed an analytical approach to filter differences in expression and improve the yield of genuine cis-modulated transcripts to approximately 80%. This improvement is important because the resulting variation can be successfully used to study downstream consequences of altered expression on higher-order phenotypes. Using a systems genetics approach, we show that two validated cis-modulated genes, Stk25 and Rasd2, are likely to control expression of downstream targets and affect disease susceptibility.