1
Wenz BM, He Y, Chen NC, Pickrell JK, Li JH, Dudek MF, Li T, Keener R, Voight BF, Brown CD, Battle A. Genotype inference from aggregated chromatin accessibility data reveals genetic regulatory mechanisms. bioRxiv 2024:2024.09.04.610850. [PMID: 39282458 PMCID: PMC11398312 DOI: 10.1101/2024.09.04.610850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2024]
Abstract
Background: Understanding the genetic causes for variability in chromatin accessibility can shed light on the molecular mechanisms through which genetic variants may affect complex traits. Thousands of ATAC-seq samples have been collected that hold information about chromatin accessibility across diverse cell types and contexts, but most of these are not paired with genetic information and come from diverse, distinct projects and laboratories.
Results: We report here joint genotyping, chromatin accessibility peak calling, and discovery of quantitative trait loci which influence chromatin accessibility (caQTLs), demonstrating the capability of performing caQTL analysis on a large scale in a diverse sample set without pre-existing genotype information. Using 10,293 profiling samples representing 1,454 unique donor individuals across 653 studies from public databases, we catalog 23,381 caQTLs in total. After joint discovery analysis, we cluster samples based on accessible chromatin profiles to identify context-specific caQTLs. We find that caQTLs are strongly enriched for annotations of gene regulatory elements across diverse cell types and tissues and are often strongly linked with genetic variation associated with changes in expression (eQTLs), indicating that caQTLs can mediate genetic effects on gene expression. We demonstrate sharing of causal variants for chromatin accessibility and diverse complex human traits, enabling a more complete picture of the genetic mechanisms underlying complex human phenotypes.
Conclusions: Our work provides a proof of principle for caQTL calling from previously ungenotyped samples, and represents one of the largest, most diverse caQTL resources currently available, informing mechanisms of genetic regulation of gene expression and contribution to disease.
Affiliation(s)
- Brandon M. Wenz
  - Genetics and Epigenetics Program, Cell and Molecular Biology Graduate Group, Biomedical Graduate Studies, University of Pennsylvania - Perelman School of Medicine, Philadelphia PA 19104
- Yuan He
  - Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Nae-Chyun Chen
  - Department of Computer Science, Johns Hopkins University; Baltimore, MD, 21218
- Max F. Dudek
  - Graduate Group in Genomics and Computational Biology, University of Pennsylvania, Philadelphia, PA 19104
- Taibo Li
  - Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Rebecca Keener
  - Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
- Benjamin F. Voight
  - Department of Genetics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, PA, 19104
  - Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania - Perelman School of Medicine, Philadelphia PA, 19104
  - Institute for Translational Medicine and Therapeutics, University of Pennsylvania – Perelman School of Medicine, Philadelphia, PA, 19104
- Christopher D. Brown
  - Department of Genetics, University of Pennsylvania - Perelman School of Medicine, Philadelphia, PA, 19104
- Alexis Battle
  - Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, 21218
  - Department of Computer Science, Johns Hopkins University; Baltimore, MD, 21218
  - Department of Genetic Medicine, Johns Hopkins University; Baltimore, MD, 21218
  - Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, 21218
  - Data Science and AI Institute, Johns Hopkins University, Baltimore, MD, 21218
2
Brotman SM, El-Sayed Moustafa JS, Guan L, Broadaway KA, Wang D, Jackson AU, Welch R, Currin KW, Tomlinson M, Vadlamudi S, Stringham HM, Roberts AL, Lakka TA, Oravilahti A, Silva LF, Narisu N, Erdos MR, Yan T, Bonnycastle LL, Raulerson CK, Raza Y, Yan X, Parker SCJ, Kuusisto J, Pajukanta P, Tuomilehto J, Collins FS, Boehnke M, Love MI, Koistinen HA, Laakso M, Mohlke KL, Small KS, Scott LJ. Adipose tissue eQTL meta-analysis reveals the contribution of allelic heterogeneity to gene expression regulation and cardiometabolic traits. bioRxiv 2023:2023.10.26.563798. [PMID: 37961277 PMCID: PMC10634839 DOI: 10.1101/2023.10.26.563798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Complete characterization of the genetic effects on gene expression is needed to elucidate tissue biology and the etiology of complex traits. Here, we analyzed 2,344 subcutaneous adipose tissue samples and identified 34K conditionally distinct expression quantitative trait locus (eQTL) signals in 18K genes. Over half of eQTL genes exhibited at least two eQTL signals. Compared to primary signals, non-primary signals had lower effect sizes, lower minor allele frequencies, and less promoter enrichment; they corresponded to genes with higher heritability and higher tolerance for loss of function. Colocalization of eQTL with conditionally distinct genome-wide association study signals for 28 cardiometabolic traits identified 3,605 eQTL signals for 1,861 genes. Inclusion of non-primary eQTL signals increased colocalized signals by 46%. Among 30 genes with ≥2 pairs of colocalized signals, 21 showed a mediating gene dosage effect on the trait. Thus, expanded eQTL identification reveals more mechanisms underlying complex traits and improves understanding of the complexity of gene expression regulation.
Affiliation(s)
- Sarah M Brotman
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Li Guan
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- K Alaine Broadaway
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Dongmeng Wang
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Anne U Jackson
  - Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Ryan Welch
  - Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Kevin W Currin
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Max Tomlinson
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
  - Department of Medical and Molecular Genetics, King's College London, London, UK
- Heather M Stringham
  - Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Amy L Roberts
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Timo A Lakka
  - Institute of Biomedicine, School of Medicine, University of Eastern Finland, Kuopio, Finland
  - Department of Clinical Physiology and Nuclear Medicine, Kuopio University Hospital, Kuopio, Finland
  - Foundation for Research in Health Exercise and Nutrition, Kuopio Research Institute of Exercise Medicine, Kuopio, Finland
- Anniina Oravilahti
  - Institute of Clinical Medicine, Kuopio University Hospital, University of Eastern Finland, Kuopio, Finland
- Lilian Fernandes Silva
  - Institute of Clinical Medicine, Kuopio University Hospital, University of Eastern Finland, Kuopio, Finland
- Narisu Narisu
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Michael R Erdos
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Tingfen Yan
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Lori L Bonnycastle
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Yasrab Raza
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Xinyu Yan
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Stephen C J Parker
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Johanna Kuusisto
  - Department of Medicine and Clinical Research, Kuopio University Hospital, Kuopio, Finland
- Päivi Pajukanta
  - Department of Human Genetics and Institute for Precision Health, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- Jaakko Tuomilehto
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - Department of Public Health, University of Helsinki, Helsinki, Finland
  - Diabetes Research Group, King Abdulaziz University, Jeddah, Saudi Arabia
- Francis S Collins
  - Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Michael Boehnke
  - Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Michael I Love
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Heikki A Koistinen
  - Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
  - University of Helsinki and Department of Medicine, Helsinki University Hospital, Helsinki, Finland
  - Minerva Foundation Institute for Medical Research, Helsinki, Finland
- Markku Laakso
  - Institute of Clinical Medicine, Kuopio University Hospital, University of Eastern Finland, Kuopio, Finland
  - Department of Medicine and Clinical Research, Kuopio University Hospital, Kuopio, Finland
- Karen L Mohlke
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Kerrin S Small
  - Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Laura J Scott
  - Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
3
Marrella MA, Biase FH. Robust identification of regulatory variants (eQTLs) using a differential expression framework developed for RNA-sequencing. J Anim Sci Biotechnol 2023; 14:62. [PMID: 37143150 PMCID: PMC10161580 DOI: 10.1186/s40104-023-00861-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 03/05/2023] [Indexed: 05/06/2023] Open
Abstract
BACKGROUND: A gap currently exists between genetic variants and the underlying cell and tissue biology of a trait, and expression quantitative trait loci (eQTL) studies provide important information to help close that gap. However, two concerns that arise with eQTL analyses using RNA-sequencing data are normalization of data across samples and the data not following a normal distribution. Multiple pipelines have been suggested to address this. For instance, the most recent analysis of the human and farm Genotype-Tissue Expression (GTEx) project proposes using trimmed means of M-values (TMM) to normalize the data followed by an inverse normal transformation.
RESULTS: In this study, we reasoned that eQTL analysis could be carried out using the same framework used for differential gene expression (DGE), which uses a negative binomial model, a statistical approach suited to count data. Using the GTEx framework, we identified 35 significant eQTLs (P < 5 × 10⁻⁸) following the ANOVA model and 39 significant eQTLs (P < 5 × 10⁻⁸) following the additive model. Using a differential gene expression framework, we identified 930 and six significant eQTLs (P < 5 × 10⁻⁸) following an analytical framework equivalent to the ANOVA and additive model, respectively. When we compared the two approaches, there was no overlap of significant eQTLs between the two frameworks. Because we defined specific contrasts, we identified trans eQTLs that more closely resembled what we expect from genetic variants showing complete dominance between alleles. Yet, these were not identified by the GTEx framework.
CONCLUSIONS: Our results show that transforming RNA-sequencing data to fit a normal distribution prior to eQTL analysis is not required when the DGE framework is employed. Our proposed approach detected biologically relevant variants that otherwise would not have been identified due to data transformation to fit a normal distribution.
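To illustrate the inverse normal transformation step that the abstract contrasts with the DGE framework, here is a minimal Python sketch of a rank-based version. It is not the GTEx pipeline or the authors' code: the function name `inverse_normal_transform`, the `(rank - 0.5) / n` offset, and the average-rank handling of ties are assumptions chosen for illustration.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Map values to standard-normal quantiles based on their ranks.

    Assumption (not from the source): ties get their average rank and
    ranks are converted to probabilities with (rank - 0.5) / n.
    """
    n = len(values)
    # Sample indices ordered by increasing value.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]


if __name__ == "__main__":
    # Hypothetical normalized expression values for one gene in five samples.
    expression = [12.0, 850.0, 40.0, 40.0, 3.0]
    print(inverse_normal_transform(expression))
```

The transform keeps only the rank order of the samples, which is why the abstract's concern about discarding information in the original counts applies to it.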
Affiliation(s)
- Mackenzie A Marrella
  - School of Animal Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Fernando H Biase
  - School of Animal Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
4
Brown M, Greenwood E, Zeng B, Powell JE, Gibson G. Effect of all-but-one conditional analysis for eQTL isolation in peripheral blood. Genetics 2023; 223:iyac162. [PMID: 36321965 PMCID: PMC9836021 DOI: 10.1093/genetics/iyac162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 10/13/2022] [Indexed: 11/13/2022] Open
Abstract
Expression quantitative trait locus detection has become increasingly important for understanding how noncoding variants contribute to disease susceptibility and complex traits. The major challenges in expression quantitative trait locus fine-mapping and causal variant discovery relate to the impact of linkage disequilibrium on signals due to one or multiple functional variants that lie within a credible set. We perform expression quantitative trait locus fine-mapping using the all-but-one approach, conditioning each signal on all others detected in an interval, on the Consortium for the Architecture of Gene Expression cohorts of microarray-based peripheral blood gene expression in 2,138 European-ancestry human adults. We contrast these results with traditional forward stepwise conditional analysis and a Bayesian localization method. All-but-one conditioning significantly modifies effect-size estimates for 51% of 2,351 expression quantitative trait locus peaks, but only modestly affects credible set size and location. On the other hand, both conditioning approaches result in unexpectedly low overlap with Bayesian credible sets, with just 57% peak concordance and between 50% and 70% SNP sharing, leading us to caution against the assumption that any one localization method is superior to another. We also cross reference our results with ATAC-seq data, cell-type-specific expression quantitative trait locus, and activity-by-contact-enhancers, leading to the proposal of a 5-tier approach to further reduce credible set sizes and prioritize likely causal variants for all known inflammatory bowel disease risk loci active in immune cells.
Affiliation(s)
- Margaret Brown
  - Center for Integrative Genomics, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Emily Greenwood
  - Center for Integrative Genomics, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Biao Zeng
  - Present address for Biao Zeng: Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Joseph E Powell
  - Present address for Joseph E Powell: Garvan-Weizmann Center for Cellular Genomics, Sydney, NSW 2010, Australia
- Greg Gibson
  - Center for Integrative Genomics, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
5
Meta-imputation of transcriptome from genotypes across multiple datasets by leveraging publicly available summary-level data. PLoS Genet 2022; 18:e1009571. [PMID: 35100255 PMCID: PMC8830793 DOI: 10.1371/journal.pgen.1009571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/26/2021] [Revised: 02/10/2022] [Accepted: 01/07/2022] [Indexed: 11/22/2022] Open
Abstract
Transcriptome-wide association studies (TWAS) can be used as a powerful method to identify and interpret the biological mechanisms underlying GWAS signals by mapping gene expression levels to phenotypes. In TWAS, gene expression is often imputed from individual-level genotypes of regulatory variants identified from external resources, such as the Genotype-Tissue Expression (GTEx) Project. In this setting, a straightforward approach to impute expression levels of a specific tissue is to use the model trained from the same tissue type. When multiple tissues are available for the same subjects, it has been demonstrated that training imputation models from multiple tissue types improves the accuracy because of shared eQTLs between the tissues and an increase in effective sample size. However, existing joint-tissue methods require access to genotype and expression data across all tissues. Moreover, they cannot leverage the abundance of expression datasets across various tissues for non-overlapping individuals. Here, we explore the optimal way to combine imputed levels across training models from multiple tissues and datasets in a flexible manner using summary-level data. Our proposed method (SWAM) combines an arbitrary number of transcriptome imputation models to linearly optimize the imputation accuracy given a target tissue. By integrating models across tissues and/or individuals, SWAM can improve the accuracy of transcriptome imputation or improve the power of TWAS while only requiring individual-level data from a single reference cohort. To evaluate the accuracy of SWAM, we combined 49 tissue-specific gene expression imputation models from the GTEx Project as well as from a large eQTL study of the Depression Susceptibility Genes and Networks (DGN) Project and tested imputation accuracy in GEUVADIS lymphoblastoid cell line samples. We also extend our meta-imputation method to meta-TWAS to leverage multiple tissues in TWAS analysis with summary-level statistics. Our results highlight the importance of integrating multiple tissues to unravel regulatory impacts of genetic variants on complex traits.
Author summary: The gene expression levels within a cell are affected by various factors, including DNA variation, cell type, cellular microenvironment, disease status, and other environmental factors surrounding the individual. The genetic component of gene expression is known to explain a substantial fraction of transcriptional variation among individuals and can be imputed from genotypes in a tissue-specific manner, by training from population-scale transcriptomic profiles designed to identify expression quantitative trait loci (eQTLs). Imputing gene expression levels has been shown to help understand the genetic basis of human disease through transcriptome-wide association analysis (TWAS) and Mendelian randomization (MR). However, it has been unclear how to integrate multiple imputation models trained from individual datasets to maximize their accuracy without having to access individual genotypes and expression levels, which are often protected because of privacy concerns. We developed SWAM (Smartly Weighted Averaging across Multiple datasets), a meta-imputation framework which can accurately impute gene expression levels from genotypes by integrating multiple imputation models without requiring individual-level data. Our method examines the similarities and differences between resources and borrows the information most relevant to the tissue of interest. We demonstrate that SWAM outperforms existing single-tissue and multi-tissue imputation models and continues to increase accuracy when additional imputation models are integrated.
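The idea of linearly combining several imputation models for a target tissue can be sketched as follows. This is only an illustration, not SWAM itself: it uses ordinary least squares without an intercept as a stand-in for whatever weighting scheme the authors use, and the names `fit_weights`, `combine`, and `solve_linear_system`, as well as the toy numbers, are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("model predictions are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_weights(model_predictions, measured):
    """Least-squares weights, one per model, for a single gene.

    model_predictions: list of K lists, each holding one model's imputed
    expression for the same reference-cohort individuals.
    measured: observed target-tissue expression for those individuals.
    """
    k = len(model_predictions)
    # Normal equations: (P P^T) w = P y, with P the K x N prediction matrix.
    gram = [[sum(a * b for a, b in zip(model_predictions[i], model_predictions[j]))
             for j in range(k)] for i in range(k)]
    cross = [sum(a * y for a, y in zip(model_predictions[i], measured))
             for i in range(k)]
    return solve_linear_system(gram, cross)


def combine(model_predictions, weights):
    """Weighted sum of the models' imputed expression, per individual."""
    n = len(model_predictions[0])
    return [sum(w * preds[s] for w, preds in zip(weights, model_predictions))
            for s in range(n)]


if __name__ == "__main__":
    # Hypothetical imputed values from two tissue models for four individuals.
    tissue_a = [1.0, 2.0, 3.0, 4.0]
    tissue_b = [2.0, 1.0, 0.0, 3.0]
    observed = [1.2, 1.9, 2.4, 3.8]
    weights = fit_weights([tissue_a, tissue_b], observed)
    print(weights, combine([tissue_a, tissue_b], weights))
```

In this sketch the weights are fitted on a single reference cohort with measured expression, which mirrors the abstract's point that only one cohort needs individual-level data; any regularization or constraints on the weights are left out.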
6
Ngwa JS, Yanek LR, Kammers K, Kanchan K, Taub MA, Scharpf RB, Faraday N, Becker LC, Mathias RA, Ruczinski I. Secondary analyses for genome-wide association studies using expression quantitative trait loci. Genet Epidemiol 2022; 46:170-181. [PMID: 35312098 PMCID: PMC9086181 DOI: 10.1002/gepi.22448] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/22/2021] [Revised: 11/19/2021] [Accepted: 01/20/2022] [Indexed: 01/01/2023]
Abstract
Genome-wide association studies (GWAS) have successfully identified thousands of single nucleotide polymorphisms (SNPs) associated with complex traits; however, the identified SNPs account for only a fraction of trait heritability, and identifying the functional elements through which genetic variants exert their effects remains a challenge. Recent evidence suggests that SNPs associated with complex traits are more likely to be expression quantitative trait loci (eQTL). Thus, incorporating eQTL information can potentially improve power to detect causal variants missed by traditional GWAS approaches. Using genomic, transcriptomic, and platelet phenotype data from the Genetic Study of Atherosclerosis Risk family-based study, we investigated the potential to detect novel genomic risk loci by incorporating information from eQTL in the relevant target tissues (i.e., platelets and megakaryocytes) using established statistical principles in a novel way. Permutation analyses were performed to obtain family-wise error rates for eQTL associations, substantially lowering the genome-wide significance threshold for SNP-phenotype associations. In addition to confirming the well-known association between PEAR1 and platelet aggregation, our eQTL-focused approach identified a novel locus (rs1354034) and gene (ARHGEF3) not previously identified in a GWAS of platelet aggregation phenotypes. A colocalization analysis showed strong evidence for a functional role of this eQTL.
Affiliation(s)
- Julius S. Ngwa
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Lisa R. Yanek
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Kai Kammers
  - Department of Oncology, Johns Hopkins University, School of Medicine, Baltimore, Maryland, USA
- Kanika Kanchan
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Margaret A. Taub
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Robert B. Scharpf
  - Department of Oncology, Johns Hopkins University, School of Medicine, Baltimore, Maryland, USA
- Nauder Faraday
  - Department of Anesthesiology and Critical Care Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Lewis C. Becker
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Rasika A. Mathias
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Ingo Ruczinski
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
7
Cao C, Kwok D, Edie S, Li Q, Ding B, Kossinna P, Campbell S, Wu J, Greenberg M, Long Q. kTWAS: integrating kernel machine with transcriptome-wide association studies improves statistical power and reveals novel genes. Brief Bioinform 2021; 22:5985285. [PMID: 33200776 DOI: 10.1093/bib/bbaa270] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 07/01/2020] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 12/31/2022] Open
Abstract
The power of genotype-phenotype association mapping studies increases greatly when contributions from multiple variants in a focal region are meaningfully aggregated. Currently, there are two popular categories of variant aggregation methods. Transcriptome-wide association studies (TWAS) represent a set of emerging methods that select variants based on their effect on gene expression, providing pretrained linear combinations of variants for downstream association mapping. In contrast, kernel methods such as the sequence kernel association test (SKAT) model genotypic and phenotypic variance using various kernel functions that capture genetic similarity between subjects, allowing nonlinear effects to be included. From the perspective of machine learning, these two methods cover two complementary aspects of feature engineering: feature selection/pruning and feature aggregation. Thus far, no thorough comparison has been made between these categories, and no methods exist which incorporate the advantages of TWAS- and kernel-based methods. In this work, we developed a novel method called kernel-based TWAS (kTWAS) that applies TWAS-like feature selection to a SKAT-like kernel association test, combining the strengths of both approaches. Through extensive simulations, we demonstrate that kTWAS has higher power than TWAS and multiple SKAT-based protocols, and we identify novel disease-associated genes in Wellcome Trust Case Control Consortium genotyping array data and MSSNG (Autism) sequence data. The source code for kTWAS and our simulations is available in our GitHub repository (https://github.com/theLongLab/kTWAS).
Affiliation(s)
- Chen Cao
  - Department of Biochemistry & Molecular Biology, University of Calgary
- Devin Kwok
  - Department of Mathematics & Statistics, University of Calgary
- Qing Li
  - Department of Biochemistry & Molecular Biology, University of Calgary
- Bowei Ding
  - Department of Mathematics & Statistics, University of Calgary
- Pathum Kossinna
  - Department of Biochemistry & Molecular Biology, University of Calgary
- Jingjing Wu
  - Department of Mathematics & Statistics, University of Calgary
- Quan Long
  - Departments of Biochemistry & Molecular Biology, Medical Genetics and Mathematics & Statistics
8
Korbolina EE, Bryzgalov LO, Ustrokhanova DZ, Postovalov SN, Poverin DV, Damarov IS, Merkulova TI. A Panel of rSNPs Demonstrating Allelic Asymmetry in Both ChIP-seq and RNA-seq Data and the Search for Their Phenotypic Outcomes through Analysis of DEGs. Int J Mol Sci 2021; 22:ijms22147240. [PMID: 34298860 PMCID: PMC8303726 DOI: 10.3390/ijms22147240] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/31/2021] [Revised: 06/24/2021] [Accepted: 06/30/2021] [Indexed: 12/12/2022] Open
Abstract
Currently, the detection of allelic asymmetry in gene expression from RNA-seq data or in transcription factor binding from ChIP-seq data is one of the approaches used to identify the functional genetic variants that can affect gene expression (regulatory SNPs or rSNPs). In this study, we searched for rSNPs using the data for human pulmonary arterial endothelial cells (PAECs) available from the Sequence Read Archive (SRA). Allele-asymmetric binding and expression events were analyzed in paired ChIP-seq data for the H3K4me3 mark and RNA-seq data obtained for 19 individuals. Two statistical approaches, weighted z-scores and predicted probabilities, were used to improve the efficiency of finding rSNPs. In total, we identified 14,266 rSNPs associated with both allele-specific binding and expression. Among them, 645 rSNPs were associated with GWAS phenotypes; 4746 rSNPs were reported as eQTLs by GTEx, and 11,536 rSNPs were located in 374 candidate transcription factor binding motifs. Additionally, we searched for the rSNPs associated with gene expression using an SRA RNA-seq dataset for 281 clinically annotated human postmortem brain samples and detected eQTLs for 2505 rSNPs. Based on these results, we conducted Gene Ontology (GO), Disease Ontology (DO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and constructed the protein-protein interaction networks to represent the top-ranked biological processes with a possible contribution to the phenotypic outcome.
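The abstract does not say how its weighted z-scores are computed. As one common possibility, the sketch below combines per-individual z-scores for a single SNP with Stouffer's weighted method. Treat the choice of Stouffer's formula, the function names `weighted_z` and `two_sided_p`, and the example weights (square roots of read depth) as assumptions for illustration rather than the authors' procedure.

```python
import math
from statistics import NormalDist


def weighted_z(z_scores, weights):
    """Stouffer's weighted combination: sum(w_i * z_i) / sqrt(sum(w_i ** 2))."""
    if not z_scores or len(z_scores) != len(weights):
        raise ValueError("need one weight per z-score and at least one z-score")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    if denominator == 0:
        raise ValueError("at least one weight must be non-zero")
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical allelic-imbalance z-scores for one SNP in three heterozygous
    # individuals, weighted by the square root of each individual's read depth.
    z_per_individual = [2.1, 1.4, 2.8]
    read_depths = [40, 15, 90]
    w = [math.sqrt(d) for d in read_depths]
    z = weighted_z(z_per_individual, w)
    print(z, two_sided_p(z))
```

Weighting by depth lets individuals with more reads at the site contribute more to the combined evidence, while z-scores of opposite sign cancel rather than accumulate.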
Affiliation(s)
- Elena E. Korbolina
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Science, 10 Lavrentyeva Prospekt, 630090 Novosibirsk, Russia
  - Corresponding author
- Leonid O. Bryzgalov
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Science, 10 Lavrentyeva Prospekt, 630090 Novosibirsk, Russia
  - VECTOR-BEST, PO BOX 492, 630117 Novosibirsk, Russia
- Diana Z. Ustrokhanova
  - Department of Information Biology, The Novosibirsk State University, 1 Pirogova st, 630090 Novosibirsk, Russia
- Sergey N. Postovalov
  - Department of Theoretical and Applied Informatics, The Novosibirsk State Technical University, 630073 Novosibirsk, Russia
- Dmitry V. Poverin
  - Department of Theoretical and Applied Informatics, The Novosibirsk State Technical University, 630073 Novosibirsk, Russia
- Igor S. Damarov
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Science, 10 Lavrentyeva Prospekt, 630090 Novosibirsk, Russia
- Tatiana I. Merkulova
  - The Federal Research Center Institute of Cytology and Genetics, The Siberian Branch of the Russian Academy of Science, 10 Lavrentyeva Prospekt, 630090 Novosibirsk, Russia
  - Department of Information Biology, The Novosibirsk State University, 1 Pirogova st, 630090 Novosibirsk, Russia
9
Degtyareva AO, Antontseva EV, Merkulova TI. Regulatory SNPs: Altered Transcription Factor Binding Sites Implicated in Complex Traits and Diseases. Int J Mol Sci 2021; 22:6454. [PMID: 34208629 PMCID: PMC8235176 DOI: 10.3390/ijms22126454] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/05/2021] [Revised: 06/15/2021] [Accepted: 06/15/2021] [Indexed: 12/19/2022] Open
Abstract
The vast majority of the genetic variants (mainly SNPs) associated with various human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is the alteration of transcription factor binding via the creation or disruption of transcription factor binding sites (TFBSs) or a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematized description of the existing methodical approaches to their study. Then, we briefly review recent comprehensive examples of rSNPs studied from the discovery of the changes in the TFBS sequence as a result of a nucleotide substitution to the identification of its effect on target gene expression and, eventually, on phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both making molecular sense of genome-wide association studies (GWAS) and the alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and the search for allele-specific events in RNA-seq (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq (ASB events) data.
Affiliation(s)
- Arina O. Degtyareva
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Elena V. Antontseva
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Tatiana I. Merkulova
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia
- Department of Natural Sciences, Novosibirsk State University, 630090 Novosibirsk, Russia
|
10
|
Liu L, Chandrashekar P, Zeng B, Sanderford MD, Kumar S, Gibson G. TreeMap: a structured approach to fine mapping of eQTL variants. Bioinformatics 2021; 37:1125-1134. [PMID: 33135051 PMCID: PMC8150140 DOI: 10.1093/bioinformatics/btaa927] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/20/2020] [Revised: 10/01/2020] [Accepted: 10/20/2020] [Indexed: 11/14/2022]
Abstract
Motivation Expression quantitative trait loci (eQTL) harbor genetic variants modulating gene transcription. Fine mapping of regulatory variants at these loci is a daunting task due to the juxtaposition of causal and linked variants at a locus as well as the likelihood of interactions among multiple variants. This problem is exacerbated in genes with multiple cis-acting eQTL, where superimposed effects of adjacent loci further distort the association signals. Results We developed a novel algorithm, TreeMap, that identifies putative causal variants in cis-eQTL accounting for multisite effects and genetic linkage at a locus. Guided by the hierarchical structure of linkage disequilibrium, TreeMap performs an organized search for individual and multiple causal variants. Via extensive simulations, we show that TreeMap detects co-regulating variants more accurately than current methods. Furthermore, its high computational efficiency enables genome-wide analysis of long-range eQTL. We applied TreeMap to GTEx data of brain hippocampus samples and transverse colon samples to search for eQTL in gene bodies and in 4 Mbps gene-flanking regions, discovering numerous distal eQTL. Furthermore, we found concordant distal eQTL that were present in both brain and colon samples, implying long-range regulation of gene expression. Availability and implementation TreeMap is available as an R package enabled for parallel processing at https://github.com/liliulab/treemap. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Li Liu
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA; Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Pramod Chandrashekar
- College of Health Solutions, Arizona State University, Phoenix, AZ 85004, USA; Center for Personalized Diagnostics, Biodesign Institute, Arizona State University, Tempe, AZ 85281, USA
- Biao Zeng
- Center for Integrative Genomics, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Maxwell D Sanderford
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA; Department of Biology, Temple University, Philadelphia, PA 19122, USA; Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Greg Gibson
- Center for Integrative Genomics, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
11
|
Abdallah AM, Abu-Madi M. The Genetic Control of the Rheumatic Heart: Closing the Genotype-Phenotype Gap. Front Med (Lausanne) 2021; 8:611036. [PMID: 33842495 PMCID: PMC8024521 DOI: 10.3389/fmed.2021.611036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2020] [Accepted: 01/07/2021] [Indexed: 12/20/2022]
Abstract
Rheumatic heart disease (RHD) is a heritable inflammatory condition characterized by carditis, arthritis, and systemic disease. Although RHD remains neglected, the last 3 years have seen some promising advances in RHD research. Whilst it is clear that RHD can be triggered by recurrent group A streptococcal infections, the mechanisms driving clinical progression are still poorly understood. This review summarizes our current understanding of the genetics implicated in this process and the genetic determinants that predispose some people to RHD. The evidence demonstrating the importance of individual cell types and cellular states in delineating causal genetic variants is discussed, highlighting phenotype/genotype correlations where possible. Genetic fine mapping and functional studies in extreme phenotypes, together with large-scale omics studies including genomics, transcriptomics, epigenomics, and metabolomics, are expected to provide new information not only on RHD but also on the mechanisms of other autoimmune diseases and to facilitate future clinical translation.
Affiliation(s)
- Atiyeh M Abdallah
- Biomedical and Pharmaceutical Research Unit, Department of Biomedical Sciences, College of Health Sciences, QU Health, Qatar University, Doha, Qatar
- Marawan Abu-Madi
- Biomedical and Pharmaceutical Research Unit, Department of Biomedical Sciences, College of Health Sciences, QU Health, Qatar University, Doha, Qatar
|
12
|
Zhou XJ, Tsoi LC, Hu Y, Patrick MT, He K, Berthier CC, Li Y, Wang YN, Qi YY, Zhang YM, Gan T, Li Y, Hou P, Liu LJ, Shi SF, Lv JC, Xu HJ, Zhang H. Exome Chip Analyses and Genetic Risk for IgA Nephropathy among Han Chinese. Clin J Am Soc Nephrol 2021; 16:213-224. [PMID: 33462083 PMCID: PMC7863642 DOI: 10.2215/cjn.06910520] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/07/2020] [Accepted: 12/11/2020] [Indexed: 02/04/2023]
Abstract
BACKGROUND AND OBJECTIVES IgA nephropathy is the most common form of primary GN worldwide. The evidence of geographic and ethnic differences, as well as familial aggregation of the disease, supports a strong genetic contribution to IgA nephropathy. Evidence for genetic factors in IgA nephropathy also comes from genome-wide association patient-control studies. However, few studies have systematically evaluated the contribution of coding variation in IgA nephropathy. DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS We performed a two-stage exome chip-based association study in 13,242 samples, including 3363 patients with IgA nephropathy and 9879 healthy controls of Han Chinese ancestry. Common variant functional annotation, gene-based low-frequency variant analysis, differential mRNA expression, and gene network integration were also explored. RESULTS We identified three non-HLA gene regions (FBXL21, CCR6, and STAT3) and one HLA gene region (GABBR1) with suggestive significance (Pmeta < 5×10⁻⁵) in single-variant associations. These novel non-HLA variants were annotated as expression-associated single-nucleotide polymorphisms and were located in enhancer regions enriched in the histone mark H3K4me1 in primary B cells. Gene-based low-frequency variant analysis suggests CFB as another potential susceptibility gene. Further combined expression and network integration suggested that the five novel susceptibility genes, TGFBI, CCR6, STAT3, GABBR1, and CFB, were involved in IgA nephropathy. CONCLUSIONS Five novel gene regions with suggestive significance for IgA nephropathy were identified, shedding new light on the disease for further mechanistic investigation.
Affiliation(s)
- Xu-jie Zhou
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Lam C. Tsoi
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Michigan
- Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Yong Hu
- Beijing Institute of Biotechnology, Beijing, People’s Republic of China
- Matthew T. Patrick
- Department of Dermatology, University of Michigan Medical School, Ann Arbor, Michigan
- Kevin He
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Kidney Epidemiology and Cost Center, School of Public Health, University of Michigan, Ann Arbor, Michigan
- Celine C. Berthier
- Division of Nephrology, Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan
- Yanming Li
- Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, Kansas
- Yan-na Wang
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Yuan-yuan Qi
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Yue-miao Zhang
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Ting Gan
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Yang Li
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Ping Hou
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Li-jun Liu
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Su-fang Shi
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Ji-cheng Lv
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
- Hu-ji Xu
- Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, Kansas
- Department of Rheumatology and Immunology, Shanghai Changzheng Hospital, The Second Military Medical University, Shanghai, People’s Republic of China
- Hong Zhang
- Renal Division, Peking University First Hospital, Peking University Institute of Nephrology, Beijing, People’s Republic of China
- Key Laboratory of Renal Disease, Ministry of Health of China, Beijing, People’s Republic of China
- Key Laboratory of Chronic Kidney Disease Prevention and Treatment (Peking University), Ministry of Education, Beijing, People’s Republic of China
- Research Units of Diagnosis and Treatment of Immune-Mediated Kidney Diseases, Chinese Academy of Medical Sciences, Beijing, People’s Republic of China
|
13
|
Umans BD, Battle A, Gilad Y. Where Are the Disease-Associated eQTLs? Trends Genet 2021; 37:109-124. [PMID: 32912663 PMCID: PMC8162831 DOI: 10.1016/j.tig.2020.08.009] [Citation(s) in RCA: 151] [Impact Index Per Article: 50.3] [Received: 05/28/2020] [Revised: 08/07/2020] [Accepted: 08/14/2020] [Indexed: 02/07/2023]
Abstract
Most disease-associated variants, although located in putatively regulatory regions, do not have detectable effects on gene expression. One explanation could be that we have not examined gene expression in the cell types or conditions that are most relevant for disease. Even large-scale efforts to study gene expression across tissues are limited to human samples obtained opportunistically or postmortem, mostly from adults. In this review we evaluate recent findings and suggest an alternative strategy, drawing on the dynamic and highly context-specific nature of gene regulation. We discuss new technologies that can extend the standard regulatory mapping framework to more diverse, disease-relevant cell types and states.
Affiliation(s)
- Benjamin D Umans
- Department of Medicine, University of Chicago, Chicago, IL, USA
- Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Yoav Gilad
- Department of Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
|
14
|
Li JK, Li L, Li W, Wang Z, Gao F, Hu FY, Zhang S, Qu SF, Huang J, Wang LS, Wu JH, Chen F. Panel-based targeted exome sequencing reveals novel candidate susceptibility loci for age-related cataracts in Chinese Cohort. Mol Genet Genomic Med 2020; 8:e1218. [PMID: 32337810 PMCID: PMC7336732 DOI: 10.1002/mgg3.1218] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/24/2019] [Revised: 02/05/2020] [Accepted: 02/25/2020] [Indexed: 01/19/2023]
Abstract
BACKGROUND Age-related cataract (ARC) is the most common blinding eye disease worldwide, and its incidence is shifting toward younger ages. However, the relationship between genetic factors and disease mechanisms is not fully understood. The aim of the study was to further clarify the relationship between ARC and genetic mechanisms in East Asian populations and to elucidate the pathogenesis. METHODS The study enrolled 191 patients with sporadic cataracts and 208 healthy controls from the eastern provinces of China, with an average age of about 60 years. All participants underwent a comprehensive ophthalmic clinical examination; peripheral blood samples were collected and genomic DNA was extracted. Mutations were screened among 792 candidate genes through targeted capture and high-throughput sequencing to enhance understanding of the disease. RESULTS We identified novel candidate susceptibility genes, which may serve as potential susceptibility factors leading to an increase in the incidence of age-related cataracts. Three novel loci were significantly associated with age-related cataracts: rs129882 in DBH (p = 5.27E-07, odds ratio = 3.9), rs1800280 in DMD (p = 2.85E-06, odds ratio = 1.4) and rs2871776 in ATP13A2 (p = 4.18E-05, odds ratio = 0.04). Gene-gene interaction analysis revealed that the most significant interactions were between DBH and TUB (rs17847537 in TUB, rs129882 in DBH, p-value = 2.12E-14) and between DBH and DMD (rs1800280 in DMD, rs129882 in DBH, p-value = 2.12E-14). Pathway analysis showed that the most significant processes were concentrated in response to light stimulation (adjusted p-value = 5.56E-03), response to radiation (adjusted p-value = 5.56E-03), and response to abiotic stimulus (adjusted p-value = 5.56E-03). eQTL analysis showed that DBH rs129882 could regulate the expression of DBH mRNA in various tissues, including retina. CONCLUSION Our study indicates that the rs129882 and rs1800280 loci are associated with age-related cataracts, which expands the gene map of age-related cataracts.
Affiliation(s)
- Jian-Kang Li
- Dept of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- BGI‐Shenzhen, Shenzhen, China
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shanghai, China
- Li‐Li Li
- National Institutes of Food and Drug Control (NIFDC), Beijing, P. R. China
- Wei Li
- BGI‐Shenzhen, Shenzhen, China
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shanghai, China
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, China
- Zi‐Wei Wang
- BGI‐Shenzhen, Shenzhen, China
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, China
- Feng‐Juan Gao
- Eye Institute, Eye and ENT Hospital, College of Medicine, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, Science and Technology Commission of Shanghai Municipality, Shanghai, China
- Key Laboratory of Myopia, Ministry of Health, Shanghai, China
- Fang-Yuan Hu
- Eye Institute, Eye and ENT Hospital, College of Medicine, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, Science and Technology Commission of Shanghai Municipality, Shanghai, China
- Key Laboratory of Myopia, Ministry of Health, Shanghai, China
- Sheng‐Hai Zhang
- Eye Institute, Eye and ENT Hospital, College of Medicine, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, Science and Technology Commission of Shanghai Municipality, Shanghai, China
- Key Laboratory of Myopia, Ministry of Health, Shanghai, China
- Shou-Fang Qu
- National Institutes of Food and Drug Control (NIFDC), Beijing, P. R. China
- Jie Huang
- National Institutes of Food and Drug Control (NIFDC), Beijing, P. R. China
- Lu-Sheng Wang
- Dept of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- BGI‐Shenzhen, Shenzhen, China
- Ji-Hong Wu
- Eye Institute, Eye and ENT Hospital, College of Medicine, Fudan University, Shanghai, China
- Shanghai Key Laboratory of Visual Impairment and Restoration, Science and Technology Commission of Shanghai Municipality, Shanghai, China
- Key Laboratory of Myopia, Ministry of Health, Shanghai, China
- Fang Chen
- BGI‐Shenzhen, Shenzhen, China
- Guangdong Provincial Key Laboratory of Human Disease Genomics, Shenzhen Key Laboratory of Genomics, BGI-Shenzhen, Shanghai, China
|
15
|
Tian R, Pan Y, Etheridge THA, Deshmukh H, Gulick D, Gibson G, Bao G, Lee CM. Pitfalls in Single Clone CRISPR-Cas9 Mutagenesis to Fine-map Regulatory Intervals. Genes (Basel) 2020; 11:E504. [PMID: 32375333 PMCID: PMC7288657 DOI: 10.3390/genes11050504] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/19/2020] [Revised: 04/15/2020] [Accepted: 04/22/2020] [Indexed: 12/11/2022]
Abstract
The majority of genetic variants affecting complex traits map to regulatory regions of genes, and typically lie in credible intervals of 100 or more SNPs. Fine mapping of the causal variant(s) at a locus depends on assays that are able to discriminate the effects of polymorphisms or mutations on gene expression. Here, we evaluated a moderate-throughput CRISPR-Cas9 mutagenesis approach, based on replicated measurement of transcript abundance in single-cell clones, by deleting candidate regulatory SNPs, affecting four genes known to be affected by large-effect expression Quantitative Trait Loci (eQTL) in leukocytes, and using Fluidigm qRT-PCR to monitor gene expression in HL60 pro-myeloid human cells. We concluded that there were multiple constraints that rendered the approach generally infeasible for fine mapping. These included the non-targetability of many regulatory SNPs, clonal variability of single-cell derivatives, and expense. Power calculations based on the measured variance attributable to major sources of experimental error indicated that typical eQTL explaining 10% of the variation in expression of a gene would usually require at least eight biological replicates of each clone. Scanning across credible intervals with this approach is not recommended.
Affiliation(s)
- Ruoyu Tian
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Yidan Pan
- Systems, Synthetic, and Physical Biology, Rice University, Houston, TX 77005, USA
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
- Thomas H. A. Etheridge
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
- Harshavardhan Deshmukh
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
- Dalia Gulick
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Greg Gibson
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Gang Bao
- Systems, Synthetic, and Physical Biology, Rice University, Houston, TX 77005, USA
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
- Ciaran M Lee
- APC Microbiome Ireland, University College Cork, Cork T12 YN60, Ireland
|
16
|
Pan Y, Tian R, Lee C, Bao G, Gibson G. Fine-mapping within eQTL credible intervals by expression CROP-seq. Biol Methods Protoc 2020; 5:bpaa008. [PMID: 32665975 PMCID: PMC7334875 DOI: 10.1093/biomethods/bpaa008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/27/2019] [Revised: 03/06/2020] [Accepted: 03/26/2020] [Indexed: 01/02/2023]
Abstract
The majority of genome-wide association study (GWAS)-identified SNPs are located in noncoding regions of genes and are likely to influence disease risk and phenotypes by affecting gene expression. Since credible intervals responsible for genome-wide associations typically consist of ≥100 variants with similar statistical support, experimental methods are needed to fine map causal variants. We report here a moderate-throughput approach to identifying regulatory GWAS variants, expression CROP-seq, which consists of multiplex CRISPR-Cas9 genome editing combined with single-cell RNAseq to measure perturbation in transcript abundance. Mutations were induced in the HL60/S4 myeloid cell line near 57 SNPs in three genes. Two of these SNPs, rs2251039 and rs35675666, significantly altered CISD1 and PARK7 expression, respectively, with strong replication and validation in single-cell clones. The sites overlap with chromatin accessibility peaks and define causal variants for inflammatory bowel disease at the two loci. This relatively inexpensive approach should be scalable for broad surveys and is also implementable for the fine mapping of individual genes.
Affiliation(s)
- Yidan Pan
- Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Ruoyu Tian
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA, USA
- Ciaran Lee
- APC Microbiome Ireland, University College, Cork, Ireland
- Gang Bao
- Systems, Synthetic, and Physical Biology, Rice University, Houston, TX, USA
- Department of Bioengineering, Rice University, Houston, TX, USA
- Greg Gibson
- Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, GA, USA
|
17
|
Richardson TG, Hemani G, Gaunt TR, Relton CL, Davey Smith G. A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome. Nat Commun 2020; 11:185. [PMID: 31924771 PMCID: PMC6954187 DOI: 10.1038/s41467-019-13921-9] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 06/24/2019] [Accepted: 11/26/2019] [Indexed: 11/09/2022]
Abstract
Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. In this study, we apply the principles of Mendelian randomization to systematically evaluate transcriptome-wide associations between gene expression (across 48 different tissue types) and 395 complex traits. Our findings indicate that variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. Moreover, detailed investigations of our results highlight tissue-specific associations, drug validation opportunities, insight into the likely causal pathways for trait-associated variants and also implicate putative associations at loci yet to be implicated in disease susceptibility. Similar evaluations can be conducted at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.
Affiliation(s)
- Tom G Richardson
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Gibran Hemani
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Tom R Gaunt
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- Caroline L Relton
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
- George Davey Smith
- MRC Integrative Epidemiology Unit (IEU), Population Health Sciences, Bristol Medical School, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN, UK
|