1
Malakhov MM, Dai B, Shen XT, Pan W. A bootstrap model comparison test for identifying genes with context-specific patterns of genetic regulation. Ann Appl Stat 2024; 18:1840-1857. [PMID: 39421855 PMCID: PMC11484521 DOI: 10.1214/23-aoas1859] [Indexed: 10/19/2024]
Abstract
Understanding how genetic variation affects gene expression is essential for a complete picture of the functional pathways that give rise to complex traits. Although numerous studies have established that many genes are differentially expressed in distinct human tissues and cell types, no tools exist for identifying the genes whose expression is differentially regulated. Here we introduce DRAB (differential regulation analysis by bootstrapping), a gene-based method for testing whether patterns of genetic regulation are significantly different between tissues or other biological contexts. DRAB first leverages the elastic net to learn context-specific models of local genetic regulation and then applies a novel bootstrap-based model comparison test to check their equivalency. Unlike previous model comparison tests, our proposed approach can determine whether population-level models have equal predictive performance by accounting for the variability of feature selection and model training. We validated DRAB on mRNA expression data from a variety of human tissues in the Genotype-Tissue Expression (GTEx) Project. DRAB yielded biologically reasonable results and had sufficient power to detect genes with tissue-specific regulatory profiles while effectively controlling false positives. By providing a framework that facilitates the prioritization of differentially regulated genes, our study enables future discoveries on the genetic architecture of molecular phenotypes.
Affiliation(s)
- Ben Dai
  - Department of Statistics, The Chinese University of Hong Kong
- Wei Pan
  - Division of Biostatistics and Health Data Science, University of Minnesota
2
Hassan NE, El-Masry SA, El Shebini SM, Ahmed NH, Mehanna NS, Abdel Wahed MM, Amine D, Hashish A, Selim M, Afify MAS, Alian K. Effect of weight loss program using prebiotics and probiotics on body composition, physique, and metabolic products: longitudinal intervention study. Sci Rep 2024; 14:10960. [PMID: 38744950 PMCID: PMC11094057 DOI: 10.1038/s41598-024-61130-2] [Received: 02/23/2023] [Accepted: 05/02/2024] [Indexed: 05/16/2024] Open
Abstract
The relationship between gut microbiota and obesity has recently become an important research subject, as the gut microbiota is thought to affect body homeostasis, including body weight and composition; intervening with probiotics and prebiotics is therefore a possible approach to obesity management. This study aimed to evaluate whether a hypocaloric, adequate-fiber regimen with probiotic supplementation and physical exercise has a beneficial impact on health, body composition, and physique among obese Egyptian women. In this longitudinal follow-up intervention study, the 58 enrolled women followed a weight loss eating regimen (prebiotic) consisting of a low-carbohydrate, adequate-fiber, adequate-protein dietary pattern with decreased energy intake. They additionally received daily probiotic supplements in the form of yogurt and were instructed to exercise regularly for 3 months. Anthropometric measurements, body composition, laboratory investigations, and microbiota analysis were obtained before and after the 3-month weight loss program. Statistically highly significant differences were found in the anthropometric and body composition parameters and in obesity-related biomarkers (leptin, ALT, and AST) between the pre- and post-follow-up measurements: all had decreased by the end of the study. The prebiotic and probiotic supplementation induced statistically highly significant alterations in the composition of the gut microbiota, with increased relative abundance of Lactobacillus, Bifidobacteria, and Bacteroidetes and decreased relative abundance of Firmicutes and a lower Firmicutes/Bacteroidetes ratio. A hypocaloric, adequate-fiber diet with probiotics positively impacts body composition and is effective for weight loss, normalizing serum leptin and AST.
Affiliation(s)
- Nayera E Hassan
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Sahar A El-Masry
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Salwa M El Shebini
  - Nutrition and Food Science Department, Nutrition and Food Science Institute, National Research Centre, Giza, Egypt
- Nihad H Ahmed
  - Nutrition and Food Science Department, Nutrition and Food Science Institute, National Research Centre, Giza, Egypt
- Nayra Sh Mehanna
  - Dairy Science Department, Nutrition and Food Science Institute, National Research Centre, Giza, Egypt
- Mai Magdy Abdel Wahed
  - Clinical and Chemical Pathology Department, Medical Research and Clinical Studies Institute, National Research Centre, Giza, Egypt
- Darine Amine
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Adel Hashish
  - Children with Special Needs Department, Medical Research and Clinical Studies Institute, National Research Centre, Giza, Egypt
- Mohamed Selim
  - Researches and Applications of Complementary Medicine Department, Medical Research and Clinical Studies Institute, National Research Centre, Giza, Egypt
- Mahmoud A S Afify
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Khadija Alian
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
3
Vetr NG, Gay NR, Montgomery SB. The impact of exercise on gene regulation in association with complex trait genetics. Nat Commun 2024; 15:3346. [PMID: 38693125 PMCID: PMC11063075 DOI: 10.1038/s41467-024-45966-w] [Received: 02/01/2023] [Accepted: 02/01/2024] [Indexed: 05/03/2024] Open
Abstract
Endurance exercise training is known to reduce risk for a range of complex diseases. However, the molecular basis of this effect has been challenging to study and largely restricted to analyses of either few or easily biopsied tissues. Extensive transcriptome data collected across 15 tissues during exercise training in rats as part of the Molecular Transducers of Physical Activity Consortium has provided a unique opportunity to clarify how exercise can affect tissue-specific gene expression and further suggest how exercise adaptation may impact complex disease-associated genes. To build this map, we integrate this multi-tissue atlas of gene expression changes with gene-disease targets, genetic regulation of expression, and trait relationship data in humans. Consensus from multiple approaches prioritizes specific tissues and genes where endurance exercise impacts disease-relevant gene expression. Specifically, we identify a total of 5523 trait-tissue-gene triplets to serve as a valuable starting point for future investigations [Exercise; Transcription; Human Phenotypic Variation].
4
Mews MA, Naj AC, Griswold AJ, Below JE, Bush WS. Brain and Blood Transcriptome-Wide Association Studies Identify Five Novel Genes Associated with Alzheimer's Disease. medRxiv 2024:2024.04.17.24305737. [PMID: 38699333 PMCID: PMC11065015 DOI: 10.1101/2024.04.17.24305737] [Indexed: 05/05/2024]
Abstract
INTRODUCTION Transcriptome-wide Association Studies (TWAS) extend genome-wide association studies (GWAS) by integrating genetically-regulated gene expression models. We performed the most powerful AD-TWAS to date, using summary statistics from cis-eQTL meta-analyses and the largest clinically-adjudicated Alzheimer's Disease (AD) GWAS. METHODS We implemented the OTTERS TWAS pipeline, leveraging cis-eQTL data from cortical brain tissue (MetaBrain; N=2,683) and blood (eQTLGen; N=31,684) to predict gene expression, then applied these models to AD-GWAS data (Cases=21,982; Controls=44,944). RESULTS We identified and validated five novel gene associations in cortical brain tissue (PRKAG1, C3orf62, LYSMD4, ZNF439, SLC11A2) and six genes proximal to known AD-related GWAS loci (Blood: MYBPC3; Brain: MTCH2, CYB561, MADD, PSMA5, ANXA11). Further, using causal eQTL fine-mapping, we generated sparse models that retained the strength of the AD-TWAS association for MTCH2, MADD, ZNF439, CYB561, and MYBPC3. DISCUSSION Our comprehensive AD-TWAS discovered new gene associations and provided insights into the functional relevance of previously associated variants.
5
Hassan NE, El-Masry SA, El Shebini SM, Ahmed NH, Mohamed T F, Mostafa MI, Afify MAS, Kamal AN, Badie MM, Hashish A, Alian K. Gut dysbiosis is linked to metabolic syndrome in obese Egyptian women: potential treatment by probiotics and high fiber diets regimen. Sci Rep 2024; 14:5464. [PMID: 38443406 PMCID: PMC10914807 DOI: 10.1038/s41598-024-54285-5] [Received: 09/03/2023] [Accepted: 02/10/2024] [Indexed: 03/07/2024] Open
Abstract
Metabolic syndrome (MetS) is defined as a cluster of glucose intolerance, hypertension, dyslipidemia, and central obesity with insulin resistance. The role of gut microbiota in metabolic disorders is increasingly considered. This study aimed to investigate the effects of probiotic supplements and a hypocaloric high-fiber regimen on MetS in obese Egyptian women. A longitudinal follow-up intervention study included 58 obese Egyptian women with a mean age of 41.62 ± 10.70 years. They were divided according to the MetS criteria into 2 groups: 23 obese women with MetS and 35 without MetS. They followed a hypocaloric high-fiber weight loss regimen with light physical exercise and received a probiotic supplement daily for 3 months. For each participating woman, blood pressure, anthropometric measurements, basal metabolic rate (BMR), dietary recalls, laboratory investigations, and microbiota analysis were acquired before and after 3 months of follow-up. After the intervention with the probiotic, the hypocaloric high-fiber regimen, and light exercise, reductions ranging from numerical to statistically significant were reported in the anthropometric parameters, blood pressure, and BMR. All the biochemical parameters that characterize MetS decreased significantly (p ≤ 0.05-0.01). Before the intervention, results revealed an abundance of Bacteroidetes over Firmicutes, with a low Firmicutes/Bacteroidetes ratio. After the intervention, Log Lactobacillus, Log Bifidobacteria, and Log Bacteroidetes increased significantly in both groups, while Log Firmicutes and the Firmicutes/Bacteroidetes ratio decreased significantly. In conclusion, this study's results highlight a positive trend of probiotic supplementation with hypocaloric high-fiber diets in ameliorating the MetS criteria in obese Egyptian women.
Affiliation(s)
- Nayera E Hassan
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Sahar A El-Masry
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Salwa M El Shebini
  - Nutrition and Food Science Department, Food and Industries and Nutrition Research Institute, National Research Centre, Giza, Egypt
- Nihad H Ahmed
  - Nutrition and Food Science Department, Food and Industries and Nutrition Research Institute, National Research Centre, Giza, Egypt
- Fouad Mohamed T
  - Food and Dairy Microbiology Department, Food and Industries and Nutrition Research Institute, National Research Centre, Giza, Egypt
- Mohammed I Mostafa
  - Clinical Pathology Department, Medical Research and Clinical Studies Institute, National Research Centre, Giza, Egypt
- Mahmoud A S Afify
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Ayat N Kamal
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Mai M Badie
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
- Adel Hashish
  - Children With Special Needs Department, Medical Research and Clinical Studies Institute, National Research Centre, Giza, Egypt
- Khadija Alian
  - Biological Anthropology Department, Medical Research and Clinical Studies Institute, National Research Centre, 33 El-Buhouth St., Dokki, Giza, 12622, Egypt
6
Zhang X, Gomez L, Below JE, Naj AC, Martin ER, Kunkle BW, Bush WS. An X Chromosome Transcriptome Wide Association Study Implicates ARMCX6 in Alzheimer's Disease. J Alzheimers Dis 2024; 98:1053-1067. [PMID: 38489177 DOI: 10.3233/jad-231075] [Indexed: 03/17/2024]
Abstract
Background The X chromosome is often omitted in disease association studies despite containing thousands of genes that may provide insight into well-known sex differences in the risk of Alzheimer's disease (AD). Objective To model the expression of X chromosome genes and evaluate their impact on AD risk in a sex-stratified manner. Methods Using elastic net, we evaluated multiple modeling strategies in a set of 175 whole blood samples and 126 brain cortex samples, with whole genome sequencing and RNA-seq data. SNPs (MAF > 0.05) within the cis-regulatory window were used to train tissue-specific models of each gene. We apply the best models in both tissues to sex-stratified summary statistics from a meta-analysis of Alzheimer's Disease Genetics Consortium (ADGC) studies to identify AD-related genes on the X chromosome. Results Across different model parameters, sample sex, and tissue types, we modeled the expression of 217 genes (95 genes in blood and 135 genes in brain cortex). The average model R2 was 0.12 (range from 0.03 to 0.34). We also compared sex-stratified and sex-combined models on the X chromosome. We further investigated genes that escaped X chromosome inactivation (XCI) to determine if their genetic regulation patterns were distinct. We found ten genes associated with AD at p < 0.05, with only ARMCX6 in female brain cortex (p = 0.008) nearing the significance threshold after adjusting for multiple testing (α = 0.002). Conclusions We optimized the expression prediction of X chromosome genes, applied these models to sex-stratified AD GWAS summary statistics, and identified one putative AD risk gene, ARMCX6.
Affiliation(s)
- Xueyi Zhang
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
- Lissette Gomez
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
- Jennifer E Below
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- Adam C Naj
  - Department of Biostatistics, Epidemiology, and Informatics, Penn Neurodegeneration Genomics Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Eden R Martin
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
- Brian W Kunkle
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, USA
- William S Bush
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
7
Lopera-Maya EA, Li S, de Brouwer R, Nolte IM, van Breen J, Jongbloed JDH, Swertz MA, Snieder H, Franke L, Wijmenga C, de Boer RA, Deelen P, van der Zwaag PA, Sanna S. Phenotypic and Genetic Factors Associated with Absence of Cardiomyopathy Symptoms in PLN:c.40_42delAGA Carriers. J Cardiovasc Transl Res 2023; 16:1251-1266. [PMID: 36622581 PMCID: PMC10721704 DOI: 10.1007/s12265-022-10347-5] [Received: 07/04/2022] [Accepted: 12/14/2022] [Indexed: 01/10/2023]
Abstract
The c.40_42delAGA variant in the phospholamban gene (PLN) has been associated with dilated and arrhythmogenic cardiomyopathy, with up to 70% of carriers experiencing a major cardiac event by age 70. However, there are carriers who remain asymptomatic at older ages. To understand the mechanisms behind this incomplete penetrance, we evaluated potential phenotypic and genetic modifiers in 74 PLN:c.40_42delAGA carriers identified in 36,339 participants of the Lifelines population cohort. Asymptomatic carriers (N = 48) showed shorter QRS duration (−5.73 ms, q value = 0.001) compared to asymptomatic non-carriers, an effect we could replicate in two different independent cohorts. Furthermore, symptomatic carriers showed a higher correlation (r_Pearson = 0.17) between polygenic predisposition to higher QRS (PGS_QRS) and QRS (p value = 1.98 × 10⁻⁸), suggesting that the effect of the genetic variation on cardiac rhythm might be increased in symptomatic carriers. Our results allow for improved clinical interpretation for asymptomatic carriers, while our approach could guide future studies on genetic diseases with incomplete penetrance.
Affiliation(s)
- Esteban A Lopera-Maya
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Shuang Li
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Remco de Brouwer
  - Department of Cardiology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Ilja M Nolte
  - Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Justin van Breen
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Jan D H Jongbloed
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Morris A Swertz
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
  - Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Harold Snieder
  - Department of Epidemiology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Lude Franke
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Cisca Wijmenga
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Rudolf A de Boer
  - Department of Cardiology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Patrick Deelen
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
  - Oncode Institute, Utrecht, Netherlands
- Paul A van der Zwaag
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
- Serena Sanna
  - Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
  - Institute for Genetic and Biomedical Research (IRGB), National Research Council (CNR), Cagliari, Italy
8
Malakhov MM, Dai B, Shen XT, Pan W. A bootstrap model comparison test for identifying genes with context-specific patterns of genetic regulation. bioRxiv 2023:2023.03.06.531446. [PMID: 36945657 PMCID: PMC10028853 DOI: 10.1101/2023.03.06.531446] [Indexed: 03/09/2023]
Abstract
Understanding how genetic variation affects gene expression is essential for a complete picture of the functional pathways that give rise to complex traits. Although numerous studies have established that many genes are differentially expressed in distinct human tissues and cell types, no tools exist for identifying the genes whose expression is differentially regulated. Here we introduce DRAB (Differential Regulation Analysis by Bootstrapping), a gene-based method for testing whether patterns of genetic regulation are significantly different between tissues or other biological contexts. DRAB first leverages the elastic net to learn context-specific models of local genetic regulation and then applies a novel bootstrap-based model comparison test to check their equivalency. Unlike previous model comparison tests, our proposed approach can determine whether population-level models have equal predictive performance by accounting for the variability of feature selection and model training. We validated DRAB on mRNA expression data from a variety of human tissues in the Genotype-Tissue Expression (GTEx) Project. DRAB yielded biologically reasonable results and had sufficient power to detect genes with tissue-specific regulatory profiles while effectively controlling false positives. By providing a framework that facilitates the prioritization of differentially regulated genes, our study enables future discoveries on the genetic architecture of molecular phenotypes.
Affiliation(s)
- Ben Dai
  - Department of Statistics, The Chinese University of Hong Kong
- Wei Pan
  - Division of Biostatistics, University of Minnesota
9
Liang Y, Nyasimi F, Im HK. On the problem of inflation in transcriptome-wide association studies. bioRxiv 2023:2023.10.17.562831. [PMID: 37904952 PMCID: PMC10614931 DOI: 10.1101/2023.10.17.562831] [Indexed: 11/01/2023]
Abstract
Hundreds of thousands of loci have been associated with complex traits via genome-wide association studies (GWAS), but an understanding of the mechanistic connection between GWAS loci and disease remains elusive. Genetic predictors of molecular traits are useful for identifying the mediating roles of molecular traits and prioritizing actionable targets for intervention, as demonstrated in transcriptome-wide association studies (TWAS) and related studies. Given the widespread polygenicity of complex traits, it is imperative to understand the effect of polygenicity on the validity of these mediator-trait association tests. We found that for highly polygenic target traits, the standard test based on linear regression is inflated ($\mathbb{E}[\chi^2_{\mathrm{TWAS}}] > 1$). This inflation has implications for all TWAS and related methods where the complex trait can be highly polygenic, even if the mediating trait is sparse. We derive an asymptotic expression of the inflation, estimate the inflation for gene expression, metabolites, and brain image derived features, and propose a solution to correct the inflation.
Affiliation(s)
- Yanyu Liang
  - Section of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
- Festus Nyasimi
  - Section of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
- Hae Kyung Im
  - Section of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
  - Computing Environment and Life Sciences Directorate, Argonne National Laboratory, Argonne, Illinois, United States of America
10
Zhang X, Gomez L, Below J, Naj A, Martin E, Kunkle B, Bush WS. An X Chromosome Transcriptome Wide Association Study Implicates ARMCX6 in Alzheimer's Disease. bioRxiv 2023:2023.06.06.543877. [PMID: 37333116 PMCID: PMC10274627 DOI: 10.1101/2023.06.06.543877] [Indexed: 06/20/2023]
Abstract
Background The X chromosome is often omitted in disease association studies despite containing thousands of genes which may provide insight into well-known sex differences in the risk of Alzheimer's Disease. Objective To model the expression of X chromosome genes and evaluate their impact on Alzheimer's Disease risk in a sex-stratified manner. Methods Using elastic net, we evaluated multiple modeling strategies in a set of 175 whole blood samples and 126 brain cortex samples, with whole genome sequencing and RNA-seq data. SNPs (MAF > 0.05) within the cis-regulatory window were used to train tissue-specific models of each gene. We apply the best models in both tissues to sex-stratified summary statistics from a meta-analysis of Alzheimer's Disease Genetics Consortium (ADGC) studies to identify AD-related genes on the X chromosome. Results Across different model parameters, sample sex, and tissue types, we modeled the expression of 217 genes (95 genes in blood and 135 genes in brain cortex). The average model R² was 0.12 (range from 0.03 to 0.34). We also compared sex-stratified and sex-combined models on the X chromosome. We further investigated genes that escaped X chromosome inactivation (XCI) to determine if their genetic regulation patterns were distinct. We found ten genes associated with AD at p < 0.05, with only ARMCX6 in female brain cortex (p = 0.008) nearing the significance threshold after adjusting for multiple testing (α = 0.002). Conclusions We optimized the expression prediction of X chromosome genes, applied these models to sex-stratified AD GWAS summary statistics, and identified one putative AD risk gene, ARMCX6.
Affiliation(s)
- Xueyi Zhang
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, USA
- Lissette Gomez
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL, 33136, USA
- Jennifer Below
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, 37235, USA
- Adam Naj
  - Department of Biostatistics, Epidemiology, and Informatics, Penn Neurodegeneration Genomics Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, 19104, USA
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, 19104, USA
- Eden Martin
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, 33176, USA
- Brian Kunkle
  - John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, 33176, USA
- William S Bush
  - Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, 44106, USA
11
Duan H, Xue Z, Ju X, Yang L, Gao J, Sun L, Xu S, Li J, Xiong X, Sun Y, Wang Y, Zhang X, Ding D, Zhang X, Tang J. The genetic architecture of prolificacy in maize revealed by association mapping and bulk segregant analysis. TAG. Theoretical and Applied Genetics. Theoretische und angewandte Genetik 2023; 136:182. [PMID: 37555969 DOI: 10.1007/s00122-023-04434-7] [Received: 05/15/2023] [Accepted: 06/26/2023] [Indexed: 08/10/2023]
Abstract
KEY MESSAGE Here, we revealed that maize prolificacy is highly correlated with domestication and, using multiple mapping methods, identified a causal gene, ZmEN1, located in one novel QTL, qGEN261, that regulates maize prolificacy. The development of maize prolificacy (EN) is crucial for enhancing yield and breeding specialty varieties. To achieve this goal, we employed a genome-wide association study (GWAS) to analyze the genetic architecture of EN in maize. Using 492 inbred lines with a wide range of EN variability, our results demonstrated significant differences in genetic, environmental, and interaction effects. The broad-sense heritability (H²) of EN was 0.60. Through GWAS, we identified 527 significant single nucleotide polymorphisms (SNPs), involving 290 quantitative trait loci (QTL) and 806 genes. Of these SNPs, 18 and 509 were classified as major effect loci and minor loci, respectively. In addition, we performed a bulk segregant analysis (BSA) in an F2 population constructed from a few-ears line, Zheng58, and a multi-ears line, 647. Our BSA results identified one significant QTL, qBEN1. Importantly, combining the GWAS and BSA, four co-located QTL, involving six genes, were identified. Three of them were expressed in vegetative meristem, shoot tip, internode, and tip of ear primordium, with ZmEN1, which encodes an unknown auxin-like protein, having the highest expression level in these tissues. This suggests that ZmEN1 plays a crucial role in promoting axillary buds and tillering to encourage the formation of prolificacy. Haplotype analysis of ZmEN1 revealed significant differences between haplotypes, with inbred lines carrying hap6 having more EN. Overall, this is the first report of using GWAS and BSA to dissect the genetic architecture of EN in maize, which can be valuable for breeding specialty maize varieties and improving maize yield.
Affiliation(s)
- Haiyang Duan
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Zhengjie Xue
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Xiaolong Ju
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Lu Yang
  - State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng, People's Republic of China
- Jionghao Gao
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Li Sun
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Shuhao Xu
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Jianxin Li
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Xuehang Xiong
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Yan Sun
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
- Yan Wang
  - Zhucheng Mingjue Tender Company Limited, Weifang, People's Republic of China
- Xuebin Zhang
  - State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng, People's Republic of China
- Dong Ding
  - National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China
| | - Xuehai Zhang
- National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China.
| | - Jihua Tang
- National Key Laboratory of Wheat and Maize Crop Science, Department of Agronomy, College of Agronomy, Henan Agricultural University, No. 218 Ping'an Avenue, Zhengdong New District, Zhengzhou, 450046, People's Republic of China.
- The Shennong Laboratory, Zhengzhou, People's Republic of China.
| |
|
12
|
A comparison of the genes and genesets identified by GWAS and EWAS of fifteen complex traits. Nat Commun 2022; 13:7816. [PMID: 36535946 PMCID: PMC9763500 DOI: 10.1038/s41467-022-35037-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/23/2022] [Accepted: 11/16/2022] [Indexed: 12/23/2022] Open
Abstract
Identifying genomic regions pertinent to complex traits is a common goal of genome-wide and epigenome-wide association studies (GWAS and EWAS). GWAS identify causal genetic variants, directly or via linkage disequilibrium, and EWAS identify variation in DNA methylation associated with a trait. While GWAS in principle will only detect variants due to causal genes, EWAS can also identify genes via confounding or reverse causation. We systematically compare GWAS (N > 50,000) and EWAS (N > 4500) results of 15 complex traits. We evaluate whether the genes or gene ontology terms flagged by GWAS and EWAS overlap, and find substantial overlap for diastolic blood pressure (gene overlap P = 5.2 × 10⁻⁶; term overlap P = 0.001). We superimpose our empirical findings against simulated models of varying genetic and epigenetic architectures and observe that in most cases GWAS and EWAS are likely capturing distinct genesets. Our results indicate that GWAS and EWAS are capturing different aspects of the biology of complex traits.
|
13
|
He R, Xue H, Pan W. Statistical power of transcriptome-wide association studies. Genet Epidemiol 2022; 46:572-588. [PMID: 35766062 PMCID: PMC9669108 DOI: 10.1002/gepi.22491] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/09/2022] [Revised: 05/27/2022] [Accepted: 05/31/2022] [Indexed: 01/02/2023]
Abstract
Transcriptome-Wide Association Studies (TWASs) have become increasingly popular for identifying genes (or other endophenotypes or exposures) associated with complex traits. In TWAS, one first builds a predictive model for gene expression using an expression quantitative trait loci (eQTL) data set in stage 1, then tests the association between the predicted gene expression and a trait based on a large, independent genome-wide association study (GWAS) data set in stage 2. However, since the sample size of the eQTL data set is usually small and the coefficient of multiple determination (i.e., $R^2$) of the model for many genes is also small, a question of interest is to what extent these factors affect the statistical power of TWAS. In addition, in contrast to a standard (univariate) TWAS (UV-TWAS) considering only a single gene at a time, multivariate TWAS (MV-TWAS) methods have recently emerged to account for the effects of multiple genes, or a gene's nonlinear effects, simultaneously. In the absence of a power analysis for these MV-TWAS methods, it would be of interest to investigate whether one can gain or lose power by using the newly proposed MV-TWAS instead of UV-TWAS. In this paper, we first outline a general method for sample size/power calculations for two-sample TWAS, then use real data to empirically address these questions: the Alzheimer's Disease Neuroimaging Initiative (ADNI) eQTL data and the Genotype-Tissue Expression (GTEx) eQTL data for stage 1, and the International Genomics of Alzheimer's Project Alzheimer's disease (AD) GWAS summary data and UK Biobank (UKB) individual-level data for stage 2. Our most important conclusions are the following. First, a sample size of a few thousand (~8000) would suffice in stage 1, where the power of TWAS would be more determined by cis-heritability of gene expression. Second, as in the general case of simple regression versus multiple regression, the power of MV-TWAS may be higher or lower than that of UV-TWAS, depending on the specific relationships among the GWAS trait and multiple genes (or linear and nonlinear terms of the same gene's expression levels), such as their correlations and effect sizes. Interestingly, several top genes with large power gains in MV-TWAS (over that in UV-TWAS) were known to be (and in our data more significantly) associated with AD. We also reached similar conclusions in an application to the GTEx whole blood gene expression data and UKB GWAS data of high-density lipoprotein cholesterol. The proposed method and the conclusions are expected to be useful in planning and designing future TWAS and other related studies (e.g., Proteome- or Metabolome-Wide Association Studies) when determining the sample sizes for the two stages.
Affiliation(s)
- Ruoyu He
- School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- University of Minnesota, Division of Biostatistics, School of Public Health, Minneapolis, Minnesota, USA
| | - Haoran Xue
- University of Minnesota, Division of Biostatistics, School of Public Health, Minneapolis, Minnesota, USA
| | - Wei Pan
- University of Minnesota, Division of Biostatistics, School of Public Health, Minneapolis, Minnesota, USA
| | | |
|
14
|
Fryett JJ, Morris AP, Cordell HJ. Investigating the prediction of CpG methylation levels from SNP genotype data to help elucidate relationships between methylation, gene expression and complex traits. Genet Epidemiol 2022; 46:629-643. [PMID: 35930604 PMCID: PMC9804820 DOI: 10.1002/gepi.22496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/27/2022] [Accepted: 07/19/2022] [Indexed: 01/09/2023]
Abstract
As popularised by PrediXcan (and related methods), transcriptome-wide association studies (TWAS), in which gene expression is imputed from single-nucleotide polymorphism (SNP) genotypes and tested for association with a phenotype, are a popular approach for investigating the role of gene expression in complex traits. Like gene expression, DNA methylation is an important biological process and, being under genetic regulation, may be imputable from SNP genotypes. Here, we investigate prediction of CpG methylation levels from SNP genotype data to help elucidate relationships between methylation, gene expression and complex traits. We start by examining how well CpG methylation can be predicted from SNP genotypes, comparing three penalised regression approaches and examining whether changing the window size improves prediction accuracy. Although methylation at most CpG sites cannot be accurately predicted from SNP genotypes, for a subset it can be predicted well. We next apply our methylation prediction models (trained using the optimal method and window size) to carry out a methylome-wide association study (MWAS) of primary biliary cholangitis. We intersect the regions identified via MWAS with those identified via TWAS, providing insight into the interplay between CpG methylation, gene expression and disease status. We conclude that MWAS has the potential to improve understanding of biological mechanisms in complex traits.
Affiliation(s)
- James J. Fryett
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| | - Andrew P. Morris
- Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, University of Manchester, Manchester, UK
| | - Heather J. Cordell
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| |
|
15
|
Munro D, Wang T, Chitre AS, Polesskaya O, Ehsan N, Gao J, Gusev A, Woods LS, Saba L, Chen H, Palmer A, Mohammadi P. The regulatory landscape of multiple brain regions in outbred heterogeneous stock rats. Nucleic Acids Res 2022; 50:10882-10895. [PMID: 36263809 PMCID: PMC9638908 DOI: 10.1093/nar/gkac912] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/18/2022] [Revised: 08/17/2022] [Accepted: 10/05/2022] [Indexed: 11/14/2022] Open
Abstract
Heterogeneous Stock (HS) rats are a genetically diverse outbred rat population that is widely used for studying genetics of behavioral and physiological traits. Mapping Quantitative Trait Loci (QTL) associated with transcriptional changes would help to identify mechanisms underlying these traits. We generated genotype and transcriptome data for five brain regions from 88 HS rats. We identified 21 392 cis-QTLs associated with expression and splicing changes across all five brain regions and validated their effects using allele specific expression data. We identified 80 cases where eQTLs were colocalized with genome-wide association study (GWAS) results from nine physiological traits. Comparing our dataset to human data from the Genotype-Tissue Expression (GTEx) project, we found that the HS rat data yields twice as many significant eQTLs as a similarly sized human dataset. We also identified a modest but highly significant correlation between genetic regulatory variation among orthologous genes. Surprisingly, we found less genetic variation in gene regulation in HS rats relative to humans, though we still found eQTLs for the orthologs of many human genes for which eQTLs had not been found. These data are available from the RatGTEx data portal (RatGTEx.org) and will enable new discoveries of the genetic influences of complex traits.
Affiliation(s)
- Daniel Munro
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
| | - Tengfei Wang
- Department of Pharmacology, Addiction Science and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Apurva S Chitre
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Oksana Polesskaya
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Nava Ehsan
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
| | - Jianjun Gao
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Alexander Gusev
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
| | - Leah C Solberg Woods
- Section of Molecular Medicine, Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
| | - Laura M Saba
- Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Hao Chen
- Department of Pharmacology, Addiction Science and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Abraham A Palmer
- Correspondence may also be addressed to Abraham A. Palmer. Tel: +1 858 534 2093;
| | - Pejman Mohammadi
- To whom correspondence should be addressed. Tel: +1 858 784 8746;
| |
|
16
|
Brief overview of dietary intake, some types of gut microbiota, metabolic markers and research opportunities in sample of Egyptian women. Sci Rep 2022; 12:17291. [PMID: 36241870 PMCID: PMC9981617 DOI: 10.1038/s41598-022-21056-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/07/2022] [Accepted: 09/22/2022] [Indexed: 01/10/2023] Open
Abstract
Metabolic syndrome (MetS) is a phenotype caused by the interaction of host intrinsic factors such as genetics and gut microbiome, and extrinsic factors such as diet and lifestyle. This study aimed to demonstrate the interplay of intestinal microbiota with obesity, MetS markers, and some dietary ingredients in a sample of Egyptian women. It was a cross-sectional study that included 115 Egyptian women; 82 were obese (59 without MetS and 23 with MetS) and 33 were normal weight. All participants were subjected to anthropometric assessment, 24 h dietary recall, laboratory evaluation of liver enzymes (AST and ALT), leptin, short chain fatty acids (SCFA), C-reactive protein, fasting blood glucose, insulin, and lipid profile, in addition to fecal microbiota analysis for Lactobacillus, Bifidobacteria, Firmicutes, and Bacteroid. Data showed that the obese women with MetS had the highest significant values of the anthropometric and the biochemical parameters. Obese MetS women consumed a diet high in calories, protein, fat, and carbohydrate, and low in fiber and micronutrients. The Bacteroidetes and Firmicutes were the most abundant bacteria among the different gut microbiota, with a low Firmicutes/Bacteroidetes ratio, and insignificant differences between the obese women with and without MetS and the normal weight women were reported. The Firmicutes/Bacteroidetes ratio correlated significantly and positively with total cholesterol and LDL-C, and negatively with SCFA, among obese women with MetS. Findings of this study revealed that dietary factors, dysbiosis, and the metabolic product short chain fatty acids are implicated in causing metabolic defects.
|
17
|
Lea AJ, Peng J, Ayroles JF. Diverse environmental perturbations reveal the evolution and context-dependency of genetic effects on gene expression levels. Genome Res 2022; 32:1826-1839. [PMID: 36229124 PMCID: PMC9712631 DOI: 10.1101/gr.276430.121] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/23/2021] [Accepted: 09/07/2022] [Indexed: 01/18/2023]
Abstract
There is increasing appreciation that, in addition to being shaped by an individual's genotype and environment, most complex traits are also determined by poorly understood interactions between these two factors. So-called "genotype × environment" (G×E) interactions remain difficult to map at the organismal level but can be uncovered using molecular phenotypes. To do so at large scale, we used TM3'seq to profile transcriptomes across 12 cellular environments in 544 immortalized B cell lines from the 1000 Genomes Project. We mapped the genetic basis of gene expression levels across environments and revealed a context-dependent genetic architecture: The average heritability of gene expression levels increased in treatment relative to control conditions, and on average, each treatment revealed new expression quantitative trait loci (eQTLs) at 11% of genes. Across our experiments, 22% of all identified eQTLs were context-dependent, and this group was enriched for trait- and disease-associated loci. Further, evolutionary analyses suggested that positive selection has shaped G×E loci involved in responding to immune challenges and hormones but not to man-made chemicals. We hypothesize that this reflects a reduced opportunity for selection to act on responses to molecules recently introduced into human environments. Together, our work highlights the importance of considering an exposure's evolutionary history when studying and interpreting G×E interactions, and provides new insight into the evolutionary mechanisms that maintain G×E loci in human populations.
Affiliation(s)
- Amanda J. Lea
- Department of Ecology and Evolution, Princeton University, Princeton, New Jersey 08544, USA; Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
| | - Julie Peng
- Department of Ecology and Evolution, Princeton University, Princeton, New Jersey 08544, USA; Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
| | - Julien F. Ayroles
- Department of Ecology and Evolution, Princeton University, Princeton, New Jersey 08544, USA; Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
| |
|
18
|
Thompson M, Gordon MG, Lu A, Tandon A, Halperin E, Gusev A, Ye CJ, Balliu B, Zaitlen N. Multi-context genetic modeling of transcriptional regulation resolves novel disease loci. Nat Commun 2022; 13:5704. [PMID: 36171194 PMCID: PMC9519579 DOI: 10.1038/s41467-022-33212-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/02/2021] [Accepted: 09/07/2022] [Indexed: 12/01/2022] Open
Abstract
A majority of the variants identified in genome-wide association studies fall in non-coding regions of the genome, indicating their mechanism of impact is mediated via gene expression. Leveraging this hypothesis, transcriptome-wide association studies (TWAS) have assisted in both the interpretation and discovery of additional genes associated with complex traits. However, existing methods for conducting TWAS do not take full advantage of the intra-individual correlation inherently present in multi-context expression studies and do not properly adjust for multiple testing across contexts. We introduce CONTENT, a computationally efficient method with proper cross-context false discovery correction that leverages correlation structure across contexts to improve power and generate context-specific and context-shared components of expression. We apply CONTENT to bulk multi-tissue and single-cell RNA-seq data sets and show that CONTENT leads to a 42% (bulk) and 110% (single cell) increase in the number of genetically predicted genes relative to previous approaches. We find the context-specific component of expression comprises 30% of heritability in tissue-level bulk data and 75% in single-cell data, consistent with cell-type heterogeneity in bulk tissue. In the context of TWAS, CONTENT increases the number of locus-phenotype associations discovered by over 51% relative to previous methods across 22 complex traits.
Affiliation(s)
- Mike Thompson
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA.
| | - Mary Grace Gordon
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Biological and Medical Informatics Graduate Program, University of California, San Francisco, San Francisco, CA, USA
| | - Andrew Lu
- UCLA-Caltech Medical Scientist Training Program, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Anchit Tandon
- Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, Delhi, India
| | - Eran Halperin
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California Los Angeles, Los Angeles, CA, USA
- Department of Anesthesiology and Perioperative Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, US
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, US
| | - Chun Jimmie Ye
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
- Chan-Zuckerberg Biohub, San Francisco, CA, USA
- Division of Rheumatology, Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, USA
| | - Brunilda Balliu
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
| | - Noah Zaitlen
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA.
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA.
| |
|
19
|
Wolc A, Dekkers JCM. Application of Bayesian genomic prediction methods to genome-wide association analyses. Genet Sel Evol 2022; 54:31. [PMID: 35562659 PMCID: PMC9103490 DOI: 10.1186/s12711-022-00724-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/28/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022] Open
Abstract
Background Bayesian genomic prediction methods were developed to simultaneously fit all genotyped markers to a set of available phenotypes for prediction of breeding values for quantitative traits, allowing for differences in the genetic architecture (distribution of marker effects) of traits. These methods also provide a flexible and reliable framework for genome-wide association (GWA) studies. The objective here was to review developments in Bayesian hierarchical and variable selection models for GWA analyses. Results By fitting all genotyped markers simultaneously, Bayesian GWA methods implicitly account for population structure and the multiple-testing problem of classical single-marker GWA. Implemented using Markov chain Monte Carlo methods, Bayesian GWA methods allow for control of error rates using probabilities obtained from posterior distributions. Power of GWA studies using Bayesian methods can be enhanced by using informative priors based on previous association studies, gene expression analyses, or functional annotation information. Applied to multiple traits, Bayesian GWA analyses can give insight into pleiotropic effects by multi-trait, structural equation, or graphical models. Bayesian methods can also be used to combine genomic, transcriptomic, proteomic, and other -omics data to infer causal genotype to phenotype relationships and to suggest external interventions that can improve performance. Conclusions Bayesian hierarchical and variable selection methods provide a unified and powerful framework for genomic prediction, GWA, integration of prior information, and integration of information from other -omics platforms to identify causal mutations for complex quantitative traits.
Affiliation(s)
- Anna Wolc
- Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA; Hy-Line International, 2583 240th Street, Dallas Center, IA, 50063, USA
| | - Jack C M Dekkers
- Department of Animal Science, Iowa State University, 806 Stange Road, 239 Kildee Hall, Ames, IA, 50010, USA.
| |
|
20
|
Barfield R, Huyghe JR, Lemire M, Dong X, Su YR, Brezina S, Buchanan DD, Figueiredo JC, Gallinger S, Giannakis M, Gsur A, Gunter MJ, Hampel H, Harrison TA, Hopper JL, Hudson TJ, Li CI, Moreno V, Newcomb PA, Pai RK, Pharoah PDP, Phipps AI, Qu C, Steinfelder RS, Sun W, Win AK, Zaidi SH, Campbell PT, Peters U, Hsu L. Genetic Regulation of DNA Methylation Yields Novel Discoveries in GWAS of Colorectal Cancer. Cancer Epidemiol Biomarkers Prev 2022; 31:1068-1076. [PMID: 35247911 PMCID: PMC9081265 DOI: 10.1158/1055-9965.epi-21-0724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Revised: 10/05/2021] [Accepted: 02/23/2022] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Colorectal cancer has a strong epigenetic component that is accompanied by frequent DNA methylation (DNAm) alterations in addition to heritable genetic risk. It is of interest to understand the interrelationship of germline genetics, DNAm, and colorectal cancer risk. METHODS We performed a genome-wide methylation quantitative trait locus (meQTL) analysis in 1,355 people, assessing the pairwise associations between genetic variants and lymphocyte methylation data. In addition, we used penalized regression with cis-genetic variants ± 1 Mb of methylation to identify genome-wide heritable DNAm. We evaluated the association of genetically predicted methylation with colorectal cancer risk based on genome-wide association studies (GWAS) of over 125,000 cases and controls using the multivariate sMiST as well as univariately via examination of marginal association with colorectal cancer risk. RESULTS Of the 142 known colorectal cancer GWAS loci, 47 were identified as meQTLs. We identified four novel colorectal cancer-associated loci (NID2, ATXN10, KLHDC10, and CEP41) that reside over 1 Mb outside of known colorectal cancer loci and 10 secondary signals within 1 Mb of known loci. CONCLUSIONS Incorporating information on DNAm regulation into genetic association analysis of colorectal cancer risk reveals novel pathways in colorectal cancer tumorigenesis. Our summary statistics-based framework sMiST provides a powerful approach by combining information from the effect through methylation and residual direct effects of the meQTLs on disease risk. Further validation and functional follow-up of these novel pathways are needed. IMPACT Using genotype, DNAm, and GWAS, we identified four new colorectal cancer risk loci. We studied the landscape of genetic regulation of DNAm via single-SNP and multi-SNP meQTL analyses.
Affiliation(s)
- Richard Barfield
- Department of Biostatistics and Bioinformatics, Duke University, Durham NC USA
| | - Jeroen R Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Mathieu Lemire
- Neurosciences & Mental Health Program, Hospital for Sick Children, Toronto, ON, Canada
| | - Xinyuan Dong
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| | - Yu-Ru Su
- Biostatistics Unit, Kaiser Permanente Washington Health Research Institute, Seattle, Washington
| | - Stefanie Brezina
- Institute of Cancer Research, Department of Medicine I, Medical University Vienna, Vienna, Austria
| | - Daniel D Buchanan
- Colorectal Oncogenomics Group, Department of Clinical Pathology, The University of Melbourne, Parkville, Victoria 3010 Australia
- University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Parkville, Victoria 3010 Australia
- Genomic Medicine and Family Cancer Clinic, The Royal Melbourne Hospital, Parkville, Victoria, Australia
| | - Jane C Figueiredo
- Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, USA
| | - Steven Gallinger
- Lunenfeld Tanenbaum Research Institute, Mount Sinai Hospital, University of Toronto, Toronto, Ontario, Canada
| | - Marios Giannakis
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, USA
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | - Andrea Gsur
- Institute of Cancer Research, Department of Medicine I, Medical University Vienna, Vienna, Austria
| | - Marc J Gunter
- International Agency for Research on Cancer (IARC/WHO), Nutrition and Metabolism Branch, Lyon, France
| | - Heather Hampel
- Division of Human Genetics, Department of Internal Medicine, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, USA
| | - Tabitha A Harrison
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - John L Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, Victoria, Australia
- Department of Epidemiology, School of Public Health and Institute of Health and Environment, Seoul National University, Seoul, South Korea
| | - Thomas J Hudson
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Christopher I Li
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Victor Moreno
- Oncology Data Analytics Program, Catalan Institute of Oncology-IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain
- CIBER Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
- Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
- ONCOBEL Program, Bellvitge Biomedical Research Institute (IDIBELL), L’Hospitalet de Llobregat, Barcelona, Spain
| | - Polly A Newcomb
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- School of Public Health, University of Washington, Seattle, Washington, USA
| | - Rish K Pai
- Department of Laboratory Medicine and Pathology, Mayo Clinic Arizona, Scottsdale, Arizona, USA
| | - Paul D P Pharoah
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | - Amanda I Phipps
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Epidemiology, University of Washington, Seattle, Washington, USA
| | - Conghui Qu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Robert S Steinfelder
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Wei Sun
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Aung Ko Win
- Department of Epidemiology, School of Public Health and Institute of Health and Environment, Seoul National University, Seoul, South Korea
| | - Syed H Zaidi
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
| | - Peter T Campbell
- Department of Population Science, American Cancer Society, Atlanta, Georgia, USA
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Epidemiology, University of Washington, Seattle, Washington, USA
| | - Li Hsu
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
| |
|
21
|
Wang T, Qiao J, Zhang S, Wei Y, Zeng P. Simultaneous test and estimation of total genetic effect in eQTL integrative analysis through mixed models. Brief Bioinform 2022; 23:6535679. [PMID: 35212359 DOI: 10.1093/bib/bbac038] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/28/2021] [Revised: 01/22/2022] [Accepted: 02/07/2021] [Indexed: 11/14/2022]
Abstract
Integration of expression quantitative trait loci (eQTL) into genome-wide association studies (GWASs) is a promising way to reveal the functional roles of associated single-nucleotide polymorphisms (SNPs) in complex phenotypes and has become an active research field in the post-GWAS era. However, how to efficiently incorporate eQTL mapping studies into GWAS for the prioritization of causal genes remains elusive. We herein proposed a novel method termed Mixed transcriptome-wide association studies (TWAS) and mediated Variance estimation (MTV), which models the effects of cis-SNPs of a gene as a function of eQTL. MTV formulates the integrative method and TWAS within a unified framework via mixed models and therefore includes many prior methods/tests as special cases. We further justified MTV from two additional statistical perspectives: mediation analysis and two-stage Mendelian randomization. Relative to existing methods, MTV offers several notable features, including the handling of direct effects of cis-SNPs on phenotypes, a powerful likelihood ratio test for assessing the joint effects of cis-SNPs and genetically regulated gene expression (GReX), two useful quantities that measure the relative genetic contributions of GReX and cis-SNPs to phenotypic variance, and a computationally efficient parameter-expansion expectation-maximization algorithm. In extensive simulations, MTV correctly controlled the type I error in the joint evaluation of the total genetic effect and was more powerful than existing methods in discovering true association signals across various scenarios. We finally applied MTV to 41 complex traits/diseases available from three GWASs and discovered many new associated genes that had otherwise been missed by existing methods. We also found that a small but substantial fraction of phenotypic variation was mediated by GReX. Overall, MTV constructs a robust and realistic modeling foundation for integrative omics analysis and has the advantage of offering more attractive biological interpretations of GWAS results.
Affiliation(s)
- Ting Wang
- Department of Biostatistics at Xuzhou Medical University, China
| | - Jiahao Qiao
- Department of Biostatistics at Xuzhou Medical University, China
| | - Shuo Zhang
- Department of Biostatistics at Xuzhou Medical University, China
| | - Yongyue Wei
- Department of Biostatistics at Nanjing Medical University, China
| | - Ping Zeng
- Department of Biostatistics, Center for Medical Statistics and Data Analysis and Key Laboratory of Human Genetics and Environmental Medicine at Xuzhou Medical University, China
| |
|
22
|
Hassan NE, El-Masry SA, Nageeb A, El Hussieny MS, Khalil A, Aly MM, Soliman MAT, Ismail A, El-Saeed G, Hashish A, Selim M. Correlation between Gut Microbiota, its Metabolic Products, and their Association with Liver Enzymes among Sample of Egyptian Females. Open Access Maced J Med Sci 2021. [DOI: 10.3889/oamjms.2022.7909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
Abstract
Background and Aim: The gut microbiota appears to play a critical role in the pathogenesis of obesity, liver metabolism, and the associated diseases. The present study aimed to identify the existing gut microbiota enterotypes and their metabolic product profiles among a sample of normal-weight and obese Egyptian females, and to investigate the correlations among gut microbiota, body mass index, and liver enzymes. Methods: A case-control cross-sectional study included 112 Egyptian females (82 obese and 30 normal weight) aged 25 to 60 years. For each participant, anthropometric measurements (weight, height, and BMI), laboratory investigations (AST, ALT, SCFA, CRP), and microbiota analysis were done. Results: The obese females had significantly higher values of CRP, AST, ALT, and SCFA. In addition, obese females had non-significantly higher values of log Bacteroidetes, log Firmicutes, log Firmicutes/Bacteroidetes ratio, and log Lactobacillus, and non-significantly lower values of log Bifidobacteria, than the normal-weight group. Among the normal-weight group, Lactobacillus had significant positive correlations with SCFA, Bifidobacteria, and Firmicutes, and significant negative correlations with AST, ALT, and CRP. Bifidobacteria had significant negative correlations with Ht and ALT. Bacteroidetes had significant positive correlations with SCFA and significant negative correlations with age and height. Firmicutes had significant negative correlations with AST and ALT. The Firmicutes/Bacteroidetes ratio had significant negative correlations with AST, ALT, and SCFA. Among the obese group, Lactobacillus and Bifidobacteria had significant negative correlations with the Firmicutes/Bacteroidetes ratio; however, these correlations were insignificant among the normal-weight group. Moreover, there were insignificant correlations between any type of studied microbiota and any of the anthropometric or laboratory parameters, except that Firmicutes had significant negative correlations with ALT. Conclusion: The beneficial Lactobacillus and Bifidobacteria have a positive impact on obesity status and on liver function as reflected by ALT.
|
23
|
Hassan NE, El-Masry SA, Nageeb A, El Hussieny MS, Khalil A, Aly M, Selim M, Alian K, Abdel Rasheed E, Abdel Wahed MM, Amine D. Linking Gut Microbiota, Metabolic Syndrome and Metabolic Health among a Sample of Obese Egyptian Females. Open Access Maced J Med Sci 2021. [DOI: 10.3889/oamjms.2021.7625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
Abstract
Background: Studies of the gut microbiota have revealed a strong link to obesity and metabolic syndrome (MetS). The aim of this study was to review the dysbiosis of gut microbiota in terms of the components of MetS among a sample of obese Egyptian female patients and to assess current potential gut microbiota-targeted therapies for the treatment of MetS. Methods: This cross-sectional study included 82 obese Egyptian women. All participants were subjected to anthropometric assessment and laboratory evaluation of fasting blood sugar (FBS), insulin, C-reactive protein (CRP), lipid profile, and insulin resistance (HOMA), in addition to fecal microbiota analysis for Lactobacillus, Bifidobacteria, Firmicutes, and Bacteroidetes. Results: Among the obese group with MetS, the Firmicutes/Bacteroidetes ratio was negatively associated with HOMA and positively associated with serum cholesterol and LDL, while Lactobacillus was negatively associated with serum cholesterol. Among the obese group without MetS, the Firmicutes/Bacteroidetes ratio was negatively associated with WC (central obesity marker) and positively associated with CRP (inflammatory marker), while Lactobacillus was positively correlated with FBS and HOMA, and Bifidobacteria was negatively associated with serum cholesterol and LDL. Conclusion: Supplementation with the two beneficial types, Lactobacillus and Bifidobacteria, in probiotic form alongside therapeutic treatment, together with a reduction in WC, has an important role in controlling and treating hypertension, serum cholesterol, and LDL levels among obese females, even those with MetS.
|
24
|
Nadel BB, Oliva M, Shou BL, Mitchell K, Ma F, Montoya DJ, Mouton A, Kim-Hellmuth S, Stranger BE, Pellegrini M, Mangul S. Systematic evaluation of transcriptomics-based deconvolution methods and references using thousands of clinical samples. Brief Bioinform 2021; 22:bbab265. [PMID: 34346485 PMCID: PMC8768458 DOI: 10.1093/bib/bbab265] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 04/14/2021] [Revised: 06/07/2021] [Accepted: 06/21/2021] [Indexed: 11/13/2022]
Abstract
Estimating cell type composition of blood and tissue samples is a biological challenge relevant in both laboratory studies and clinical care. In recent years, a number of computational tools have been developed to estimate cell type abundance using gene expression data. Although these tools use a variety of approaches, they all leverage expression profiles from purified cell types to evaluate the cell type composition within samples. In this study, we compare 12 cell type quantification tools and evaluate their performance while using each of 10 separate reference profiles. Specifically, we have run each tool on over 4000 samples with known cell type proportions, spanning both immune and stromal cell types. A total of 12 of these represent in vitro synthetic mixtures and 300 represent in silico synthetic mixtures prepared using single-cell data. A final 3728 clinical samples have been collected from the Framingham cohort, for which cell populations have been quantified using electrical impedance cell counting. When tools are applied to the Framingham dataset, the tool Estimating the Proportions of Immune and Cancer cells (EPIC) produces the highest correlation, whereas Gene Expression Deconvolution Interactive Tool (GEDIT) produces the lowest error. The best tool for other datasets is varied, but CIBERSORT and GEDIT most consistently produce accurate results. We find that optimal reference depends on the tool used, and report suggested references to be used with each tool. Most tools return results within minutes, but on large datasets runtimes for CIBERSORT can exceed hours or even days. We conclude that deconvolution methods are capable of returning high-quality results, but that proper reference selection is critical.
Affiliation(s)
- Brian B Nadel
- Corresponding authors: Brian B. Nadel, Tel: 310-963-7077; E-mail: ; Matteo Pellegrini, Tel: 310-825-0012, E-mail: ; Serghei Mangul, Tel: 323-442-0043, E-mail:
| | - Serghei Mangul
- Corresponding authors: Brian B. Nadel, Tel: 310-963-7077; E-mail: ; Matteo Pellegrini, Tel: 310-825-0012, E-mail: ; Serghei Mangul, Tel: 323-442-0043, E-mail:
| |
|
25
|
Zhu J, Yang Y, Kisiel JB, Mahoney DW, Michaud DS, Guo X, Taylor WR, Shu XO, Shu X, Liu D, Li B, Tao R, Cai Q, Zheng W, Long J, Wu L. Integrating Genome and Methylome Data to Identify Candidate DNA Methylation Biomarkers for Pancreatic Cancer Risk. Cancer Epidemiol Biomarkers Prev 2021; 30:2079-2087. [PMID: 34497089 PMCID: PMC8568683 DOI: 10.1158/1055-9965.epi-21-0400] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/29/2021] [Revised: 05/20/2021] [Accepted: 08/21/2021] [Indexed: 12/20/2022]
Abstract
BACKGROUND The role of methylation in pancreatic cancer risk remains unclear. We integrated genome and methylome data to identify CpG sites (CpG) with the genetically predicted methylation to be associated with pancreatic cancer risk. We also studied gene expression to understand the identified associations. METHODS Using genetic data and white blood cell methylation data from 1,595 subjects of European descent, we built genetic models to predict DNA methylation levels. After internal and external validation, we applied prediction models with satisfactory performance to the genetic data of 8,280 pancreatic cancer cases and 6,728 controls of European ancestry to investigate the associations of predicted methylation with pancreatic cancer risk. For associated CpGs, we compared their measured levels in pancreatic tumor versus benign tissue. RESULTS We identified 45 CpGs at nine loci showing an association with pancreatic cancer risk, including 15 CpGs showing an association independent from identified risk variants. We observed significant correlations between predicted methylation of 16 of the 45 CpGs and predicted expression of eight adjacent genes, of which six genes showed associations with pancreatic cancer risk. Of the 45 CpGs, we were able to compare measured methylation of 16 in pancreatic tumor versus benign pancreatic tissue. Of them, six showed differentiated methylation. CONCLUSIONS We identified methylation biomarker candidates associated with pancreatic cancer using genetic instruments and added additional insights into the role of methylation in regulating gene expression in pancreatic cancer development. IMPACT A comprehensive study using genetic instruments identifies 45 CpG sites at nine genomic loci for pancreatic cancer risk.
Affiliation(s)
- Jingjing Zhu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii
| | - Yaohua Yang
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
| | - John B Kisiel
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota
| | - Douglas W Mahoney
- Department of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota
| | - Dominique S Michaud
- Department of Public Health and Community Medicine, Tufts University Medical School, Boston, Massachusetts
| | - Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
| | - William R Taylor
- Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, Minnesota
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Xiang Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Duo Liu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii
- Department of Pharmacy, Harbin Medical University Cancer Hospital, Harbin, China
| | - Bingshan Li
- Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, Tennessee
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Ran Tao
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Qiuyin Cai
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, Tennessee.
| | - Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, Hawaii.
| |
|
26
|
Li B, Ritchie MD. From GWAS to Gene: Transcriptome-Wide Association Studies and Other Methods to Functionally Understand GWAS Discoveries. Front Genet 2021; 12:713230. [PMID: 34659337 PMCID: PMC8515949 DOI: 10.3389/fgene.2021.713230] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 05/22/2021] [Accepted: 07/27/2021] [Indexed: 12/12/2022]
Abstract
Since their inception, genome-wide association studies (GWAS) have identified more than a hundred thousand single nucleotide polymorphism (SNP) loci that are associated with various complex human diseases or traits. The majority of GWAS discoveries are located in non-coding regions of the human genome and have unknown functions. The valley between non-coding GWAS discoveries and downstream affected genes hinders the investigation of complex disease mechanisms and the utilization of human genetics for the improvement of clinical care. Meanwhile, advances in high-throughput sequencing technologies reveal important genomic regulatory roles that non-coding regions play in the transcriptional activities of genes. In this review, we focus on data integrative bioinformatics methods that combine GWAS with functional genomics knowledge to identify genetically regulated genes. We categorize and describe two types of data integrative methods. First, we describe fine-mapping methods. Fine-mapping is an exploratory approach that calibrates likely causal variants underneath GWAS signals. Fine-mapping methods connect GWAS signals to potentially causal genes through statistical methods and/or functional annotations. Second, we discuss gene-prioritization methods. These are hypothesis-generating approaches that evaluate whether genetic variants regulate genes via certain genetic regulatory mechanisms to influence complex traits, including colocalization, Mendelian randomization, and the transcriptome-wide association study (TWAS). TWAS is a gene-based association approach that investigates associations between genetically regulated gene expression and complex diseases or traits. TWAS has gained popularity over the years due to its ability to reduce multiple testing burden in comparison to other variant-based analytic approaches. Multiple types of TWAS methods have been developed with varied methodological designs and biological hypotheses over the past 5 years. We dive into discussions of how TWAS methods differ in many aspects and the challenges that different TWAS methods face. Overall, TWAS is a powerful tool for identifying complex trait-associated genes. With the advent of single-cell sequencing, chromosome conformation capture, gene editing technologies, and multiplexing reporter assays, we are expecting a more comprehensive understanding of genomic regulation and genetically regulated genes underlying complex human diseases and traits in the future.
Affiliation(s)
- Binglan Li
- Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
| | - Marylyn D Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, United States
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, United States
| |
|
27
|
McGowan MT, Zhang Z, Ficklin SP. Chromosomal characteristics of salt stress heritable gene expression in the rice genome. BMC Genom Data 2021; 22:17. [PMID: 34044788 PMCID: PMC8162008 DOI: 10.1186/s12863-021-00970-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/11/2021] [Accepted: 05/06/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Gene expression is potentially an important heritable quantitative trait that mediates between genetic variation and higher-level complex phenotypes through time- and condition-dependent regulatory interactions. Therefore, we sought to explore both the genomic and condition-specific characteristics of gene expression heritability within the context of chromosomal structure. RESULTS Heritability was estimated for biological gene expression using a diverse, 84-line, Oryza sativa (rice) population under optimal and salt-stressed conditions. Overall, 5936 genes were found to have heritable expression regardless of condition and 1377 genes were found to have heritable expression only during salt stress. These genes with salt-specific heritable expression are enriched for functional terms associated with response to stimulus and transcription factor activity. Additionally, we discovered that highly and lowly expressed genes, and genes with heritable expression, are distributed differently along the chromosomes in patterns that follow previously identified high-throughput chromosomal conformation capture (Hi-C) A/B chromatin compartments. Furthermore, multiple genomic hotspots enriched for genes with salt-specific heritability were identified on chromosomes 1, 4, 6, and 8. These hotspots were found to contain genes functionally enriched for transcriptional regulation and to overlap with a previously identified major QTL for salt tolerance in rice. CONCLUSIONS Investigating the heritability of traits, and in particular gene expression traits, is important for developing a basic understanding of how regulatory networks behave across a population. This work provides insights into spatial patterns of heritable gene expression at the chromosomal level.
Affiliation(s)
- Matthew T McGowan
- Molecular Plant Sciences Program, Washington State University, French Ad 324G, Pullman, WA, 99164, USA.
| | - Zhiwu Zhang
- Molecular Plant Sciences Program, Washington State University, French Ad 324G, Pullman, WA, 99164, USA
- Department of Crops and Soils, Washington State University, 105 Johnson Hall, Pullman, WA, 99164, USA
| | - Stephen P Ficklin
- Molecular Plant Sciences Program, Washington State University, French Ad 324G, Pullman, WA, 99164, USA
- Department of Horticulture, Washington State University, 149 Johnson Hall, Pullman, WA, 99164, USA
| |
|
28
|
Novikova G, Andrews SJ, Renton AE, Marcora E. Beyond association: successes and challenges in linking non-coding genetic variation to functional consequences that modulate Alzheimer's disease risk. Mol Neurodegener 2021; 16:27. [PMID: 33882988 PMCID: PMC8061035 DOI: 10.1186/s13024-021-00449-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/19/2020] [Accepted: 04/13/2021] [Indexed: 02/06/2023]
Abstract
Alzheimer's disease (AD) is the most common type of dementia, affecting millions of people worldwide; however, no disease-modifying treatments are currently available. Genome-wide association studies (GWASs) have identified more than 40 loci associated with AD risk. However, most of the disease-associated variants reside in non-coding regions of the genome, making it difficult to elucidate how they affect disease susceptibility. Nonetheless, identification of the regulatory elements, genes, pathways and cell type/tissue(s) impacted by these variants to modulate AD risk is critical to our understanding of disease pathogenesis and ability to develop effective therapeutics. In this review, we provide an overview of the methods and approaches used in the field to identify the functional effects of AD risk variants in the causal path to disease risk modification as well as describe the most recent findings. We first discuss efforts in cell type/tissue prioritization followed by recent progress in candidate causal variant and gene nomination. We discuss statistical methods for fine-mapping as well as approaches that integrate multiple levels of evidence, such as epigenomic and transcriptomic data, to identify causal variants and risk mechanisms of AD-associated loci. Additionally, we discuss experimental approaches and data resources that will be needed to validate and further elucidate the effects of these variants and genes on biological pathways, cellular phenotypes and disease risk. Finally, we discuss future steps that need to be taken to ensure that AD GWAS functional mapping efforts lead to novel findings and bring us closer to finding effective treatments for this devastating disease.
Affiliation(s)
- Gloriia Novikova
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Shea J Andrews
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Alan E Renton
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Edoardo Marcora
- Ronald M. Loeb Center for Alzheimer's Disease, Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| |
|
29
|
Okoro PC, Schubert R, Guo X, Johnson WC, Rotter JI, Hoeschele I, Liu Y, Im HK, Luke A, Dugas LR, Wheeler HE. Transcriptome prediction performance across machine learning models and diverse ancestries. HGG ADVANCES 2021; 2:100019. [PMID: 33937878 PMCID: PMC8087249 DOI: 10.1016/j.xhgg.2020.100019] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/10/2020] [Accepted: 12/29/2020] [Indexed: 11/18/2022]
Abstract
Transcriptome prediction methods such as PrediXcan and FUSION have become popular in complex trait mapping. Most transcriptome prediction models have been trained in European populations using methods that make parametric linear assumptions like the elastic net (EN). To potentially further optimize imputation performance of gene expression across global populations, we built transcriptome prediction models using both linear and non-linear machine learning (ML) algorithms and evaluated their performance in comparison to EN. We trained models using genotype and blood monocyte transcriptome data from the Multi-Ethnic Study of Atherosclerosis (MESA) comprising individuals of African, Hispanic, and European ancestries and tested them using genotype and whole-blood transcriptome data from the Modeling the Epidemiology Transition Study (METS) comprising individuals of African ancestries. We show that the prediction performance is highest when the training and the testing population share similar ancestries regardless of the prediction algorithm used. While EN generally outperformed random forest (RF), support vector regression (SVR), and K nearest neighbor (KNN), we found that RF outperformed EN for some genes, particularly between disparate ancestries, suggesting potential robustness and reduced variability of RF imputation performance across global populations. When applied to a high-density lipoprotein (HDL) phenotype, we show including RF prediction models in PrediXcan revealed potential gene associations missed by EN models. Therefore, by integrating other ML modeling into PrediXcan and diversifying our training populations to include more global ancestries, we may uncover new genes associated with complex traits.
Affiliation(s)
- Paul C. Okoro
- Program in Bioinformatics, Loyola University Chicago, Chicago, IL, USA
| | - Ryan Schubert
- Department of Mathematics and Statistics, Loyola University Chicago, Chicago, IL, USA
| | - Xiuqing Guo
- Institute for Translational Genomics and Population Sciences, The Lundquist Institute and Department of Pediatrics at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - W. Craig Johnson
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Jerome I. Rotter
- Institute for Translational Genomics and Population Sciences, The Lundquist Institute and Department of Pediatrics at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Ina Hoeschele
- Fralin Life Sciences Institute, Virginia Tech, Blacksburg, VA, USA
- Department of Statistics, Virginia Tech, Blacksburg, VA, USA
- Wake Forest School of Medicine, Winston-Salem, NC, USA
| | - Yongmei Liu
- Department of Medicine, Duke University School of Medicine, Durham, NC, USA
| | - Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Amy Luke
- Department of Public Health Sciences, Parkinson School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, USA
| | - Lara R. Dugas
- Department of Public Health Sciences, Parkinson School of Health Sciences and Public Health, Loyola University Chicago, Maywood, IL, USA
- Department of Human Biology, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Heather E. Wheeler
- Program in Bioinformatics, Loyola University Chicago, Chicago, IL, USA
- Department of Biology, Loyola University Chicago, Chicago, IL, USA
- Department of Computer Science, Loyola University Chicago, Chicago, IL, USA
| |
|
30
|
Li B, Veturi Y, Verma A, Bradford Y, Daar ES, Gulick RM, Riddler SA, Robbins GK, Lennox JL, Haas DW, Ritchie MD. Tissue specificity-aware TWAS (TSA-TWAS) framework identifies novel associations with metabolic, immunologic, and virologic traits in HIV-positive adults. PLoS Genet 2021; 17:e1009464. [PMID: 33901188 PMCID: PMC8102009 DOI: 10.1371/journal.pgen.1009464] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/08/2020] [Revised: 05/06/2021] [Accepted: 03/03/2021] [Indexed: 01/01/2023]
Abstract
As a relatively new methodology, the transcriptome-wide association study (TWAS) has gained interest due to its capacity for gene-level association testing. However, the development of TWAS has outpaced statistical evaluation of TWAS gene prioritization performance. Current TWAS methods vary in underlying biological assumptions about tissue specificity of transcriptional regulatory mechanisms. In a previous study from our group, this may have affected whether TWAS methods better identified associations in single tissues versus multiple tissues. We therefore designed simulation analyses to examine how the interplay between particular TWAS methods and tissue specificity of gene expression affects power and type I error rates for gene prioritization. We found that cross-tissue identification of expression quantitative trait loci (eQTLs) improved TWAS power. Single-tissue TWAS (i.e., PrediXcan) had robust power to identify genes expressed in single tissues but often found significant associations in the wrong tissues as well (and therefore had high false positive rates). Cross-tissue TWAS (i.e., UTMOST) had overall equal or greater power and controlled type I error rates for genes expressed in multiple tissues. Based on these simulation results, we applied a tissue specificity-aware TWAS (TSA-TWAS) analytic framework to look for gene-based associations with pre-treatment laboratory values from AIDS Clinical Trial Group (ACTG) studies. We replicated several proof-of-concept transcriptionally regulated gene-trait associations, including UGT1A1 (encoding bilirubin uridine diphosphate glucuronosyltransferase enzyme) and total bilirubin levels (p = 3.59×10⁻¹²), and CETP (cholesteryl ester transfer protein) with high-density lipoprotein cholesterol (p = 4.49×10⁻¹²). We also identified several novel genes associated with metabolic and virologic traits, as well as pleiotropic genes that linked plasma viral load, absolute basophil count, and/or triglyceride levels. By highlighting the advantages of different TWAS methods, our simulation study promotes a tissue specificity-aware TWAS analytic framework that revealed novel aspects of HIV-related traits.
Affiliation(s)
- Binglan Li
- Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America
| | - Yogasudha Veturi
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Anurag Verma
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Yuki Bradford
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Eric S. Daar
- Lundquist Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America
| | - Roy M. Gulick
- Weill Cornell Medicine, New York City, New York, United States of America
| | - Sharon A. Riddler
- Department of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Gregory K. Robbins
- Division of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, United States of America
| | - Jeffrey L. Lennox
- Emory University School of Medicine, Atlanta, Georgia, United States of America
| | - David W. Haas
- Departments of Medicine, Pharmacology, Pathology, Microbiology & Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee, United States of America
- Department of Internal Medicine, Meharry Medical College, Nashville, Tennessee, United States of America
| | - Marylyn D. Ritchie
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
|
31
|
Zhu A, Matoba N, Wilson EP, Tapia AL, Li Y, Ibrahim JG, Stein JL, Love MI. MRLocus: Identifying causal genes mediating a trait through Bayesian estimation of allelic heterogeneity. PLoS Genet 2021; 17:e1009455. [PMID: 33872308 PMCID: PMC8084342 DOI: 10.1371/journal.pgen.1009455] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/19/2020] [Revised: 04/29/2021] [Accepted: 02/26/2021] [Indexed: 11/18/2022]
Abstract
Expression quantitative trait loci (eQTL) studies are used to understand the regulatory function of non-coding genome-wide association study (GWAS) risk loci, but colocalization alone does not demonstrate a causal relationship of gene expression affecting a trait. Evidence for mediation, that perturbation of gene expression in a given tissue or developmental context will induce a change in the downstream GWAS trait, can be provided by two-sample Mendelian Randomization (MR). Here, we introduce a new statistical method, MRLocus, for Bayesian estimation of the gene-to-trait effect from eQTL and GWAS summary data for loci with evidence of allelic heterogeneity, that is, containing multiple causal variants. MRLocus makes use of a colocalization step applied to each nearly-LD-independent eQTL, followed by an MR analysis step across eQTLs. Additionally, our method involves estimation of the extent of allelic heterogeneity through a dispersion parameter, indicating variable mediation effects from each individual eQTL on the downstream trait. Our method is evaluated against other state-of-the-art methods for estimation of the gene-to-trait mediation effect, using an existing simulation framework. In simulation, MRLocus often has the highest accuracy among competing methods, and in each case provides more accurate estimation of uncertainty as assessed through interval coverage. MRLocus is then applied to five candidate causal genes for mediation of particular GWAS traits, where gene-to-trait effects are concordant with those previously reported. We find that MRLocus's estimation of the causal effect across eQTLs within a locus provides useful information for determining how perturbation of gene expression or individual regulatory elements will affect downstream traits. The MRLocus method is implemented as an R package available at https://mikelove.github.io/mrlocus.
Affiliation(s)
- Anqi Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Emma P. Wilson
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Amanda L. Tapia
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Joseph G. Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Michael I. Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
|
32
|
Yao S, Wu H, Liu TT, Wang JH, Ding JM, Guo J, Rong Y, Ke X, Hao RH, Dong SS, Yang TL, Guo Y. Epigenetic Element-Based Transcriptome-Wide Association Study Identifies Novel Genes for Bipolar Disorder. Schizophr Bull 2021; 47:1642-1652. [PMID: 33772305 PMCID: PMC8530404 DOI: 10.1093/schbul/sbab023] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]
Abstract
Since the bipolar disorder (BD) signals identified by genome-wide association studies (GWAS) often reside in non-coding regions, understanding the biological relevance of these genetic loci has proven to be complicated. Transcriptome-wide association studies (TWAS) provide a powerful approach to identify novel disease risk genes and uncover possible causal genes at loci identified previously by GWAS. However, these methods did not consider the importance of epigenetic regulation in gene expression. Here, we developed a novel epigenetic element-based transcriptome-wide association study (ETWAS) that tested the effects of genetic variants on gene expression levels with the epigenetic features as prior and further mediated the association between predicted expression and BD. We conducted an ETWAS consisting of 20 352 cases and 31 358 controls and identified 44 transcriptome-wide significant hits. We found 14 conditionally independent genes, and 10 genes not previously implicated in BD were regarded as novel candidate genes, such as ASB16 in the cerebellar hemisphere (P = 9.29 × 10⁻⁸). We demonstrated that several genome-wide significant signals from the BD GWAS were driven by genetically regulated expression, and NEK4 explained 90.1% of the GWAS signal. Additionally, ETWAS-identified genes could explain heritability beyond that explained by GWAS-associated SNPs (P = 5.60 × 10⁻⁶⁶). By querying the SNPs in the final models of identified genes in phenome databases, we identified several phenotypes previously associated with BD, such as schizophrenia and depression. In conclusion, ETWAS is a powerful method, and we identified several novel candidate genes associated with BD.
Affiliation(s)
- Shi Yao
- National and Local Joint Engineering Research Center of Biodiagnosis and Biotherapy, The Second Affiliated Hospital, Xi’an Jiaotong University, Xi’an, Shaanxi 710004, P. R. China; Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Hao Wu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Tong-Tong Liu
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Jia-Hao Wang
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Jing-Miao Ding
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Jing Guo
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Yu Rong
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Xin Ke
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Ruo-Han Hao
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Shan-Shan Dong
- Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Tie-Lin Yang
- National and Local Joint Engineering Research Center of Biodiagnosis and Biotherapy, The Second Affiliated Hospital, Xi’an Jiaotong University, Xi’an, Shaanxi 710004, P. R. China; Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China
| | - Yan Guo
- National and Local Joint Engineering Research Center of Biodiagnosis and Biotherapy, The Second Affiliated Hospital, Xi’an Jiaotong University, Xi’an, Shaanxi 710004, P. R. China; Key Laboratory of Biomedical Information Engineering of Ministry of Education, Biomedical Informatics & Genomics Center, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, P. R. China. To whom correspondence should be addressed; tel: +86-29-62818386, fax: +86-29-62818386, e-mail:
|
33
|
Bhat A, Irizar H, Thygesen JH, Kuchenbaecker K, Pain O, Adams RA, Zartaloudi E, Harju-Seppänen J, Austin-Zimmerman I, Wang B, Muir R, Summerfelt A, Du XM, Bruce H, O'Donnell P, Srivastava DP, Friston K, Hong LE, Hall MH, Bramon E. Transcriptome-wide association study reveals two genes that influence mismatch negativity. Cell Rep 2021; 34:108868. [PMID: 33730571 PMCID: PMC7972991 DOI: 10.1016/j.celrep.2021.108868] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/28/2020] [Revised: 12/09/2020] [Accepted: 02/24/2021] [Indexed: 01/22/2023]
Abstract
Mismatch negativity (MMN) is a differential electrophysiological response measuring cortical adaptability to unpredictable stimuli. MMN is consistently attenuated in patients with psychosis. However, the genetics of MMN are uncharted, limiting the validation of MMN as a psychosis endophenotype. Here, we perform a transcriptome-wide association study of 728 individuals, which reveals 2 genes (FAM89A and ENGASE) whose expression in cortical tissues is associated with MMN. Enrichment analyses of neurodevelopmental expression signatures show that genes associated with MMN tend to be overexpressed in the frontal cortex during prenatal development but are significantly downregulated in adulthood. Endophenotype ranking value calculations comparing MMN and three other candidate psychosis endophenotypes (lateral ventricular volume and two auditory-verbal learning measures) find MMN to be considerably superior. These results yield promising insights into sensory processing in the cortex and endorse the notion of MMN as a psychosis endophenotype.
Affiliation(s)
- Anjali Bhat
- Division of Psychiatry, University College London, London, UK; Wellcome Centre for Human Neuroimaging, University College London, London, UK; Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK; Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK; MRC Centre for Neurodevelopmental Disorders, King's College London, London, UK.
| | - Haritz Irizar
- Division of Psychiatry, University College London, London, UK
| | | | - Karoline Kuchenbaecker
- Division of Psychiatry, University College London, London, UK; UCL Genetics Institute, University College London, London, UK
| | - Oliver Pain
- Social, Genetic, and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK
| | - Rick A Adams
- Division of Psychiatry, University College London, London, UK; Institute of Cognitive Neuroscience, University College London, London, UK
| | | | - Jasmine Harju-Seppänen
- Division of Psychiatry, University College London, London, UK; Department of Clinical, Educational and Health Psychology, University College London, London, UK
| | | | - Baihan Wang
- Division of Psychiatry, University College London, London, UK
| | - Rebecca Muir
- Division of Psychiatry, University College London, London, UK
| | - Ann Summerfelt
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland, Baltimore, MD, USA
| | - Xiaoming Michael Du
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland, Baltimore, MD, USA
| | - Heather Bruce
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland, Baltimore, MD, USA
| | - Patricio O'Donnell
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland, Baltimore, MD, USA; Department of Psychiatry, Harvard Medical School, Boston, MA, USA; Takeda Pharmaceuticals, Cambridge, MA, USA
| | - Deepak P Srivastava
- Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK; Department of Basic and Clinical Neuroscience, Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK; MRC Centre for Neurodevelopmental Disorders, King's College London, London, UK
| | - Karl Friston
- Wellcome Centre for Human Neuroimaging, University College London, London, UK
| | - L Elliot Hong
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland, Baltimore, MD, USA
| | - Mei-Hua Hall
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA; Psychosis Neurobiology Laboratory, McLean Hospital, Belmont, MA, USA
| | - Elvira Bramon
- Division of Psychiatry, University College London, London, UK; Institute of Cognitive Neuroscience, University College London, London, UK; Institute of Psychiatry, Psychology, and Neuroscience, King's College London, London, UK; Camden and Islington NHS Foundation Trust, London, UK.
|
34
|
Zeng P, Dai J, Jin S, Zhou X. Aggregating multiple expression prediction models improves the power of transcriptome-wide association studies. Hum Mol Genet 2021; 30:939-951. [PMID: 33615361 DOI: 10.1093/hmg/ddab056] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 11/23/2020] [Revised: 02/10/2021] [Accepted: 02/15/2021] [Indexed: 12/11/2022]
Abstract
Transcriptome-wide association study (TWAS) is an important integrative method for identifying genes that are causally associated with phenotypes. A key step of TWAS involves the construction of expression prediction models for every gene in turn using its cis-SNPs as predictors. Different TWAS methods rely on different models for gene expression prediction, and each such model makes a distinct modeling assumption that is often suitable for a particular genetic architecture underlying expression. However, the genetic architectures underlying gene expression vary across genes throughout the transcriptome. Consequently, different TWAS methods may be beneficial in detecting genes with distinct genetic architectures. Here, we develop a new method, HMAT, which aggregates TWAS association evidence obtained across multiple gene expression prediction models by leveraging the harmonic mean P-value combination strategy. Because each expression prediction model is suited to capture a particular genetic architecture, aggregating TWAS associations across prediction models as in HMAT improves accurate expression prediction and enables subsequent powerful TWAS analysis across the transcriptome. A key feature of HMAT is its ability to accommodate the correlations among different TWAS test statistics and produce calibrated P-values after aggregation. Through numerical simulations, we illustrated the advantage of HMAT over commonly used TWAS methods as well as ad hoc P-value combination rules such as Fisher's method. We also applied HMAT to analyze summary statistics of nine common diseases. In the real data applications, HMAT was on average 30.6% more powerful compared to the next best method, detecting many new disease-associated genes that were otherwise not identified by existing TWAS approaches. In conclusion, HMAT represents a flexible and powerful TWAS method that enjoys robust performance across a range of genetic architectures underlying gene expression.
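The harmonic mean P-value combination mentioned in the abstract above can be illustrated with a minimal sketch. This is not the HMAT implementation: it computes only the raw weighted harmonic mean of per-model P-values and omits the calibration step that HMAT is described as applying. Fisher's method is included only for contrast and assumes independent tests. The function names, model labels, and example P-values are hypothetical.

```python
from math import exp, factorial, log


def _check_pvalues(pvalues):
    """Validate that the input is a non-empty list of values in (0, 1]."""
    if not pvalues:
        raise ValueError("at least one P-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("P-values must lie in (0, 1]")


def harmonic_mean_p(pvalues, weights=None):
    """Raw weighted harmonic mean of P-values (uncalibrated).

    With equal weights this is L / sum(1 / p_i) for L P-values.
    """
    _check_pvalues(pvalues)
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")
    return sum(weights) / sum(w / p for w, p in zip(weights, pvalues))


def fisher_combined_p(pvalues):
    """Fisher's method, valid only for independent tests.

    The statistic X = -2 * sum(log p_i) is chi-square with 2k degrees of
    freedom; for even degrees of freedom the upper tail has the closed
    form exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    _check_pvalues(pvalues)
    k = len(pvalues)
    half_stat = -sum(log(p) for p in pvalues)  # this is X / 2
    return exp(-half_stat) * sum(half_stat ** i / factorial(i) for i in range(k))


# Hypothetical TWAS P-values for one gene from three prediction models.
model_pvalues = {"model_a": 0.004, "model_b": 0.03, "model_c": 0.2}
values = list(model_pvalues.values())
print("harmonic mean P:", harmonic_mean_p(values))
print("Fisher combined P:", fisher_combined_p(values))
```

The harmonic mean is dominated by the smallest P-values, which is why it tolerates a few poorly fitting models; the raw value still needs calibration before it is read as a P-value.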
Affiliation(s)
- Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China; Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
| | - Jing Dai
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
| | - Siyi Jin
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, Jiangsu 221004, China
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
|
35
|
Kammers K, Taub MA, Rodriguez B, Yanek LR, Ruczinski I, Martin J, Kanchan K, Battle A, Cheng L, Wang ZZ, Johnson AD, Leek JT, Faraday N, Becker LC, Mathias RA. Transcriptional profile of platelets and iPSC-derived megakaryocytes from whole-genome and RNA sequencing. Blood 2021; 137:959-968. [PMID: 33094331 PMCID: PMC7918180 DOI: 10.1182/blood.2020006115] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/10/2020] [Accepted: 09/29/2020] [Indexed: 01/04/2023]
Abstract
Genome-wide association studies have identified common variants associated with platelet-related phenotypes, but because these variants are largely intronic or intergenic, their link to platelet biology is unclear. In 290 normal subjects from the GeneSTAR Research Study (110 African Americans [AAs] and 180 European Americans [EAs]), we generated whole-genome sequence data from whole blood and RNA sequence data from extracted nonribosomal RNA from 185 induced pluripotent stem cell-derived megakaryocyte (MK) cell lines (platelet precursor cells) and 290 blood platelet samples from these subjects. Using eigenMT software to select the peak single-nucleotide polymorphism (SNP) for each expressed gene, and meta-analyzing the results of AAs and EAs, we identify (q-value < 0.05) 946 cis-expression quantitative trait loci (eQTLs) in derived MKs and 1830 cis-eQTLs in blood platelets. Among the 57 eQTLs shared between the 2 tissues, the estimated directions of effect are very consistent (98.2% concordance). A high proportion of detected cis-eQTLs (74.9% in MKs and 84.3% in platelets) are unique to MKs and platelets compared with peak-associated SNP-expressed gene pairs of 48 other tissue types that are reported in version V7 of the Genotype-Tissue Expression Project. The locations of our identified eQTLs are significantly enriched for overlap with several annotation tracks highlighting genomic regions with specific functionality in MKs, including MK-specific DNAse hotspots, H3K27-acetylation marks, H3K4-methylation marks, enhancers, and superenhancers. These results offer insights into the regulatory signature of MKs and platelets, with significant overlap in genes expressed, eQTLs detected, and enrichment within known superenhancers relevant to platelet biology.
Affiliation(s)
- Kai Kammers
- Division of Biostatistics and Bioinformatics, Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Margaret A Taub
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
| | - Benjamin Rodriguez
- National Heart, Lung, and Blood Institute, Population Sciences Branch, The Framingham Heart Study, Framingham, MA
| | | | - Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
| | | | | | | | - Linzhao Cheng
- Division of Hematology and Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Zack Z Wang
- Division of Hematology and Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD
| | - Andrew D Johnson
- National Heart, Lung, and Blood Institute, Population Sciences Branch, The Framingham Heart Study, Framingham, MA
| | - Jeffrey T Leek
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
| | | | | | - Rasika A Mathias
- The GeneSTAR Research Program
- Division of Allergy and Clinical Immunology
|
36
|
Ward MC, Banovich NE, Sarkar A, Stephens M, Gilad Y. Dynamic effects of genetic variation on gene expression revealed following hypoxic stress in cardiomyocytes. eLife 2021; 10:57345. [PMID: 33554857 PMCID: PMC7906610 DOI: 10.7554/elife.57345] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/28/2020] [Accepted: 02/06/2021] [Indexed: 12/13/2022]
Abstract
One life-threatening outcome of cardiovascular disease is myocardial infarction, where cardiomyocytes are deprived of oxygen. To study inter-individual differences in response to hypoxia, we established an in vitro model of induced pluripotent stem cell-derived cardiomyocytes from 15 individuals. We measured gene expression levels, chromatin accessibility, and methylation levels in four culturing conditions that correspond to normoxia, hypoxia, and short- or long-term re-oxygenation. We characterized thousands of gene regulatory changes as the cells transition between conditions. Using available genotypes, we identified 1,573 genes with a cis expression quantitative trait locus (eQTL) in at least one condition, as well as 367 dynamic eQTLs, which are classified as eQTLs in at least one, but not in all conditions. A subset of genes with dynamic eQTLs is associated with complex traits and disease. Our data demonstrate how dynamic genetic effects on gene expression, which are likely relevant for disease, can be uncovered under stress.
Affiliation(s)
- Michelle C Ward
- Department of Medicine, University of Chicago, Chicago, United States; Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, United States
| | - Nicholas E Banovich
- Department of Human Genetics, University of Chicago, Chicago, United States; Integrated Cancer Genomics Division, Translational Genomics Research Institute, Phoenix, United States
| | - Abhishek Sarkar
- Department of Human Genetics, University of Chicago, Chicago, United States
| | - Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, United States; Department of Statistics, University of Chicago, Chicago, United States
| | - Yoav Gilad
- Department of Medicine, University of Chicago, Chicago, United States; Department of Human Genetics, University of Chicago, Chicago, United States
|
37
|
Cooper RD, Shaffer HB. Allele-specific expression and gene regulation help explain transgressive thermal tolerance in non-native hybrids of the endangered California tiger salamander (Ambystoma californiense). Mol Ecol 2021; 30:987-1004. [PMID: 33338297 DOI: 10.1111/mec.15779] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/20/2019] [Revised: 11/30/2020] [Accepted: 12/11/2020] [Indexed: 01/26/2023]
Abstract
Hybridization between native and non-native species is an ongoing global conservation threat. Hybrids that exhibit traits and tolerances that surpass parental values are of particular concern, given their potential to outperform native species. Effective management of hybrid populations requires an understanding of both physiological performance and the underlying mechanisms that drive transgressive hybrid traits. Here, we explore several aspects of the hybridization between the endangered California tiger salamander (Ambystoma californiense; CTS) and the introduced barred tiger salamander (Ambystoma mavortium; BTS). We assayed critical thermal maximum (CTMax) to compare the ability of CTS, BTS and F1 hybrids to tolerate acute thermal stress, and found that hybrids exhibit a wide range of CTMax values, with 33% (4/12) able to tolerate temperatures greater than either parent. We then quantified the genomic response, measured at the RNA transcript level, of each salamander, to explore the mechanisms underlying thermal tolerance strategies. We found that CTS and BTS have strikingly different values and tissue-specific patterns of overall gene expression, with hybrids expressing intermediate values. F1 hybrids display abundant and variable degrees of allele-specific expression (ASE), likely arising from extensive compensatory evolution in gene regulatory mechanisms between CTS and BTS. We found evidence that the proportion of genes with allelic imbalance in individual hybrids correlates with their CTMax, suggesting a link between ASE and expanded thermal tolerance that may contribute to the success of hybrid salamanders in California. Future climate change may further complicate management of CTS if hybrid salamanders are better equipped to deal with rising temperatures.
Affiliation(s)
- Robert D Cooper
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA; La Kretz Center for California Conservation Science, Institute of the Environment and Sustainability, University of California, Los Angeles, CA, USA
| | - H Bradley Shaffer
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, USA; La Kretz Center for California Conservation Science, Institute of the Environment and Sustainability, University of California, Los Angeles, CA, USA
|
38
|
Shi X, Chai X, Yang Y, Cheng Q, Jiao Y, Chen H, Huang J, Yang C, Liu J. A tissue-specific collaborative mixed model for jointly analyzing multiple tissues in transcriptome-wide association studies. Nucleic Acids Res 2020; 48:e109. [PMID: 32978944 PMCID: PMC7641735 DOI: 10.1093/nar/gkaa767] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/03/2020] [Revised: 08/14/2020] [Accepted: 09/03/2020] [Indexed: 12/13/2022]
Abstract
Transcriptome-wide association studies (TWASs) integrate expression quantitative trait loci (eQTL) studies with genome-wide association studies (GWASs) to prioritize candidate target genes for complex traits. Several statistical methods have recently been proposed to improve the performance of TWASs in gene prioritization by integrating the expression regulatory information imputed from multiple tissues, and they have substantially improved the ability to detect gene-trait associations. Unfortunately, most existing multi-tissue methods focus on prioritization of candidate genes, and cannot directly infer the specific functional effects of candidate genes across different tissues. Here, we propose a tissue-specific collaborative mixed model (TisCoMM) for TWASs, leveraging the co-regulation of genetic variations across different tissues explicitly via a unified probabilistic model. TisCoMM not only performs hypothesis testing to prioritize gene-trait associations, but also detects the tissue-specific role of candidate target genes in complex traits. To make full use of widely available GWAS summary statistics, we extend TisCoMM to use summary-level data, namely, TisCoMM-S2. Using extensive simulation studies, we show that type I error is controlled at the nominal level, the statistical power of identifying associated genes is greatly improved, and the false-positive rate (FPR) for non-causal tissues is well controlled. We further illustrate the benefits of our methods in applications to summary-level GWAS data of 33 complex traits. Notably, apart from better identifying potential trait-associated genes, we can elucidate the tissue-specific role of candidate target genes. The follow-up pathway analysis from tissue-specific genes for asthma shows that the immune system plays an essential role in asthma development in both thyroid and lung tissues.
Affiliation(s)
- Xingjie Shi
- Department of Statistics, Nanjing University of Finance and Economics, Nanjing, China
- Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore
| | - Xiaoran Chai
- Beijing Advanced Innovation Center for Genomics (ICG) & Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- School of Medicine, National University of Singapore, Singapore
| | - Yi Yang
- Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore
| | - Qing Cheng
- Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore
| | - Yuling Jiao
- School of Mathematics and Statistics, and Hubei Key Laboratory of Computational Science, Wuhan University, Wuhan, China
| | - Haoyue Chen
- School of International Studies, Zhejiang University, Hangzhou, China
| | - Jian Huang
- Department of Statistics and Actuarial Science, University of Iowa, USA
| | - Can Yang
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China
| | - Jin Liu
- Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore
| |
|
39
|
Deutelmoser H, Lorenzo Bermejo J, Benner A, Weigl K, Park HA, Haffa M, Herpel E, Schneider M, Ulrich CM, Hoffmeister M, Chang-Claude J, Brenner H, Scherer D. Genotype-Based Gene Expression in Colon Tissue-Prediction Accuracy and Relationship with the Prognosis of Colorectal Cancer Patients. Int J Mol Sci 2020; 21:E8150. [PMID: 33142733 PMCID: PMC7662650 DOI: 10.3390/ijms21218150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/30/2020] [Revised: 10/26/2020] [Accepted: 10/27/2020] [Indexed: 12/24/2022] Open
Abstract
Colorectal cancer (CRC) survival has environmental and inherited components. The expression of specific genes can be inferred from individual genotypes through so-called expression quantitative trait loci. In this study, we used the PrediXcan method to predict gene expression in normal colon tissue using individual genotype data from 91 CRC patients and examined the correlation ρ between predicted and measured gene expression levels. Out of 5434 predicted genes, 58% showed a negative ρ value and only 16% presented a ρ higher than 0.10. We subsequently investigated the association between genotype-based gene expression in colon tissue for genes with ρ > 0.10 and survival of 4436 CRC patients. We identified an inverse association between the predicted expression of ARID3B and CRC-specific survival for patients with a body mass index greater than or equal to 30 kg/m² (HR (hazard ratio) = 0.66 for an expression higher vs. lower than the median, p = 0.005). This association was validated using genotype and clinical data from the UK Biobank (HR = 0.74, p = 0.04). In addition to the identification of ARID3B expression in normal colon tissue as a candidate prognostic biomarker for obese CRC patients, our study illustrates the challenges of genotype-based prediction of gene expression, and the advantage of reassessing the prediction accuracy in a subset of the study population using measured gene expression data.
Affiliation(s)
- Heike Deutelmoser
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany; (H.D.); (M.H.); (C.M.U.); (H.B.)
- Institute of Medical Biometry and Informatics, Medical Faculty, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany;
| | - Justo Lorenzo Bermejo
- Institute of Medical Biometry and Informatics, Medical Faculty, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany;
| | - Axel Benner
- Division of Biostatistics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany;
| | - Korbinian Weigl
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany; (K.W.); (M.H.)
| | - Hanla A. Park
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany; (H.A.P.); (J.C.-C.)
| | - Mariam Haffa
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany; (H.D.); (M.H.); (C.M.U.); (H.B.)
- Division of Translational Functional Cancer Genomics, National Center for Tumor Diseases (NCT) and German Cancer Research Center (DKFZ), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany
| | - Esther Herpel
- NCT Tissue Bank, National Center for Tumor Diseases (NCT) and University Hospital Heidelberg, Im Neuenheimer Feld 460, 69120 Heidelberg, Germany;
- Institute of Pathology, University Hospital Heidelberg, Im Neuenheimer Feld 224, 69120 Heidelberg, Germany
| | - Martin Schneider
- Department of General, Visceral, and Transplantation Surgery, University Hospital Heidelberg, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany;
| | - Cornelia M. Ulrich
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany; (H.D.); (M.H.); (C.M.U.); (H.B.)
- Huntsman Cancer Institute, 2000 Cir of Hope Dr 1950, Salt Lake City, UT 84112, USA
- Department of Population Health Sciences, School of Medicine, University of Utah, Salt Lake City, UT 84112, USA
| | - Michael Hoffmeister
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany; (K.W.); (M.H.)
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany; (H.A.P.); (J.C.-C.)
- Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf (UKE), Martinstraße 52, 20246 Hamburg, Germany
| | - Hermann Brenner
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany; (H.D.); (M.H.); (C.M.U.); (H.B.)
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 581, 69121 Heidelberg, Germany; (K.W.); (M.H.)
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Im Neuenheimer Feld 280, 69120 Heidelberg, Germany
| | - Dominique Scherer
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120 Heidelberg, Germany; (H.D.); (M.H.); (C.M.U.); (H.B.)
- Institute of Medical Biometry and Informatics, Medical Faculty, Heidelberg University, Im Neuenheimer Feld 130.3, 69120 Heidelberg, Germany;
| |
|
40
|
Zhong J, Jermusyk A, Wu L, Hoskins JW, Collins I, Mocci E, Zhang M, Song L, Chung CC, Zhang T, Xiao W, Albanes D, Andreotti G, Arslan AA, Babic A, Bamlet WR, Beane-Freeman L, Berndt S, Borgida A, Bracci PM, Brais L, Brennan P, Bueno-de-Mesquita B, Buring J, Canzian F, Childs EJ, Cotterchio M, Du M, Duell EJ, Fuchs C, Gallinger S, Gaziano JM, Giles GG, Giovannucci E, Goggins M, Goodman GE, Goodman PJ, Haiman C, Hartge P, Hasan M, Helzlsouer KJ, Holly EA, Klein EA, Kogevinas M, Kurtz RJ, LeMarchand L, Malats N, Männistö S, Milne R, Neale RE, Ng K, Obazee O, Oberg AL, Orlow I, Patel AV, Peters U, Porta M, Rothman N, Scelo G, Sesso HD, Severi G, Sieri S, Silverman D, Sund M, Tjønneland A, Thornquist MD, Tobias GS, Trichopoulou A, Van Den Eeden SK, Visvanathan K, Wactawski-Wende J, Wentzensen N, White E, Yu H, Yuan C, Zeleniuch-Jacquotte A, Hoover R, Brown K, Kooperberg C, Risch HA, Jacobs EJ, Li D, Yu K, Shu XO, Chanock SJ, Wolpin BM, Stolzenberg-Solomon RZ, Chatterjee N, Klein AP, Smith JP, Kraft P, Shi J, Petersen GM, Zheng W, Amundadottir LT. A Transcriptome-Wide Association Study Identifies Novel Candidate Susceptibility Genes for Pancreatic Cancer. J Natl Cancer Inst 2020; 112:1003-1012. [PMID: 31917448 PMCID: PMC7566474 DOI: 10.1093/jnci/djz246] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 03/20/2019] [Revised: 09/12/2019] [Accepted: 12/30/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Although 20 pancreatic cancer susceptibility loci have been identified through genome-wide association studies in individuals of European ancestry, much of its heritability remains unexplained and the genes responsible largely unknown. METHODS: To discover novel pancreatic cancer risk loci and possible causal genes, we performed a pancreatic cancer transcriptome-wide association study in Europeans using three approaches: FUSION, MetaXcan, and Summary-MulTiXcan. We integrated genome-wide association studies summary statistics from 9040 pancreatic cancer cases and 12 496 controls, with gene expression prediction models built using transcriptome data from histologically normal pancreatic tissue samples (NCI Laboratory of Translational Genomics [n = 95] and Genotype-Tissue Expression v7 [n = 174] datasets) and data from 48 different tissues (Genotype-Tissue Expression v7, n = 74-421 samples). RESULTS: We identified 25 genes whose genetically predicted expression was statistically significantly associated with pancreatic cancer risk (false discovery rate < .05), including 14 candidate genes at 11 novel loci (1p36.12: CELA3B; 9q31.1: SMC2, SMC2-AS1; 10q23.31: RP11-80H5.9; 12q13.13: SMUG1; 14q32.33: BTBD6; 15q23: HEXA; 15q26.1: RCCD1; 17q12: PNMT, CDK12, PGAP3; 17q22: SUPT4H1; 18q11.22: RP11-888D10.3; and 19p13.11: PGPEP1) and 11 at six known risk loci (5p15.33: TERT, CLPTM1L, ZDHHC11B; 7p14.1: INHBA; 9q34.2: ABO; 13q12.2: PDX1; 13q22.1: KLF5; and 16q23.1: WDR59, CFDP1, BCAR1, TMEM170A). The association for 12 of these genes (CELA3B, SMC2, and PNMT at novel risk loci and TERT, CLPTM1L, INHBA, ABO, PDX1, KLF5, WDR59, CFDP1, and BCAR1 at known loci) remained statistically significant after Bonferroni correction. CONCLUSIONS: By integrating gene expression and genotype data, we identified novel pancreatic cancer risk loci and candidate functional genes that warrant further investigation.
Affiliation(s)
- Jun Zhong
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ashley Jermusyk
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Lang Wu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jason W Hoskins
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Irene Collins
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Evelina Mocci
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Mingfeng Zhang
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- US Food and Drug Administration, Silver Spring, MD, USA
| | - Lei Song
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Charles C Chung
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Tongwu Zhang
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Wenming Xiao
- National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Division of Molecular Genetics and Pathology, Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD, USA
| | - Demetrius Albanes
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Gabriella Andreotti
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Alan A Arslan
- Department of Obstetrics and Gynecology, New York University School of Medicine, New York, NY, USA
- Department of Population Health, New York University School of Medicine, New York, NY, USA
- Department of Environmental Medicine, New York University School of Medicine, New York, NY, USA
| | - Ana Babic
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - William R Bamlet
- Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN, USA
| | - Laura Beane-Freeman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Sonja Berndt
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ayelet Borgida
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Paige M Bracci
- Department of Epidemiology and Biostatistics, University of California, CA, USA
| | - Lauren Brais
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Paul Brennan
- International Agency for Research on Cancer, Lyon, France
| | - Bas Bueno-de-Mesquita
- Department for Determinants of Chronic Diseases, National Institute for Public Health and the Environment, BA, Bilthoven, The Netherlands
- Department of Gastroenterology and Hepatology, University Medical Centre, Utrecht, The Netherlands
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
- Department of Social and Preventive Medicine, Faculty of Medicine, University of Malaya, Kuala Lumpur, Malaysia
| | - Julie Buring
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Federico Canzian
- Genomic Epidemiology Group, German Cancer Research Center, Heidelberg, Germany
| | - Erica J Childs
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Michelle Cotterchio
- Cancer Care Ontario, University of Toronto, Toronto, Ontario, Canada
- Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
| | - Mengmeng Du
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Eric J Duell
- Unit of Nutrition and Cancer, Cancer Epidemiology Research Program, Bellvitge Biomedical Research Institute, Catalan Institute of Oncology, Barcelona, Spain
| | | | - Steven Gallinger
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
| | - J Michael Gaziano
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Aging, Brigham and Women’s Hospital, Boston, MA, USA
- Boston VA Healthcare System, Boston, MA, USA
| | - Graham G Giles
- Cancer Epidemiology and Intelligence Division, Cancer Council Victoria, Melbourne, VIC, Australia
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, VIC, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, VIC, Australia
| | - Edward Giovannucci
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Michael Goggins
- Department of Pathology, Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Gary E Goodman
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Phyllis J Goodman
- SWOG Statistical Center, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Christopher Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Patricia Hartge
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Manal Hasan
- Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Kathy J Helzlsouer
- Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Elizabeth A Holly
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA
| | - Eric A Klein
- Glickman Urological and Kidney Institute, Cleveland Clinic, Cleveland, OH, USA
| | - Manolis Kogevinas
- ISGlobal, Centre for Research in Environmental Epidemiology, Barcelona, Spain
- CIBER Epidemiología y Salud Pública, Barcelona, Spain
- Hospital del Mar Institute of Medical Research, Universitat Autònoma de Barcelona, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Robert J Kurtz
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Loic LeMarchand
- Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
| | - Núria Malats
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Center, Madrid, Spain
| | - Satu Männistö
- Department of Public Health Solutions, National Institute for Health and Welfare, Helsinki, Finland
| | - Roger Milne
- Cancer Epidemiology and Intelligence Division, Cancer Council Victoria, Melbourne, VIC, Australia
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, VIC, Australia
- Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Melbourne, VIC, Australia
| | - Rachel E Neale
- Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, Australia
| | - Kimmie Ng
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ofure Obazee
- Genomic Epidemiology Group, German Cancer Research Center, Heidelberg, Germany
| | - Ann L Oberg
- Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN, USA
| | - Irene Orlow
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Alpa V Patel
- Epidemiology Research Program, American Cancer Society, Atlanta, GA, USA
| | - Ulrike Peters
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Miquel Porta
- CIBER Epidemiología y Salud Pública, Barcelona, Spain
- Hospital del Mar Institute of Medical Research, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Nathaniel Rothman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Ghislaine Scelo
- International Agency for Research on Cancer, Lyon, France
- Cancer Epidemiology and Intelligence Division, Cancer Council Victoria, Melbourne, VIC, Australia
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, VIC, Australia
| | - Howard D Sesso
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Gianluca Severi
- Centre de Recherche en Épidémiologie et Santé des Populations (CESP, Inserm U1018), Facultés de Medicine, Université Paris-Saclay, UPS, UVSQ, Gustave Roussy, Villejuif, France
| | - Sabina Sieri
- Epidemiology and Prevention Unit, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, Milan, Italy
| | - Debra Silverman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Malin Sund
- Department of Surgical and Perioperative Sciences, Umeå University, Umeå, Sweden
| | - Anne Tjønneland
- Danish Cancer Society Research Center, Copenhagen, Denmark
- Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Hellenic Health Foundation, Athens, Greece
| | - Mark D Thornquist
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Geoffrey S Tobias
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | - Kala Visvanathan
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Jean Wactawski-Wende
- Department of Epidemiology and Environmental Health, University at Buffalo, Buffalo, NY, USA
| | - Nicolas Wentzensen
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Emily White
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Herbert Yu
- Cancer Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
| | - Chen Yuan
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Anne Zeleniuch-Jacquotte
- Department of Population Health, New York University School of Medicine, New York, NY, USA
- Perlmutter Cancer Center, New York University School of Medicine, New York, NY, USA
| | - Robert Hoover
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Kevin Brown
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Harvey A Risch
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, USA
| | - Eric J Jacobs
- Behavioral and Epidemiology Research Group, American Cancer Society, Atlanta, GA, USA
| | - Donghui Li
- Department of Gastrointestinal Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Kai Yu
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Brian M Wolpin
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Rachael Z Stolzenberg-Solomon
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Nilanjan Chatterjee
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Department of Biostatistics, Bloomberg School of Public Health, Baltimore, MD, USA
| | - Alison P Klein
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Department of Pathology, Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Jill P Smith
- Department of Medicine, Georgetown University, Washington, DC, USA
| | - Peter Kraft
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
| | - Jianxin Shi
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| | - Gloria M Petersen
- Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, MN, USA
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Laufey T Amundadottir
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
| |
|
41
|
Yang Y, Shi X, Jiao Y, Huang J, Chen M, Zhou X, Sun L, Lin X, Yang C, Liu J. CoMM-S2: a collaborative mixed model using summary statistics in transcriptome-wide association studies. Bioinformatics 2020; 36:2009-2016. [PMID: 31755899 DOI: 10.1093/bioinformatics/btz880] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/16/2019] [Revised: 09/25/2019] [Accepted: 11/21/2019] [Indexed: 12/23/2022] Open
Abstract
MOTIVATION: Although genome-wide association studies (GWAS) have deepened our understanding of the genetic architecture of complex traits, the mechanistic links that underlie how genetic variants cause complex traits remain elusive. To advance our understanding of the underlying mechanistic links, various consortia have collected a vast volume of genomic data that enable us to investigate the role that genetic variants play in gene expression regulation. Recently, a collaborative mixed model (CoMM) was proposed to jointly interrogate the genome on complex traits by integrating both the GWAS dataset and the expression quantitative trait loci (eQTL) dataset. Although CoMM is a powerful approach that leverages regulatory information while accounting for the uncertainty in using an eQTL dataset, it requires individual-level GWAS data and cannot fully make use of widely available GWAS summary statistics. Therefore, statistically efficient methods that leverage transcriptome information using only summary statistics from GWAS data are required. RESULTS: In this study, we propose a novel probabilistic model, CoMM-S2, to examine the mechanistic role that genetic variants play, by using only GWAS summary statistics instead of individual-level GWAS data. Similar to CoMM, which uses individual-level GWAS data, CoMM-S2 combines two models: the first model examines the relationship between gene expression and genotype, while the second model examines the relationship between the phenotype and the predicted gene expression from the first model. Distinct from CoMM, CoMM-S2 requires only GWAS summary statistics. Using both simulation studies and real data analysis, we demonstrate that even though CoMM-S2 utilizes GWAS summary statistics, its performance is comparable to that of CoMM, which uses individual-level GWAS data. AVAILABILITY AND IMPLEMENTATION: The implementation of CoMM-S2 is included in the CoMM package that can be downloaded from https://github.com/gordonliu810822/CoMM.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yi Yang
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China; Centre for Quantitative Medicine, Program in Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore
| | - Xingjie Shi
- Centre for Quantitative Medicine, Program in Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore; Department of Statistics, Nanjing University of Finance and Economics, Nanjing 210046, China
| | - Yuling Jiao
- School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, China
| | - Jian Huang
- Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242, USA
| | - Min Chen
- Academy of Mathematics and Systems Science, The Chinese Academy of Sciences, Beijing 100190, China
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Lei Sun
- Cardiovascular and Metabolic Disorders Program, Duke-NUS Medical School, 169857, Singapore
| | - Xinyi Lin
- Centre for Quantitative Medicine, Program in Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore; Singapore Clinical Research Institute, 138669, Singapore; Singapore Institute for Clinical Sciences, A*STAR, 117609, Singapore
| | - Can Yang
- Department of Mathematics, Hong Kong University of Science and Technology, Hong Kong 999077, China
| | - Jin Liu
- Centre for Quantitative Medicine, Program in Health Services & Systems Research, Duke-NUS Medical School, 169857, Singapore
| |
|
42
|
Kim-Hellmuth S, Aguet F, Oliva M, Muñoz-Aguirre M, Kasela S, Wucher V, Castel SE, Hamel AR, Viñuela A, Roberts AL, Mangul S, Wen X, Wang G, Barbeira AN, Garrido-Martín D, Nadel BB, Zou Y, Bonazzola R, Quan J, Brown A, Martinez-Perez A, Soria JM, Getz G, Dermitzakis ET, Small KS, Stephens M, Xi HS, Im HK, Guigó R, Segrè AV, Stranger BE, Ardlie KG, Lappalainen T. Cell type-specific genetic regulation of gene expression across human tissues. Science 2020; 369:eaaz8528. [PMID: 32913075 PMCID: PMC8051643 DOI: 10.1126/science.aaz8528] [Citation(s) in RCA: 178] [Impact Index Per Article: 44.5] [Received: 10/14/2019] [Accepted: 07/31/2020] [Indexed: 12/15/2022]
Abstract
The Genotype-Tissue Expression (GTEx) project has identified expression and splicing quantitative trait loci in cis (QTLs) for the majority of genes across a wide range of human tissues. However, the functional characterization of these QTLs has been limited by the heterogeneous cellular composition of GTEx tissue samples. We mapped interactions between computational estimates of cell type abundance and genotype to identify cell type-interaction QTLs for seven cell types and show that cell type-interaction expression QTLs (eQTLs) provide finer resolution to tissue specificity than bulk tissue cis-eQTLs. Analyses of genetic associations with 87 complex traits show a contribution from cell type-interaction QTLs and enable the discovery of hundreds of previously unidentified colocalized loci that are masked in bulk tissue.
Affiliation(s)
- Sarah Kim-Hellmuth
- Statistical Genetics, Max Planck Institute of Psychiatry, Munich, Germany.
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - François Aguet
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Meritxell Oliva
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
| | - Manuel Muñoz-Aguirre
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
- Department of Statistics and Operations Research, Universitat Politècnica de Catalunya (UPC), Barcelona, Catalonia, Spain
| | - Silva Kasela
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Valentin Wucher
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
| | - Stephane E Castel
- New York Genome Center, New York, NY, USA
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Andrew R Hamel
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ocular Genomics Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
| | - Ana Viñuela
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, Switzerland
- Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Amy L Roberts
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Serghei Mangul
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, USA
| | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Gao Wang
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Alvaro N Barbeira
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Diego Garrido-Martín
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
| | - Brian B Nadel
- Department of Molecular, Cellular, and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Yuxin Zou
- Department of Statistics, University of Chicago, Chicago, IL, USA
| | - Rodrigo Bonazzola
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Jie Quan
- Inflammation & Immunology, Pfizer, Cambridge, MA, USA
| | - Andrew Brown
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Population Health and Genomics, University of Dundee, Dundee, Scotland, UK
| | - Angel Martinez-Perez
- Unit of Genomic of Complex Diseases, Institut d'Investigació Biomèdica Sant Pau (IIB-Sant Pau), Barcelona, Spain
| | - José Manuel Soria
- Unit of Genomic of Complex Diseases, Institut d'Investigació Biomèdica Sant Pau (IIB-Sant Pau), Barcelona, Spain
| | - Gad Getz
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cancer Center and Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Emmanouil T Dermitzakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, Geneva, Switzerland
- Swiss Institute of Bioinformatics, Geneva, Switzerland
| | - Kerrin S Small
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Hualin S Xi
- Foundational Neuroscience Center, AbbVie, Cambridge, MA, USA
| | - Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain
| | - Ayellet V Segrè
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ocular Genomics Institute, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
| | - Barbara E Stranger
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Center for Genetic Medicine, Department of Pharmacology, Northwestern University, Feinberg School of Medicine, Chicago, IL, USA
| | | | - Tuuli Lappalainen
- New York Genome Center, New York, NY, USA.
- Department of Systems Biology, Columbia University, New York, NY, USA
| |
|
43
|
Zhang Y, Quick C, Yu K, Barbeira A, Luca F, Pique-Regi R, Kyung Im H, Wen X. PTWAS: investigating tissue-relevant causal molecular mechanisms of complex traits using probabilistic TWAS analysis. Genome Biol 2020; 21:232. [PMID: 32912253 PMCID: PMC7488550 DOI: 10.1186/s13059-020-02026-y] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 10/29/2019] [Accepted: 04/20/2020] [Indexed: 01/02/2023] [Open Access]
Abstract
We propose a new computational framework, probabilistic transcriptome-wide association study (PTWAS), to investigate causal relationships between gene expressions and complex traits. PTWAS applies the established principles from instrumental variables analysis and takes advantage of probabilistic eQTL annotations to delineate and tackle the unique challenges arising in TWAS. PTWAS not only confers higher power than the existing methods but also provides novel functionalities to evaluate the causal assumptions and estimate tissue- or cell-type-specific gene-to-trait effects. We illustrate the power of PTWAS by analyzing the eQTL data across 49 tissues from GTEx (v8) and GWAS summary statistics from 114 complex traits.
Affiliation(s)
- Yuhua Zhang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Corbin Quick
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Department of Biostatistics, Harvard University, Cambridge, MA, USA
| | - Ketian Yu
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Alvaro Barbeira
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Francesca Luca
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
| | - Roger Pique-Regi
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
| | - Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
| |
|
44
|
Barbeira AN, Melia OJ, Liang Y, Bonazzola R, Wang G, Wheeler HE, Aguet F, Ardlie KG, Wen X, Im HK. Fine-mapping and QTL tissue-sharing information improves the reliability of causal gene identification. Genet Epidemiol 2020; 44:854-867. [PMID: 32964524 PMCID: PMC7693040 DOI: 10.1002/gepi.22346] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/28/2020] [Revised: 06/26/2020] [Accepted: 06/26/2020] [Indexed: 01/01/2023]
Abstract
The integration of transcriptomic studies and genome-wide association studies (GWAS) via imputed expression has seen extensive application in recent years, enabling the functional characterization and causal gene prioritization of GWAS loci. However, the techniques for imputing transcriptomic traits from DNA variation remain underdeveloped. Furthermore, associations found when linking eQTL studies to complex traits through methods like PrediXcan can lead to false positives due to linkage disequilibrium between distinct causal variants. Therefore, the best prediction performance models may not necessarily lead to more reliable causal gene discovery. With the goal of improving discoveries without increasing false positives, we develop and compare multiple transcriptomic imputation approaches using the most recent GTEx release of expression and splicing data on 17,382 RNA-sequencing samples from 948 post-mortem donors in 54 tissues. We find that informing prediction models with posterior causal probability from fine-mapping (dap-g) and borrowing information across tissues (mashr) can lead to better performance in terms of number and proportion of significant associations that are colocalized and the proportion of silver standard genes identified as indicated by precision-recall and receiver operating characteristic curves. All prediction models are made publicly available at predictdb.org.
Affiliation(s)
- Alvaro N. Barbeira
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois
| | - Owen J. Melia
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois
| | - Yanyu Liang
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois
| | - Rodrigo Bonazzola
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois
| | - Gao Wang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois
| | - Heather E. Wheeler
- Department of Biology, Loyola University Chicago, Chicago, Illinois
- Department of Computer Science, Loyola University Chicago, Chicago, Illinois
- Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois
| | - François Aguet
- The Broad Institute of MIT and Harvard, Cambridge, Massachusetts
| | | | - Xiaoquan Wen
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
| | - Hae K. Im
- Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, Illinois
- Department of Human Genetics, The University of Chicago, Chicago, Illinois
| |
|
45
|
Wu L, Yang Y, Guo X, Shu XO, Cai Q, Shu X, Li B, Tao R, Wu C, Nikas JB, Sun Y, Zhu J, Roobol MJ, Giles GG, Brenner H, John EM, Clements J, Grindedal EM, Park JY, Stanford JL, Kote-Jarai Z, Haiman CA, Eeles RA, Zheng W, Long J. An integrative multi-omics analysis to identify candidate DNA methylation biomarkers related to prostate cancer risk. Nat Commun 2020; 11:3905. [PMID: 32764609 PMCID: PMC7413371 DOI: 10.1038/s41467-020-17673-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 12/11/2019] [Accepted: 06/28/2020] [Indexed: 12/21/2022] [Open Access]
Abstract
It remains elusive whether some of the associations identified in genome-wide association studies of prostate cancer (PrCa) may be due to regulatory effects of genetic variants on CpG sites, which may further influence expression of PrCa target genes. To search for CpG sites associated with PrCa risk, here we establish genetic models to predict methylation (N = 1,595) and conduct association analyses with PrCa risk (79,194 cases and 61,112 controls). We identify 759 CpG sites showing an association, including 15 located at novel loci. Among those 759 CpG sites, methylation of 42 is associated with expression of 28 adjacent genes. Among 22 genes, 18 show an association with PrCa risk. Overall, 25 CpG sites show consistent association directions for the methylation-gene expression-PrCa pathway. We identify DNA methylation biomarkers associated with PrCa, and our findings suggest that specific CpG sites may influence PrCa via regulating expression of candidate PrCa target genes.
Affiliation(s)
- Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA.
| | - Yaohua Yang
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Qiuyin Cai
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Xiang Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Bingshan Li
- Department of Molecular Physiology & Biophysics, Vanderbilt University, Nashville, TN, USA
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Ran Tao
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL, USA
| | - Jason B Nikas
- Research & Development, Genomix Inc, Minneapolis, MN, USA
| | - Yanfa Sun
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
- College of Life Science, Longyan University, Longyan, Fujian, P. R. China
| | - Jingjing Zhu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa, Honolulu, HI, USA
| | - Monique J Roobol
- Department of Urology, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - Graham G Giles
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, 207 Bouverie St, Melbourne, VIC, 3010, Australia
- Cancer Epidemiology & Intelligence Division, Cancer Council Victoria, 615 St Kilda Rd, Melbourne, VIC, 3004, Australia
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
| | - Esther M John
- Department of Medicine (Oncology) and Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Judith Clements
- Australian Prostate Cancer Research Centre-QLD, Institute of Health and Biomedical Innovation and School of Biomedical Science, Queensland University of Technology, Brisbane, QLD, Australia
- Translational Research Institute, Brisbane, QLD, Australia
| | | | - Jong Y Park
- Department of Cancer Epidemiology, Moffitt Cancer Center, Tampa, FL, USA
| | - Janet L Stanford
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
| | - Zsofia Kote-Jarai
- Division of Genetics and Epidemiology, The Institute of Cancer Research, and The Royal Marsden NHS Foundation Trust, London, UK
| | - Christopher A Haiman
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
| | - Rosalind A Eeles
- Division of Genetics and Epidemiology, The Institute of Cancer Research, and The Royal Marsden NHS Foundation Trust, London, UK
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN, USA.
| |
|
46
|
The statistical practice of the GTEx Project: from single to multiple tissues. Quantitative Biology 2020. [DOI: 10.1007/s40484-020-0210-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
47
|
Matoba N, Liang D, Sun H, Aygün N, McAfee JC, Davis JE, Raffield LM, Qian H, Piven J, Li Y, Kosuri S, Won H, Stein JL. Common genetic risk variants identified in the SPARK cohort support DDHD2 as a candidate risk gene for autism. Transl Psychiatry 2020; 10:265. [PMID: 32747698 PMCID: PMC7400671 DOI: 10.1038/s41398-020-00953-9] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 04/09/2020] [Accepted: 07/22/2020] [Indexed: 12/13/2022] [Open Access]
Abstract
Autism spectrum disorder (ASD) is a highly heritable neurodevelopmental disorder. Large genetically informative cohorts of individuals with ASD have led to the identification of a limited number of common genome-wide significant (GWS) risk loci to date. However, many more common genetic variants are expected to contribute to ASD risk given the high heritability. Here, we performed a genome-wide association study (GWAS) on 6222 case-pseudocontrol pairs from the Simons Foundation Powering Autism Research for Knowledge (SPARK) dataset to identify additional common genetic risk factors and molecular mechanisms underlying risk for ASD. We identified one novel GWS locus from the SPARK GWAS and four significant loci, including an additional novel locus from meta-analysis with a previous GWAS. We replicated the previous observation of significant enrichment of ASD heritability within regulatory regions of the developing cortex, indicating that disruption of gene regulation during neurodevelopment is critical for ASD risk. We further employed a massively parallel reporter assay (MPRA) and identified a putative causal variant at the novel locus from SPARK GWAS with strong impacts on gene regulation (rs7001340). Expression quantitative trait loci data demonstrated an association between the risk allele and decreased expression of DDHD2 (DDHD domain containing 2) in both adult and prenatal brains. In conclusion, by integrating genetic association data with multi-omic gene regulatory annotations and experimental validation, we fine-mapped a causal risk variant and demonstrated that DDHD2 is a novel gene associated with ASD risk.
Affiliation(s)
- Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Dan Liang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Huaigu Sun
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Jessica C McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Jessica E Davis
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Huijun Qian
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Joseph Piven
- Department of Psychiatry and the Carolina Institute for Developmental Disabilities, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Yun Li
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- UCLA-DOE Institute for Genomics and Proteomics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Quantitative and Computational Biology Institute, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA, 90095, USA
| | - Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
| |
|
48
|
Keys KL, Mak ACY, White MJ, Eckalbar WL, Dahl AW, Mefford J, Mikhaylova AV, Contreras MG, Elhawary JR, Eng C, Hu D, Huntsman S, Oh SS, Salazar S, Lenoir MA, Ye JC, Thornton TA, Zaitlen N, Burchard EG, Gignoux CR. On the cross-population generalizability of gene expression prediction models. PLoS Genet 2020; 16:e1008927. [PMID: 32797036 PMCID: PMC7449671 DOI: 10.1371/journal.pgen.1008927] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/19/2019] [Revised: 08/26/2020] [Accepted: 06/10/2020] [Indexed: 11/21/2022] [Open Access]
Abstract
The genetic control of gene expression is a core component of human physiology. For the past several years, transcriptome-wide association studies have leveraged large datasets of linked genotype and RNA sequencing information to create a powerful gene-based test of association that has been used in dozens of studies. While numerous discoveries have been made, the populations in the training data are overwhelmingly of European descent, and little is known about the generalizability of these models to other populations. Here, we test for cross-population generalizability of gene expression prediction models using a dataset of African American individuals with RNA-Seq data in whole blood. We find that the default models trained in large datasets such as GTEx and DGN fare poorly in African Americans, with a notable reduction in prediction accuracy when compared to European Americans. We replicate these limitations in cross-population generalizability using the five populations in the GEUVADIS dataset. Via realistic simulations of both populations and gene expression, we show that accurate cross-population generalizability of transcriptome prediction only arises when eQTL architecture is substantially shared across populations. In contrast, models with non-identical eQTLs showed patterns similar to real-world data. Therefore, generating RNA-Seq data in diverse populations is a critical step towards multi-ethnic utility of gene expression prediction.
Affiliation(s)
- Kevin L. Keys
- Department of Medicine, University of California, San Francisco, California, United States of America
- Berkeley Institute for Data Science, University of California, Berkeley, California, United States of America
| | - Angel C. Y. Mak
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Marquitta J. White
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Walter L. Eckalbar
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Andrew W. Dahl
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Joel Mefford
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Anna V. Mikhaylova
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
| | - María G. Contreras
- Department of Medicine, University of California, San Francisco, California, United States of America
- San Francisco State University, San Francisco, California, United States of America
| | - Jennifer R. Elhawary
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Celeste Eng
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Donglei Hu
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Scott Huntsman
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Sam S. Oh
- Department of Medicine, University of California, San Francisco, California, United States of America
| | - Sandra Salazar
- Department of Medicine, University of California, San Francisco, California, United States of America
| | | | - Jimmie C. Ye
- Department of Epidemiology and Biostatistics, University of California, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Biosciences, University of California, San Francisco, California, United States of America
| | - Timothy A. Thornton
- Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
| | - Noah Zaitlen
- Department of Neurology, University of California, Los Angeles, California, United States of America
| | - Esteban G. Burchard
- Department of Medicine, University of California, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Biosciences, University of California, San Francisco, California, United States of America
| | - Christopher R. Gignoux
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Department of Biostatistics and Informatics, School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
| |
|
49
|
Fryett JJ, Morris AP, Cordell HJ. Investigation of prediction accuracy and the impact of sample size, ancestry, and tissue in transcriptome-wide association studies. Genet Epidemiol 2020; 44:425-441. [PMID: 32190932 PMCID: PMC8641384 DOI: 10.1002/gepi.22290] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 10/08/2019] [Revised: 02/05/2020] [Accepted: 03/06/2020] [Indexed: 01/14/2023]
Abstract
In transcriptome-wide association studies (TWAS), gene expression values are predicted using genotype data and tested for association with a phenotype. The power of this approach to detect associations relies, at least in part, on the accuracy of the prediction. Here we compare the prediction accuracy of six different methods (LASSO, Ridge regression, Elastic net, Best Linear Unbiased Predictor, Bayesian Sparse Linear Mixed Model, and Random Forests) by performing cross-validation using data from the Geuvadis Project. We also examine prediction accuracy (a) at different sample sizes, (b) when ancestry of the prediction model training and testing populations is different, and (c) when the tissue used to train the model is different from the tissue to be predicted. We find that, for most genes, the expression cannot be accurately predicted, but in general sparse statistical models tend to outperform polygenic models at prediction. Average prediction accuracy is reduced when the model training set size is reduced or when predicting across ancestries and is marginally reduced when predicting across tissues. We conclude that using sparse statistical models and the development of large reference panels across multiple ethnicities and tissues will lead to better prediction of gene expression, and thus may improve TWAS power.
Affiliation(s)
- James J. Fryett
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| | - Andrew P. Morris
- Division of Musculoskeletal and Dermatological Sciences, University of Manchester, Manchester, UK
| | - Heather J. Cordell
- Population Health Sciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
| |
|
50
|
Zhu H, Zhou X. Statistical methods for SNP heritability estimation and partition: A review. Comput Struct Biotechnol J 2020; 18:1557-1568. [PMID: 32637052 PMCID: PMC7330487 DOI: 10.1016/j.csbj.2020.06.011] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 02/28/2020] [Revised: 06/03/2020] [Accepted: 06/07/2020] [Indexed: 02/06/2023] [Open Access]
Abstract
In GWAS studies, SNP heritability measures the proportion of phenotypic variance explained by all measured SNPs. Accurate estimation of SNP heritability can help us better understand the degree to which measured genetic variants influence phenotypes. Over the last decade, a variety of statistical methods and software tools have been developed for SNP heritability estimation with different data types including genotype array data, imputed genotype data, whole-genome sequencing data, RNA sequencing data, and bisulfite sequencing data. However, a thorough technical review of these methods, especially from a statistical and computational viewpoint, is currently missing. To fill this knowledge gap, we present a comprehensive review on a broad category of recently developed and commonly used SNP heritability estimation methods. We focus on their modeling assumptions; their interconnected relationships; their applicability to quantitative, binary and count phenotypes; their use of individual level data versus summary statistics, as well as their utility for SNP heritability partitioning. We hope that this review will serve as a useful reference for both methodologists who develop heritability estimation methods and practitioners who perform heritability analysis.
Affiliation(s)
- Huanhuan Zhu
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| |
|